QuickSyllabus

Medical AI Training I (医学AIトレーニングⅠ)

Dates: to be announced via the groupware once decided. Credit(s): 3. Director: 田宮元. Term of Classes: October 2024 to March 2025. Academic year: 2024. Course Numbering: -J. Language Used in Course: Japanese.

Subject

Medical AI Training I

Class subject

Medical AI Training I

Lecturer

岩崎淳也 (Junya Iwazaki, Lecturer), 宮内誠カルロス (Assistant Professor), 高屋英知 (Assistant), 高山順 (Associate Professor), 城田松之 (Lecturer), and several others

Classroom

To be announced via the groupware once decided.

Object and Summary of Class

In this practical course, students learn methods of big data analysis using the supercomputer of the Tohoku Medical Megabank Organization (ToMMo).
Recent technological advances have made the data used in medical research, genomic data in particular, increasingly large, and analyzing such big data now requires dedicated supercomputers. The ToMMo supercomputer is a computing environment in which individual-level big data can be analyzed securely. Participants will create an account on the supercomputer, access it, upload data, analyze the data, and download data. Although ToMMo cohort data are not used in this course, students will learn the precautions required when handling individual-level information. For data analysis, large-scale runs are carried out with SLURM, a job scheduling system. Through this course, students learn the basics of analyzing medical data and ToMMo cohort data. As an application, students can also use their own medical data sets on the supercomputer to work on problems encountered in actual clinical practice.
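As a rough illustration of what a large-scale SLURM run can look like, the sketch below builds the text of a batch script for a job array in Python. The function name `render_array_job_script`, the script name `analyze.py`, and all resource values are hypothetical. Partition names, accounts, and limits on the ToMMo supercomputer are site-specific and are not given in this syllabus, so they are left out.

```python
def render_array_job_script(job_name, n_tasks, command,
                            cpus_per_task=1, mem="4G", time_limit="01:00:00"):
    """Return the text of a SLURM batch script for a job array.

    All default resource values are illustrative assumptions, not
    settings taken from the ToMMo supercomputer.
    """
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # Array indices run from 0 to n_tasks - 1, one per independent task.
        f"#SBATCH --array=0-{n_tasks - 1}",
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --time={time_limit}",
        # %x = job name, %A = array job ID, %a = array task index.
        "#SBATCH --output=%x_%A_%a.out",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical command: analyze.py is a placeholder for a student's own script.
script = render_array_job_script(
    "demo", 10, "python analyze.py --task-id ${SLURM_ARRAY_TASK_ID}"
)
print(script)
```

The resulting text would typically be saved to a file and submitted with `sbatch`. Each array task then receives its own index in the `SLURM_ARRAY_TASK_ID` environment variable.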

Goal of study

This course is designed to help students perform their own analyses on the supercomputer system.
The goals of this course are:
-understanding the architecture of the supercomputer and the data that can be handled on it
-accessing the supercomputer and running analyses on it
-transferring the necessary data from an internet-connected environment to the supercomputer and setting up an analysis environment
-using basic Linux commands fluently
-editing programs and data with an editor on Linux
-analyzing data through Python programming
-performing large-scale data analysis using SLURM

Contents and progress schedule of the class

The following topics are covered as appropriate (岩崎, 宮内, 高屋, 高山, 城田, 船山, 田宮):
・Creating an account on the ToMMo supercomputer
・Logging in to the supercomputer and transferring data to and from it
・Data analysis with Linux
・Data analysis with Python
・Data analysis with SLURM
・Analyzing each student's own medical data (e.g. genomic data) on the supercomputer
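To make the Python and SLURM topics above more concrete, here is a minimal sketch of how one task in a SLURM job array might pick its share of the work and summarize it. The helper names `chunk_for_task` and `summarize` and the synthetic numbers are assumptions for illustration only; no course data or ToMMo cohort data are involved. `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT` are environment variables that SLURM sets for array jobs, and the code falls back to a single task when they are absent.

```python
import os
import statistics


def chunk_for_task(items, task_id, n_tasks):
    """Return the items that array task `task_id` should process.

    Items are dealt out round-robin, so every item belongs to exactly one task.
    """
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must satisfy 0 <= task_id < n_tasks")
    return items[task_id::n_tasks]


def summarize(values):
    """Compute simple summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# Outside of SLURM these variables are unset, so run as a single task.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
n_tasks = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))

# Synthetic measurements standing in for real input data.
measurements = [float(i) for i in range(12)]

my_values = chunk_for_task(measurements, task_id, n_tasks)
print(f"task {task_id} of {n_tasks}: {summarize(my_values)}")
```

Dealing items out round-robin keeps the chunks close to equal in size without any coordination between tasks, which suits independent per-sample analyses.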

Evaluation method

Evaluation is based on attendance and the content of the submitted exercise assignments.

Textbook and References

Preparation and Review

Students are expected to carry out their research using the computing environment as appropriate.