
機械学習基礎

前期木曜日４講時 / First semester, Thursday, 4th period
単位数/Credit(s): 2
対象学科・専攻/Departments: 情報基礎科学専攻、システム情報科学専攻、人間社会情報科学専攻、応用情報科学専攻
学期/Term: 前期 (first semester)
履修年度/Academic year: 2024
使用言語/Language: 英語（需要に応じて、日本語の資料も提供される場合があります） / English (Japanese materials may be provided, depending on demand)

開講年度/Year Offered

2024

授業題目/Class Subject

機械学習基礎
Machine Learning Basics

授業の目的・概要及び達成方法等/Class Objectives, Summary, and Methods of Achievement

このデータ科学コースは、データ科学で使用される基本的な技術とツールについて紹介することを目的としています。毎週、データ科学のパイプラインから始まり、ニューラルネットワークや時系列解析などの高度なトピックまで、一つずつトピックをカバーします。それらは洗練されたスライドとわかりやすいPythonコードを用いて説明されます。

この学期では、すべてのトピックを網羅できる1つの包括的なデータセットを使用します。これにより、学生がデータセットを理解するために多くの時間を費やすことなく、データサイエンスの概念を容易に理解できます。

達成方法：
- 講義: 基本的な概念を教え、その後Pythonでデモンストレーションを行います。
- 授業内演習: 短い実践活動。（主にコードをコピー＆ペーストし、変数を変更する程度です。）
- 週次実践コース: 1.5時間の課題で理解を深めます。

授業の目的・概要及び達成方法等（Ｅ）

This data science course introduces the essential techniques and tools used in data science. Each week covers one topic, starting from the basics of the data science pipeline and progressing to advanced topics such as neural networks and time-series analysis. Each topic is explained with polished slides and easy-to-understand Python code.

This semester, we will use a single comprehensive dataset that covers all of the topics. This lets students grasp data science concepts without spending too much time getting to know the data.

Methods of Achievement:
- Lecture: Basic concepts will be taught, followed by a Python demonstration.
- In-Class Exercises: Short hands-on activities (no extensive coding skills required!).
- Weekly Practice Course: 1.5-hour challenges to deepen your understanding.

学修の到達目標/Goal of Study

このコースの終了時には、以下を理解して適用できるようになることが期待されます：

1. データサイエンスのパイプラインを理解する。
2. さまざまな機械学習手法を適用する。
3. モデルのパフォーマンスを評価し、ハイパーパラメータを微調整する。
4. ニューラルネットワーク、テキストマイニング、時系列解析を理解し適用する。
5. Pythonを使って理論を実践に移す。

By the end of this course, you should be able to:

1. Understand the Data Science Pipeline.
2. Apply various machine learning techniques.
3. Evaluate model performance and fine-tune hyperparameters.
4. Understand and apply Neural Networks, Text Mining, and Time-Series Analysis.
5. Translate theory into practice using Python.

授業内容・方法と進度予定/Contents and progress schedule of the class

各講義は4〜5つのセッションに分けて行います。1つのセッションは約15〜20分です。
Each lecture will be subdivided into 4-5 smaller sections (15-20 minutes per section).

1. Data Science Pipeline - データサイエンスパイプライン
2. Data Preprocessing - データ前処理
3. Data Exploration - データ探索
4. Classification - 分類
5. Regression - 回帰
6. Ensemble Methods - アンサンブル手法
7. Model Evaluation and Hyperparameter Tuning - モデル評価とハイパーパラメータチューニング
8. Multi-class Classification - 多クラス分類
9. Dimensionality Reduction - 次元削減
10. Clustering - クラスタリング
11. Anomaly Detection - 異常検出
12. Neural Networks - ニューラルネットワーク
13. Text Mining & NLP - テキストマイニング＆自然言語処理
14. Time-Series Analysis - 時系列分析
15. Advanced Model Interpretability - 高度なモデル解釈性

(The topic of each week may be adjusted depending on circumstances.)

成績評価方法/Evaluation Method

週次実践コース：80%
授業内演習：20%
最終試験やレポートはありません。
正当な理由がある場合、講義時間と実習セッションの両方でオンライン参加が可能です。

Weekly Practice Course: 80%
In-Class Exercises: 20%
There will be no report and no final exam.
With a valid reason, online participation is possible for both the lecture and the practice session.

教科書および参考書/Textbook and references

授業時間外学修/Study Outside of Class

このコースは包括的に設計されていますが、追加の学習と練習が推奨されます。自主学習をサポートするために、オンラインリソース（LLMプロンプトなど）と推奨図書が共有されます。

授業時間外学修（Ｅ）

Although the course is designed to be comprehensive, additional study and practice are encouraged. Online resources (such as LLM prompts) and recommended reading will be shared to assist your independent study.

オフィスアワー/Office Hours

オフィスアワーは設けておりませんが、コースに関する質問や不明点は、samy.baladram@tohoku.ac.jp までメールでお問い合わせください。

オフィスアワー（Ｅ）

Office hours are not available for this course. For any inquiries or clarifications related to the course, please email samy.baladram@tohoku.ac.jp.

その他/In addition

実習のため、ラップトップを持参してください。PCが必要な場合は、事前にお知らせください。
Please bring your laptop for practice sessions. If you need a PC, let us know in advance.