In today's society, it is essential in every field to make appropriate use of "big data" to find rules and to make predictions and decisions. This course provides the fundamental knowledge and basic skills needed to handle large-scale data sets with the aid of computers.
Students will be able to apply basic knowledge of statistics to analyze data and to evaluate the obtained results mathematically.
classification, clustering, principal component analysis, dimension reduction, training/generalization errors, cross validation
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | ✔ Practical and/or problem-solving skills |
Lectures are given via Zoom.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Class guidance | Guidance for class flow, computing environment, and programming language (Python) |
Class 2 | Fundamentals of data analysis | Learn basic knowledge about statistics and data science |
Class 3 | Classification and model evaluation | Learn methods for extracting discrimination rules from labeled data. Learn about the difference between training error and generalization error, and about methods of model evaluation. |
Class 4 | Clustering | Learn methods for categorizing unlabeled data into several categories |
Class 5 | Principal component analysis | Learn principal component analysis together with related mathematical topics |
Class 6 | Dimension reduction | Learn methods for dimension reduction such as multidimensional scaling, canonical correlation analysis and graph embedding |
Class 7 | Ensemble learning | Learn methods for ensemble learning |
To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.
None
Distributed via T2SCHOLA
Grading is based on in-class quizzes and reports.
Basic knowledge of linear algebra, differential and integral calculus, and mathematical statistics is required.
Doctoral students are required to register for XCO.T677 "Fundamentals of progressive data science" instead of XCO.T487 "Fundamentals of data science."