In today's society, every field needs to make appropriate use of "big data" to find rules and to make predictions and decisions. This course provides the fundamental knowledge and basic skills needed to handle large-scale data sets with the aid of computers.
Students will be able to apply basic knowledge of statistics to analyze data and to evaluate the obtained results mathematically.
classification, clustering, principal component analysis, dimension reduction, training/generalization errors, cross validation
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | ✔ Practical and/or problem-solving skills |
Each week, we give a lecture and an exercise session.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Class guidance | Overview of the class flow, the computing environment, and the programming language used (Python) |
Class 2 | Prerequisite exam - data mining, and - basic statistics and mathematics for data analysis | Check basic knowledge of mathematics and the Python language |
Class 3 | Fundamentals of data analysis | Learn the basics of statistics and data science |
Class 4 | Setting up the computing environment and programming warm-up | Set up the computing environment and carry out simple programming exercises |
Class 5 | Classification and model evaluation | Learn methods for extracting discrimination rules from labeled data. Learn the difference between training error and generalization error, and methods of model evaluation. |
Class 6 | Classification | Do exercises on methods for extracting discrimination rules from labeled data |
Class 7 | Clustering | Learn methods for categorizing unlabeled data into several categories |
Class 8 | Clustering | Do exercises on methods for categorizing unlabeled data into several categories |
Class 9 | Principal component analysis | Learn principal component analysis together with the mathematical issues related to it |
Class 10 | Principal component analysis | Do exercises on principal component analysis and the mathematical issues related to it |
Class 11 | Dimension reduction | Learn methods for dimension reduction such as multidimensional scaling and canonical correlation analysis |
Class 12 | Dimension reduction | Do exercises on methods for dimension reduction such as multidimensional scaling and canonical correlation analysis |
Class 13 | Advanced topics | Learn methods for ensemble learning |
Class 14 | Advanced topics | Do exercises on methods for ensemble learning |
Class 15 | General discussion | Discuss possible applications of data analysis in various fields |
Not specified
Distributed via OCW-i
Based on reports for given assignments.
Students take a prerequisite exam on "linear algebra", "analysis", and "basic grammar and functions of Python3" in the first class.
Only TAC-MI students can register for this course in 2019. Students are required to obtain a Google account and to be ready to use the file upload/download functions of Google Drive.