In today's society, it is essential in every field to make appropriate use of "big data" to find rules and to make predictions and decisions. This course provides the fundamental knowledge and basic skills for handling large-scale data sets with the aid of computers.
Students will be able to apply basic knowledge of statistics to analyze data and to evaluate the obtained results mathematically.
classification, clustering, principal component analysis, dimension reduction, training/generalization errors, cross validation
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | ✔ Practical and/or problem-solving skills |
This class will be given on the Ookayama campus. In class, students are required to solve exercise problems linked with the contents of the lecture course "XCO.T487 Fundamentals of data science".
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Class guidance | Prepare software environments for the exercises. Practice operating a Linux system. |
Class 2 | Prerequisite exam: data mining, and basic statistics and mathematics for data analysis | Understand the general idea of computation and data mining, and obtain the basic knowledge needed for this course. |
Class 3 | Fundamentals of data analysis | Understand the outline of data mining, and explain what a sample data set is. |
Class 4 | Arrangement of computing environment and warming-up of programming | Understand the outline of classification, and explain the derivation of simple rules and naive Bayesian rules. |
Class 5 | Classification and model evaluation | Understand the mechanism of simple classification rule generation and use the generator appropriately. |
Class 6 | Classification | Explain the main idea of decision tree construction. |
Class 7 | Clustering | Understand the mechanism of the decision tree construction algorithm and use the standard decision tree constructor appropriately. |
Class 8 | Clustering | Explain the concept of an association rule and its evaluation method. |
Class 9 | Principal component analysis | Understand the mechanism of association rule generation and use the standard generator appropriately. |
Class 10 | Principal component analysis | Understand the concept of regression and various evaluation methods. |
Class 11 | Dimension reduction | Understand the main idea of deriving regression rules and use the standard regression rule derivation tools appropriately. |
Class 12 | Dimension reduction | Explain the idea of clustering and basic techniques for clustering. |
Class 13 | Exercise: clustering | Understand the mechanism of standard clustering techniques and use them for identifying clusters. |
Class 14 | Advanced topics | Choose appropriate data mining methods and tools given in this course, and overview advanced topics. |
Class 15 | General discussion | Choose an appropriate method for analyzing more practical (but still basic) data mining problems, and evaluate the obtained analysis. |
Not specified
Distributed via OCW-i
Based on reports for given assignments.
Take a prerequisite exam on "linear algebra", "analysis", and "basic grammar and functions of Python 3" in the first class.
Questions can be sent by email (at any time).
Only TAC-MI students can register for this course in 2019. Students are required to obtain a Google account and to be ready to use the file upload/download functions of Google Drive.