2021 | Advanced Machine Learning

Academic unit or major: Graduate major in Artificial Intelligence

Instructor(s): Okazaki Naoaki, Shimosaka Masamichi

Class Format: Lecture

Media-enhanced courses

Day/Period(Room No.): Tue3-4() Fri3-4()

Group: -

Course number: ART.T458

Credits: 2

Academic year: 2021

Offered quarter: 2Q

Syllabus updated: 2021/3/19

Lecture notes updated: -

Language used: English

Syllabus

Course description and aims

This course introduces the fundamentals of machine learning and deep learning.

Student learning outcomes

[Goal]
- Understand basic concepts (e.g., classification, convex optimization) and methods (e.g., stochastic gradient descent, backpropagation) for discriminative machine learning models.
- Implement machine learning methods using toolkits and programming.
[Theme] The first half of the course covers the basic concepts of machine learning through linear models and optimization. The second half presents the fundamentals and practice of deep learning.

Keywords

Machine learning, regression, classification, optimization, linear model, neural network, deep learning

Competencies that will be developed

✔ Specialist skills

Intercultural skills

Communication skills

Critical thinking skills

Practical and/or problem-solving skills

Class flow

The lectures include explanations of machine learning toolkits and hands-on exercises with them.

Course schedule/Required learning

	Course schedule	Required learning
Class 1	Introduction	Basic concepts of machine learning
Class 2	Linear Model 1	Loss functions, empirical loss minimization, overfitting, regularization, bias and variance, linear model (linear regression)
Class 3	Optimization 1	Concept of optimization, gradient methods, constrained optimization
Class 4	Optimization 2	Convex optimization, duality
Class 5	Linear Model 2	Linear model (classification), logistic regression, linear and kernel support vector machines
Class 6	Linear Model 3	L1 regularization, sparse learning, Lasso
Class 7	Scalable Learning	Stochastic gradient, accelerated gradient methods, momentum, mini-batch, distributed parallel training
Class 8	Introduction to Deep Learning	Real-world applications
Class 9	Feedforward Neural Network (I)	Binary classification, Threshold Logic Units (TLUs), Single-layer Perceptron (SLP), Perceptron algorithm, sigmoid function, Stochastic Gradient Descent (SGD), Multi-layer Perceptron (MLP), backpropagation, computation graph, automatic differentiation, Universal Approximation Theorem
Class 10	Feedforward Neural Network (II)	Multi-class classification, linear multi-class classifier, softmax function, Stochastic Gradient Descent (SGD), mini-batch training, loss functions, activation functions, dropout
Class 11	Convolutional Neural Network	Convolution, image filter, pooling, convolutional neural network, ImageNet, AlexNet, ResNet
Class 12	Word Embeddings	Word embeddings, distributed representation, distributional hypothesis, pointwise mutual information, singular value decomposition, word2vec, word analogy, GloVe, fastText
Class 13	DNNs for Structured Data	Recurrent Neural Networks (RNNs), vanishing and exploding gradients, Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), Recursive Neural Network, Tree-structured LSTM
Class 14	Encoder-Decoder Modeling	Language modeling, Recurrent Neural Network Language Model (RNNLM), encoder-decoder models, sequence-to-sequence models, attention mechanism, Convolutional Sequence to Sequence (ConvS2S), Transformer, ELMo, BERT

Out-of-Class Study Time (Preparation and Review)

To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.

Textbook(s)

Handouts will be given when necessary.

Reference books, course materials, etc.

- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press. 2016.
- Christopher Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer. 2010.

Assessment criteria and methods

Course marks are based on assignments (70%) and exercises (30%).

Related courses

MCS.T507: Theory of Statistical Mathematics
MCS.T403: Statistical Learning Theory
CSC.T352: Pattern Recognition
CSC.T272: Artificial Intelligence
CSC.T242: Probability Theory and Statistics

Prerequisites (i.e., required knowledge, skills, courses, etc.)

None

Other

None.

TOKYO INSTITUTE OF TECHNOLOGY