2016 Computational Linguistics

Academic unit or major
Graduate major in Information and Communications Engineering
Instructor(s)
Takamura Hiroya 
Class Format
Lecture     
Media-enhanced courses
Day/Period(Room No.)
Mon3-4(G224)  Thu3-4(G224)
Group
-
Course number
ICT.H410
Credits
2
Academic year
2016
Offered quarter
2Q
Syllabus updated
2016/4/27
Lecture notes updated
-
Language used
English

Course description and aims

To make use of huge text data such as web text, automatic processing by computer is necessary. In natural language processing, computers recognize words in text represented as sequences of characters, find phrases, and estimate syntactic structures. This course gives students an opportunity to learn the basic ideas and knowledge of the field and its methods, especially those based on machine learning. Applications such as machine translation, text summarization, and sentiment analysis are also covered, together with their mathematical models. Mathematical approaches to language study are briefly explained as well.

Student learning outcomes

By the end of this course, students will be able to:
(i) read and understand research papers in the natural language processing field
(ii) use basic techniques of natural language processing such as part-of-speech tagging and syntactic parsing
(iii) derive the mathematical formulas of basic machine learning methods used in natural language processing

Keywords

computational linguistics, natural language processing, machine learning, text mining

Competencies that will be developed

Specialist skills Intercultural skills Communication skills Critical thinking skills Practical and/or problem-solving skills

Class flow

At the beginning of each class, assignments given in the previous class are reviewed, followed by a lecture.
Homework assignments include reading assignments, exercise problems, and programming assignments.

Course schedule/Required learning

Class | Course schedule | Required learning
Class 1 | Part-of-speech tagging with HMM | Understand the probabilistic model of HMM-based POS tagging and its decoding with dynamic programming.
Class 2 | Text classification with naive Bayes classifiers | Learn the multinomial model and the multivariate Bernoulli model of naive Bayes classifiers, and learn the idea of generative models.
Class 3 | Basic knowledge of optimization and parameter estimation | Learn constrained optimization based on the method of Lagrange multipliers and its application to parameter estimation.
Class 4 | Mathematical representation of documents and classification with support vector machines | Learn the bag-of-words representation of a document and its variants, as well as classification with support vector machines.
Class 5 | Named-entity recognition and dependency parsing with sequential tagging | Understand how named-entity extraction and dependency analysis are implemented as sequential classification.
Class 6 | Probabilistic model for sequential tagging | Understand the log-linear model and its variant for sequence data, conditional random fields.
Class 7 | Text summarization | Learn the basics of text summarization and understand the importance of optimization problems in this task.
Class 8 | Methods for text clustering | Learn k-means clustering, Gaussian mixture clustering, the EM algorithm, and probabilistic latent semantic analysis.
Class 9 | Generative models of documents | Understand latent Dirichlet allocation and Gibbs sampling for it.
Class 10 | Language resources and algorithm implementation | Become familiar with various language resources and tools and learn how to use them.
Class 11 | Sophisticated methods for representing words, sentences, and documents | Learn the distributed representations of words, sentences, and documents.
Class 12 | Sentiment analysis of text | Learn the various tasks in sentiment analysis of text and the methods used for them.
Class 13 | Machine translation | Learn about the IBM model, which is a statistical machine translation model, and understand the basic part of its algorithm.
Class 14 | Basic knowledge for language study | Learn the computational methods that are used for language study and the research areas for which computational methods are useful.
Class 15 | Mathematical methods for language study | Learn instances of language studies that use computational approaches.
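
As a minimal sketch of the dynamic-programming decoding named for Class 1, the following Python shows Viterbi decoding for a bigram HMM tagger. It is an illustration under stated assumptions, not course material: the function name `viterbi`, the two-tag set, and all probabilities are invented toy values.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Work in log space; a zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: best log-probability of any tag path ending in tag t at position i
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    # back[i][t]: previous tag on that best path
    back = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Choose the best previous tag s for the transition s -> t.
            prev, score = max(
                ((s, best[i - 1][s] + lp(trans_p[s].get(t, 0.0))) for s in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (hypothetical values chosen only for illustration).
TAGS = ["N", "V"]
START_P = {"N": 0.8, "V": 0.2}
TRANS_P = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
EMIT_P = {
    "N": {"dogs": 0.4, "fish": 0.5, "sleep": 0.1},
    "V": {"fish": 0.4, "sleep": 0.6},
}

if __name__ == "__main__":
    print(viterbi(["dogs", "sleep"], TAGS, START_P, TRANS_P, EMIT_P))
```

The table `best` stores one score per position and tag, so decoding takes time proportional to the sentence length times the square of the number of tags, rather than enumerating every tag sequence.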

Textbook(s)

None.

Reference books, course materials, etc.

None.

Assessment criteria and methods

Students' knowledge of, and practical skills in, natural language processing and mathematical models for language will be assessed.
Exercise problems 40%, term paper 60%.

Related courses

  • ICT.H508 : Language Engineering
  • ART.T459 : Natural Language Processing
  • ICT.S311 : Machine Learning (ICT)
  • ART.T458 : Machine Learning

Prerequisites (i.e., required knowledge, skills, courses, etc.)

None.
