This course introduces the field of natural language processing (NLP), covering fundamental concepts and techniques for processing human languages by computer. It covers the linguistic background necessary for NLP, morphological analysis, syntactic analysis, semantic analysis, discourse analysis and text generation. The course also includes some corpus linguistics.
Linguistic competence is believed to be the most prominent trait that distinguishes humans from other animals. The aim of this course is to provide students with the ability to use fundamental NLP techniques to build language-related application systems, such as information extraction, question answering and dialogue systems.
At the end of the course, students should be able to:
(1) explain basic concepts of linguistics;
(2) explain basic concepts of natural language processing;
(3) build sample application programs based on the above concepts.
computational linguistics, corpus linguistics, morphological analysis, syntactic analysis, semantic analysis, discourse analysis, language resources, text generation.
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | Practical and/or problem-solving skills |
Each class starts with a discussion of the previous class's assignment, including student presentations of their solutions, followed by a lecture on that class's specialised topics. An assignment is set after each class.
 | Course schedule | Required learning |
---|---|---|
Class 1 | Introduction: a brief history of NLP, fundamental linguistic background | Specified in the class. |
Class 2 | Morphological analysis (1): morphology, computational morphology, stemming | |
Class 3 | Morphological analysis (2): morphological analysis, POS tagset, POS tagging | |
Class 4 | Morphological analysis (3): rule-based morphological analysis, statistical morphological analysis | |
Class 5 | Syntactic analysis (1): syntax, generative grammars, unification grammars | |
Class 6 | Syntactic analysis (2): algorithms for syntactic parsing, top-down parsing, bottom-up parsing | |
Class 7 | Syntactic analysis (3): reachability, chart parsing | |
Class 8 | Semantic analysis (1): semantics, first-order logic, knowledge representation | |
Class 9 | Semantic analysis (2): case grammar, case frame, selectional restriction, lexical semantics, representation for time and space | |
Class 10 | Semantic analysis (3): word sense disambiguation, semantic role labelling | |
Class 11 | Discourse analysis (1): pragmatics, speech act theory, Gricean maxims, indirect speech acts | |
Class 12 | Discourse analysis (2): reference analysis, centring theory | |
Class 13 | Discourse analysis (3): discourse structure, discourse structure analysis, rhetorical structure theory, dialogue management | |
Class 14 | Language resources: corpora, lexicons, annotation | |
Class 15 | Text generation: text planning, microplanning, realisation | |
Not specified.
Handouts will be provided through the OCW-i system.
Nugues, P. M.: Language Processing with Perl and Prolog (2nd ed.), Springer (2014).
Jurafsky, D. & Martin, J. H.: Speech and Language Processing (2nd ed.), Prentice Hall (2009).
Allen, J.: Natural Language Understanding (2nd ed.), Benjamin/Cummings (1994).
Submitted reports of the assignments.
Programming ability.
None.