This lecture covers information organization and retrieval as a means of easily finding information about a particular subject. The first half introduces typical classification methods and explains the basics of classification and of machine learning methods for classification. It also covers the knowledge and techniques needed to organize information, such as identifiers and record identification. The second half covers information retrieval techniques in detail: basic retrieval models and evaluation methods come first, followed by lectures and discussions on recent techniques used in modern retrieval systems, such as learning to rank and online evaluation. Throughout, we will see how these technologies work using Python and other tools.
The goal is to understand the basic ideas and concepts of information organization and retrieval, and to be able to put these into practice through exercises.
Information organization, information retrieval, classification, machine learning, metadata
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | Practical and/or problem-solving skills |
This course will be taught with slides and programming exercises.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Fundamentals of Classification | Learn the definition of classification and classification systems. |
Class 2 | Major Classification Methods | Learn major classification methods (e.g., library classification). |
Class 3 | Document Classification (1) | Learn how to classify documents by using machine learning approaches. |
Class 4 | Document Classification (2) | Learn how to classify documents by using machine learning approaches. |
Class 5 | Document Classification Exercises | Exercises on document classification using Python. |
Class 6 | Identifiers and Identification (1) | Learn the types and characteristics of identifiers. |
Class 7 | Identifiers and Identification (2) | Learn the techniques of record identification. |
Class 8 | Basics of Information Retrieval (1) | Learn about the inverted index and basic retrieval models such as the Boolean model. |
Class 9 | Basics of Information Retrieval (2) | Learn about basic retrieval models such as the vector space model and the probabilistic model. |
Class 10 | Evaluation of Information Retrieval | Learn how to build test collections and how to use evaluation metrics. |
Class 11 | Information Retrieval Exercises | Practice information retrieval using Elasticsearch. |
Class 12 | Learning to Rank (1) | Learn how to perform ranking by machine learning. |
Class 13 | Learning to Rank (2) | Learn how to perform ranking by machine learning. |
Class 14 | Online Evaluation | Learn how to evaluate search systems in a real service. |
To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.
None
Handouts are provided through a website.
Course marks are based on exercises (source code, 50%) and assignments (report contents, 50%).
No requirement
None