This course provides fundamental knowledge and applied techniques for information organization and retrieval, which are necessary for making use of information, specifically textual data. The content consists of two parts: information retrieval and related techniques. The information retrieval part covers the techniques that support information retrieval systems and the methods for evaluating those techniques. The related techniques part covers information filtering, document categorization, clustering, Web mining, and recommendation systems.
This course aims to teach the knowledge and skills needed to understand the inner workings of information retrieval systems, through the related techniques and their evaluation. It also aims to identify the relation between information retrieval and artificial intelligence research, such as natural language processing and Web mining.
Students will be able to explain the following items.
(a) interaction between a user and computer in information retrieval
(b) architectures for information retrieval systems and techniques for each component
(c) experiments, data sets, and the interpretation and presentation of results in the evaluation of information retrieval
(d) techniques related to information retrieval and organization
information retrieval, information organization, information needs, indexing, term weighting, retrieval model, relevance feedback, test collections, information filtering, document categorization, clustering, Web mining, recommendation systems, natural language processing, artificial intelligence
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | Practical and/or problem-solving skills |
The material is organized as presentation slides, and the following three steps are repeated for each slide: 1) students take notes on the slide projected on the screen, 2) the content is explained, and 3) Q&A and optional exercises follow. The material is available only on the screen, so students must take notes during class. Every student is given sufficient time to finish taking notes before the class proceeds to the next slide.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Introduction to information retrieval | Explain information retrieval as a human behavior and as an interactive process with a computer system. |
Class 2 | Information needs | Explain information needs focusing on the relation to queries. |
Class 3 | Indexing | Explain the purpose and process of indexing. |
Class 4 | Extracting index terms | Explain the types of index terms, such as characters and words, and the methods for extracting them. |
Class 5 | Term weighting | Explain the motivation and effect of term weights. |
Class 6 | Boolean model | Explain the concept, implementation, and process for the Boolean model. |
Class 7 | Vector space model | Explain the concept, implementation, and process for the vector space model. |
Class 8 | Relevance feedback | Explain the concept and implementation of relevance feedback. |
Class 9 | Evaluation of information retrieval | Explain the motivation, purpose, and methods of evaluation of information retrieval. |
Class 10 | Test collections for information retrieval | Explain the role of test collections in the evaluation of information retrieval. |
Class 11 | Evaluation measures for information retrieval | Explain how to interpret and present experimental results using representative measures for the evaluation of information retrieval. |
Class 12 | Techniques related to information retrieval | Explain techniques related to information retrieval, such as information filtering and document categorization. |
Class 13 | Web mining | Explain Web mining in terms of the content, structure, and usage of the Web. |
Class 14 | Web retrieval model | Explain the model for Web retrieval that uses Web mining. |
Class 15 | Recommender systems | Explain advantages and disadvantages of different methods for recommendation systems. |
No textbook
Manning, C. D., Raghavan, P., and Schütze, H. Introduction to Information Retrieval, Cambridge University Press, 2008.
Liu, B. Web Data Mining, Springer, 2007.
Written examinations: midterm (50%) and term-end (50%)
No requirement