The web has rapidly become the default medium for publishing and disseminating information, particularly via social media platforms (encompassing blogs, forums, Q&A sites, microblogs, and content sharing/recommender services), with a strong focus on textual and image/video data. The volume, veracity, and variety of that data are ever increasing, creating a need for AI technologies that assist users in accessing and making sense of data that is pertinent to them and contextualised appropriately. In this subject, we will introduce a range of such AI technologies situated across a breadth of web sources and applications, including single and multi-document summarisation, document filtering, user/document geolocation, stance prediction, question matching, answer aggregation/ranking, and user demographic prediction/debiasing.
This course is for doctoral students with an interest in NLP. The ideal participant already has some background in general NLP, but beginners in NLP are also welcome. The topics covered involve many technologies that go beyond basic NLP. However, all background information required to understand the main concepts treated in this course will be summarised in the first set of lectures and activities.
Students will become acquainted with web-related AI technologies, such as summarisation, question answering, sentiment detection, document filtering, question matching, answer ranking and aggregation, and more. Students will experience the various methods first-hand in exercises, for instance by performing annotation exercises on real text. This way, they will encounter the tasks and texts that automatic methods actually face when providing access to web resources, rather than just seeing idealised examples chosen for illustrative purposes in the literature.
As a result, students who directly work on related research should be able to put the new knowledge from this course into practice by creating more sophisticated automatic treatments of their chosen tasks. But students not directly working on NLP should also benefit. Background knowledge about AI-web technology should help them with the design of a range of multi-modal applications. All students, whatever their exact subjects, could also profit by learning to design more meaningful evaluations of their systems.
Web resources, text understanding, artificial intelligence, natural language processing
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | ✔ Practical and/or problem-solving skills |
Students will be required to read one paper (8-15 pages) per topic (6 papers in total). Because the course is intensive, they should ideally do some of this reading before the course starts. The readings will be provided well in advance.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Lecture 1 Basics of NLP and IR | Specified in the class |
Class 2 | Activity 1 Basics of NLP and IR | |
Class 3 | Lecture 2 Text Summarisation | |
Class 4 | Activity 2 Text Summarisation | |
Class 5 | Lecture 3 Citation Processing and Search | |
Class 6 | Activity 3 Citation Processing and Search | |
Class 7 | Evaluation 1 | |
Class 8 | Lecture 4 NLP for social media: document preprocessing | |
Class 9 | Activity 4 NLP for social media: document preprocessing | |
Class 10 | Lecture 5 NLP for social media: semantics and discourse | |
Class 11 | Activity 5 NLP for social media: semantics and discourse | |
Class 12 | Lecture 6 NLP for social media: socially-situated NLP | |
Class 13 | Activity 6 NLP for social media: socially-situated NLP | |
Class 14 | Evaluation 2 | |
None
Specified in the class
An exam will be conducted on all topics covered in the course.
None