2023 | Robot Audition and Scene Analysis

Academic unit or major: Graduate major in Systems and Control Engineering

Instructor(s): Nakadai Kazuhiro

Class Format: Lecture (Face-to-face)

Day/Period (Room No.): Tue 3-4 (W9-325 (W934))

Group: -

Course number: SCE.I434

Credits: 1

Academic year: 2023

Offered quarter: 1Q

Syllabus updated: 2023/9/13

Lecture notes updated: -

Language used: English

Syllabus

Course description and aims

In this course, students study robot audition, the technologies that give a robot ears capable of listening to several simultaneous sounds, and scene understanding, which analyzes and interprets the surrounding environment, including its sounds. Both fields are examined from multiple perspectives: their underlying technologies, their evolution, current progress, and future prospects. Specifically, the course covers technologies for sound source localization, sound source separation, sound source classification, and automatic speech recognition, based on auditory processing, acoustic signal processing, and machine learning.

Student learning outcomes

By taking this course, students will acquire the following knowledge and skills:
 Understand and correctly explain multiple aspects of sound-related research areas such as robot audition and scene understanding.
 Understand and explain sound-related technologies such as sound source localization, sound source separation, sound source classification, and automatic speech recognition.

Course taught by instructors with work experience

✔ Applicable
How instructors' work experience benefits the course: The class will be given by a professor who established this research field and has led it since 2000.

Keywords

robot audition, scene analysis, acoustic signal processing, machine learning, deep learning, sound source localization, sound source tracking, sound source separation, sound source classification, automatic speech recognition

Competencies that will be developed

✔ Specialist skills

Intercultural skills

Communication skills

✔ Critical thinking skills

✔ Practical and/or problem-solving skills

Class flow

The topic of each class is explained according to the planned structure. Group discussions may be held, and written reports related to the class content may be assigned.

Course schedule/Required learning

	Course schedule	Required learning
Class 1	Overview and evolution of robot audition and scene analysis	Understand the overview and evolution of the robot audition and scene analysis research areas and how they relate to each other.
Class 2	Auditory scene analysis and computational auditory scene analysis	Review acoustic signal processing as a basis for robot audition and understand computational auditory scene analysis as a precursor research area of robot audition.
Class 3	Binaural robot audition	Understand the technology to listen to simultaneous sounds with two ears/microphones as humans and animals do.
Class 4	Microphone array-based robot audition	Understand sound source localization and sound source separation techniques using a microphone array consisting of multiple microphones.
Class 5	Robot audition in extreme environments – extreme audition	Understand the challenges that arise when robot audition technology is applied to extreme environments, and the approaches to solving them.
Class 6	Robot audition using deep learning	Understand the overview, mechanisms, and technology trends of sound source localization, sound source separation, sound source classification, and automatic speech recognition using deep learning with neural networks.
Class 7	Software platforms and the future of robot audition	Understand the design concept, advantages, and challenges of HARK, an open source software platform for robot audition, from the perspective of applying it to real-world problems. Also discuss the future of robot audition and scene analysis technology.

Out-of-Class Study Time (Preparation and Review)

To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.

Textbook(s)

Unspecified.

Reference books, course materials, etc.

Unspecified.

Assessment criteria and methods

Comprehension and consideration of lecture content will be evaluated. Grading will be based on the assignments given in each class and the final report.

Related courses

SCE.I433: Intelligent Communication and Social Interaction
SCE.I501: Image Recognition
SCE.I406: Machine Learning Framework

Prerequisites (i.e., required knowledge, skills, courses, etc.)

Students should have basic knowledge of digital signal processing and machine learning at the undergraduate level.

Contact information (e-mail and phone) Notice: Please replace "[at]" with "@" (half-width character).

nakadai[at]ra.sc.e.titech.ac.jp

TOKYO INSTITUTE OF TECHNOLOGY