2017 | Speech Information Processing

Academic unit or major: Graduate major in Artificial Intelligence

Instructor(s): Shinoda Koichi

Class Format: Lecture

Media-enhanced courses

Day/Period (Room No.): Mon 7-8 (W831, G111), Thu 7-8 (W831, G111)

Group: -

Course number: ART.T460

Credits: 2

Academic year: 2017

Offered quarter: 3Q

Syllabus updated: 2017/3/17

Lecture notes updated: -

Language used: English

Syllabus

Course description and aims

This course focuses on speech information processing. It first covers basic knowledge of human speech and natural language, then introduces speech information processing by machine. Next, it explains the elements of automatic speech recognition systems, including acoustic models, language models, and search algorithms, along with techniques for improving system performance and robustness, such as optimization, adaptation, and discriminative training. Finally, it introduces other applications of speech information processing, such as speech synthesis and speaker recognition.

Student learning outcomes

At the end of this course, students will be able to:
1) explain the mechanisms of human speech production and perception,
2) explain each component of a speech recognition system,
3) understand the importance of probabilistic modeling in speech recognition and explain its training and recognition algorithms,
4) build a speech recognition system on their own.

Keywords

speech information processing, speech recognition, speech analysis, speech coding, speech synthesis, acoustic models, language models, search algorithms, graphical models, hidden Markov models

Competencies that will be developed

✔ Specialist skills

Intercultural skills

Communication skills

Critical thinking skills

✔ Practical and/or problem-solving skills

Class flow

1) At the beginning of each class, the content of the previous class is reviewed.
2) At the end of each class, an assignment is given, to be submitted in the next class.
3) Attendance is taken in every class.
4) Students are encouraged to study the topics on their own before coming to class.

Course schedule/Required learning

	Course schedule	Required learning
Class 1	Speech and language	To be explained in class.
Class 2	Speech analysis, speech coding	To be explained in class.
Class 3	Introduction to speech recognition	To be explained in class.
Class 4	Graphical models	To be explained in class.
Class 5	Hidden Markov models	To be explained in class.
Class 6	Recognition and training algorithms	To be explained in class.
Class 7	Language models	To be explained in class.
Class 8	Search algorithms	To be explained in class.
Class 9	Optimization, adaptation	To be explained in class.
Class 10	Noise robustness	To be explained in class.
Class 11	Discriminative training for speech recognition	To be explained in class.
Class 12	Speech recognition applications	To be explained in class.
Class 13	Speech synthesis, voice conversion	To be explained in class.
Class 14	Speaker recognition	To be explained in class.
Class 15	Future prospects	To be explained in class.

Textbook(s)

None.

Reference books, course materials, etc.

S. Furui, "Digital Speech Processing, Synthesis, and Recognition," Marcel Dekker

Assessment criteria and methods

Students' course scores are based on an assignment in every class (20% in total) and two report assignments (80% in total).

Related courses

ART.T547: Multimedia Information Processing

Prerequisites (i.e., required knowledge, skills, courses, etc.)

Students are required to have undergraduate-level knowledge of computer science.

Other

None.

TOKYO INSTITUTE OF TECHNOLOGY