This course provides a comprehensive overview of bioinformatics, in which living systems are modeled and analyzed as information systems. The fundamental notions and methods of genome sequence analysis, protein structural bioinformatics, and cheminformatics are introduced with illustrative examples from recent research.
This course aims to show students real examples of how computing technology is applied in society, especially how various mathematical methods are combined to extract meaning from vast and ambiguous real-world data.
Upon successful completion of this course, students will be able to:
1) Explain the fundamentals of bioinformatics,
2) Explain novel mathematical methods for extracting meaning from various kinds of data, and
3) Explain examples of how computing technology is applied in society.
Genome sequence analysis, protein structural bioinformatics, cheminformatics, computational biology
✔ Specialist skills | Intercultural skills | Communication skills | ✔ Critical thinking skills | Practical and/or problem-solving skills |
Each class begins with an explanation of a new topic. Students are occasionally given exercise problems to solve in class. Students are asked to submit final reports.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Overview, Pairwise sequence alignment | Genome sequence, Sequence analysis, Global/Local alignment |
Class 2 | Clustering, Phylogenetic tree | Hierarchical clustering, Distance matrix, Bootstrap |
Class 3 | Multiple sequence alignment, Sequence motifs | Approximation methods for multiple alignments, Regular expression, Profile matrix, Hidden Markov model |
Class 4 | Sequence motifs (Cont'd), Coding region prediction | Markov model, Hidden Markov model |
Class 5 | Homology search from databases | E-value, P-value, FASTA, BLAST, PSI-BLAST |
Class 6 | Homology search from databases (Cont'd), Sequence assembly | BLAT, GHOST, Hamiltonian path, Eulerian path |
Class 7 | Protein structure comparison, structure classification | Protein structure, structure-function relationship, structure comparison, structure classification |
Class 8 | Protein secondary structure prediction | Protein structure prediction based on machine learning methods |
Class 9 | Protein tertiary structure prediction | Comparative modeling, de novo prediction |
Class 10 | Protein docking simulation | Protein-protein docking, protein-ligand docking, virtual screening |
Class 11 | Molecular simulation | Molecular dynamics, quantum chemistry |
Class 12 | Comparison of chemical structures | SMILES, SMARTS, molecular fingerprint, MCS |
Class 13 | Molecular activity prediction | Neural fingerprint, graph convolution network |
Class 14 | Molecular design | Generative model, VAE, GAN, reinforcement learning |
For effective learning, students are encouraged to spend approximately 30 minutes preparing for each class and another 120 minutes afterwards reviewing its content (including assignments).
They should do so by referring to textbooks and other course materials.
Original class slides are provided.
Mount, David. Bioinformatics: Sequence and Genome Analysis (2nd edition). Cold Spring Harbor Laboratory Press; ISBN-13: 978-087969712-9
Students' knowledge and their ability to apply it to solving problems will be assessed through the final report.
None
Yutaka Akiyama: akiyama[at]c.titech.ac.jp
Takashi Ishida: ishida[at]c.titech.ac.jp