This lecture series covers high performance parallel computing, an important technology that supports advances in science and technology. Demand for high performance computing, which involves extremely large amounts of computation and data, is increasing in fields such as molecular simulation, protein analysis, and mathematical optimization.
The contents include the standard parallel programming tools MPI and OpenMP, as well as a programming tool for GPUs/accelerators. In addition to lectures, there will be programming training on the peta-scale supercomputer TSUBAME at Tokyo Tech.
[Objective] Participants will become familiar with parallel programming tools such as MPI, OpenMP, and GPU tools through lectures and training. The objectives also include learning basic techniques for performance analysis through training on the TSUBAME supercomputer.
[Theme] Efficient use of modern computing systems, including supercomputers, requires knowledge of their architecture, in particular parallelism and hierarchy. The themes include programming with OpenMP and MPI, each of which corresponds to a level of this hierarchy, and programming on many-core accelerators (GPGPU).
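As an illustration of the shared-memory level of this hierarchy, the following is a minimal sketch in C of a loop parallelized with OpenMP. It is not taken from the course material, and the function name `parallel_sum` is hypothetical. It assumes a compiler with OpenMP support; the `reduction` clause gives each thread a private partial sum and combines the partial sums when the loop ends.

```c
#include <stddef.h>

/* Hypothetical helper: sum n doubles.
 * When OpenMP is enabled, the loop iterations are divided among threads. */
double parallel_sum(const double *x, size_t n)
{
    double sum = 0.0;

    /* reduction(+:sum): each thread accumulates into a private copy of sum,
     * and the copies are added together at the end of the loop. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += x[i];
    }
    return sum;
}
```

With GCC, OpenMP is enabled by the `-fopenmp` flag. Without it the pragma is ignored and the loop runs serially, giving the same sum up to floating-point rounding.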
parallel computing, high performance computing, multi-core, MPI, OpenMP, GPGPU
|✔ Specialist skills||Intercultural skills||Communication skills||Critical thinking skills||✔ Practical and/or problem-solving skills|
The course consists of lectures on parallel and high performance computing and programming training on the Tokyo Tech TSUBAME supercomputer.
|Class||Course schedule||Required learning|
|Class 1||Introduction to supercomputers and parallel computing||None|
|Class 2||Overview and usage of the TSUBAME supercomputer / parallel programming models||None|
|Class 3||Shared memory parallel programming with OpenMP (1) Introduction||Report for shared memory part|
|Class 4||Shared memory parallel programming with OpenMP (2) Data Parallelism and Attributes of variables||Report for shared memory part|
|Class 5||Shared memory parallel programming with OpenMP (3) Task Parallelism||Report for shared memory part|
|Class 6||Shared memory parallel programming with OpenMP (4) Mutual exclusion and bottlenecks||Report for shared memory part|
|Class 7||Distributed memory parallel programming with MPI (1) Introduction||Report for distributed memory part|
|Class 8||Distributed memory parallel programming with MPI (2) Non-blocking communication||Report for distributed memory part|
|Class 9||Distributed memory parallel programming with MPI (3) Optimization of MPI programs||Report for distributed memory part|
|Class 10||Distributed memory parallel programming with MPI (4) Network topology||Report for distributed memory part|
|Class 11||GPU programming (1) Introduction||Report for GPU part|
|Class 12||GPU programming (2) GPU Programming with OpenACC||Report for GPU part|
|Class 13||GPU programming (3) GPU Programming and Parallelism||Report for GPU part|
|Class 14||GPU programming (4) Optimization of GPU programs||Report for GPU part|
To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.
None in particular. Information on related references and websites will be provided.
Assessment is based on submitted reports. Class attendance is also taken into account.
Basic knowledge of the C language, especially pointers
Basic knowledge of Linux commands