As the amount of data grows rapidly, many fields now demand new high performance computing techniques for processing large amounts of data. To meet this demand, researchers have developed new computational models, algorithms, and highly productive software design methods that make effective use of modern computer systems with high performance hardware.
This course introduces students to topics at the frontier of high performance computing, including its concepts, computational models, and techniques. More specifically, it introduces new algorithms and systems that use modern high performance hardware from two viewpoints: from a macro viewpoint, cloud storage and the MapReduce framework; and from a micro viewpoint, concurrent data management for multicore processors, such as lock-free algorithms and transactional memory.
At the end of this course, students will:
- Understand the organization of modern computer systems with high performance hardware,
- Understand the concepts and techniques of new parallel processing models and algorithms to use these systems,
- Understand highly abstracted parallel computing environments for cloud computing, such as cloud storage and the MapReduce framework,
- Understand advanced algorithms for high performance computing and concurrent data access methods on modern multicore processors, and
- Be able to apply the acquired computational models and algorithms to emerging software, both in the IT field and in the many other fields that need to handle large amounts of data and/or high performance computing.
cloud computing, cloud storage, MapReduce framework, cache-conscious algorithm, concurrent data access, lock-free algorithm, transactional memory
|✔ Specialist skills||Intercultural skills||Communication skills||Critical thinking skills||Practical and/or problem-solving skills|
After each class, students must thoroughly review the subjects listed in the required learning column and study related topics on their own.
|Class||Course schedule||Required learning|
|Class 1||Large data processing by cloud computing||Understand the present situation of cloud services.|
|Class 2||Key-value store and its data model||Understand the properties of key-value stores.|
|Class 3||Consistency models for cloud storage||Understand the difference between the consistency model in a database and that in cloud storage.|
|Class 4||Organization of cloud data storage||Understand distributed algorithms used by cloud storage and their purposes.|
|Class 5||MapReduce framework and its computational model||Understand the roles and restrictions of each phase in the MapReduce framework.|
|Class 6||Large text processing algorithms for MapReduce||Understand an algorithm to build an inverted index with MapReduce.|
|Class 7||Large graph processing algorithms for MapReduce||Understand the calculation of PageRank with MapReduce.|
|Class 8||Memory hierarchy and high performance computing||Understand the relationship between data access patterns and their effect on cache memory.|
|Class 9||Cache-conscious data allocation||Understand the relationship between cache-conscious data allocation and cache memory.|
|Class 10||Cache-conscious search algorithms||Understand the relationship between cache-conscious search algorithms and cache memory.|
|Class 11||Atomic operations and synchronization||Understand basic concurrent data access methods with locking.|
|Class 12||Formal model for concurrent threads||Understand the definition of linearizability.|
|Class 13||Concurrent data access: lock-free algorithms||Understand some concurrent data access methods without locking.|
|Class 14||Concurrent data access: software transactional memory||Understand the principle of software transactional memory.|
|Class 15||Concurrent data access: hardware transactional memory||Understand the mechanism and limitations of best-effort hardware transactional memory.|
None required. Handouts used in class can be found on OCW-i.
J. Lin, C. Dyer, "Data-Intensive Text Processing with MapReduce", Morgan & Claypool Publishers
T. Harris, J. Larus, R. Rajwar, "Transactional Memory", 2nd edition, Morgan & Claypool Publishers
Students will be assessed on their understanding of principles of cloud computing and its parallel processing, the relationship between memory hierarchy and algorithms, and the basics of concurrent data access. Students’ course scores are based on a midterm assignment (40%) and a final exam (60%).
Knowledge of the following is desirable:
- Distributed algorithms
- Concurrent programming
- Computer architecture