Design methodologies for dependable computer systems are becoming more important: as computer systems grow more complex, designing them to be dependable becomes more challenging. In this course, students learn the basic concepts of dependable computing, such as fault avoidance, fault tolerance, static and dynamic fault masking, and dependability calculation methods. As elemental technologies for constructing dependable computer systems, students also learn circuit testing methods to detect faults in electronic circuits, error control codes to detect and correct errors in data, dependability techniques for distributed storage systems and networks, security, and cryptography.
The aim of this course is to provide systematic knowledge of dependability techniques for computer systems and to introduce elemental techniques for improving system dependability. On this basis, students will acquire the ability to design and construct dependable computer systems.
At the end of this course, students will be able to:
1) Understand the basic concepts of dependable computing, such as fault, error, and failure.
2) Understand reliability metrics, such as failure rate, reliability, and mean time to failure (MTTF), and solve basic reliability calculation problems.
3) Understand basic techniques for fault-tolerant systems, namely static/dynamic masking and fail-safe design.
4) Understand fault-tolerant algorithms for distributed systems.
5) Understand error control coding techniques, such as bit error control codes, BCH codes, Reed-Solomon codes, low-density parity-check codes, product codes, and interleaving.
6) Understand design methods for dependable electronic circuits, such as circuit testing and design for testability.
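
As an illustration of the kind of basic reliability calculation named in item 2, the following is a minimal sketch, not course material. It assumes a constant failure rate (so that R(t) = exp(-λt) and MTTF = 1/λ) and a triple modular redundancy (TMR) system with a perfect voter; the function names and the numeric values are chosen here for illustration only.

```python
import math


def reliability(failure_rate: float, t: float) -> float:
    """Reliability R(t) = exp(-lambda * t), assuming a constant failure rate."""
    return math.exp(-failure_rate * t)


def mttf(failure_rate: float) -> float:
    """Mean time to failure of a single module with constant failure rate."""
    return 1.0 / failure_rate


def tmr_reliability(r: float) -> float:
    """Reliability of a 2-out-of-3 majority system with a perfect voter.

    The system works if all three modules work (r^3) or exactly two work
    (3 * r^2 * (1 - r)), which simplifies to 3r^2 - 2r^3.
    """
    return 3 * r**2 - 2 * r**3


# Illustrative numbers (assumptions, not taken from the course):
lam = 1e-4      # failures per hour
t = 1000.0      # mission time in hours

r_single = reliability(lam, t)
r_tmr = tmr_reliability(r_single)

print(f"Single module R(t): {r_single:.4f}")
print(f"TMR system R(t):    {r_tmr:.4f}")
print(f"Single module MTTF: {mttf(lam):.0f} hours")
```

With these numbers the TMR system is more reliable than a single module at the chosen mission time, which holds whenever the module reliability is above 0.5.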
dependable system, fault tolerance, circuit testing, distributed algorithm, error control code
✔ Specialist skills | Intercultural skills | Communication skills | Critical thinking skills | Practical and/or problem-solving skills |
Lecture handouts will be uploaded to OCWi before each class.
Towards the end of each class, students are given exercise problems on the material taught that day.
Answers to the exercise problems are given at the beginning of the next class.
Class | Course schedule | Required learning |
---|---|---|
Class 1 | Fault, error, failure, dependability, and reliability | Definitions of fault, error, and failure. Calculation methods of failure rate, reliability, and MTTF. |
Class 2 | Dependable design by redundancy | Static masking, TMR, and nMR. |
Class 3 | System reconstruction and recovery techniques, and fail-safe techniques | Dynamic masking, checkpointing, and fail-safe. |
Class 4 | Circuit testing: test generation (D-algorithm) | D-algorithm (circuit testing) |
Class 5 | Circuit testing: test generation (PODEM) | PODEM (circuit testing) |
Class 6 | Circuit testing: testability, fault simulation, design for testability | Testability, fault simulation, scan design, built-in self-test |
Class 7 | Error control codes: Galois field, linear space, minimum distance | Prime field, linear space, minimum distance, generator matrix, parity-check matrix |
Class 8 | Error control codes for data bus systems | Cyclic code, generator polynomial, parity-check polynomial |
Class 9 | Error control codes for SRAM/DRAM/Flash memories | Hamming code, odd-weight-column SEC-DED code, BCH code |
Class 10 | Error control codes for optical discs | Reed-Solomon code, product code, concatenated code |
Class 11 | Error control codes for magnetic disks and disk arrays | Fire code, interleaving, low-density parity-check code, EVENODD code |
Class 12 | Dependability of distributed systems: Byzantine generals problem | Algorithm for the Byzantine generals problem |
Class 13 | Dependability of distributed systems: dependable storages/networks | Techniques for dependable storage and network |
Class 14 | Dependability of distributed systems: security and cryptography | Symmetric key cryptography, public key cryptography, RSA, AES |
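
As a small illustration of the error control codes listed for Class 9, the following is a minimal sketch of a Hamming (7,4) single-error-correcting code. It is not taken from the course handouts; the bit layout (parity bits at positions 1, 2, and 4) and the function names are assumptions made for this example.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Positions are numbered 1..7; parity bits sit at positions 1, 2, and 4,
    and data bits at positions 3, 5, 6, and 7.
    """
    code = [0] * 8  # index 0 is unused so that indices match positions
    code[3], code[5], code[6], code[7] = data
    for p in (1, 2, 4):
        # Parity bit p covers every position whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    return code[1:]


def syndrome(word):
    """XOR of the positions of all 1 bits; zero for a valid codeword."""
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            s ^= pos
    return s


def decode(word):
    """Correct at most one flipped bit and return the 4 data bits."""
    fixed = list(word)
    s = syndrome(fixed)
    if s:
        fixed[s - 1] ^= 1  # the syndrome equals the position of the error
    return [fixed[2], fixed[4], fixed[5], fixed[6]]


data = [1, 0, 1, 1]
codeword = encode(data)

received = list(codeword)
received[4] ^= 1  # flip the bit at position 5 to simulate a single error

print("codeword:", codeword)
print("received:", received, "syndrome:", syndrome(received))
print("decoded: ", decode(received))
```

This sketch corrects any single-bit error but cannot detect double errors; the SEC-DED codes named in the schedule add that capability.
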
To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.
None required
(Course materials are provided via OCWi)
Fujiwara, Eiji. Code Design for Dependable Systems. Wiley-InterScience, ISBN: 978-0471756187.
Yoneda, Tomohiro et al., Dependable systems, Tokyo: Kyoritsu-Syuppan, ISBN: 978-4320121522. (Japanese)
Students will be assessed on their understanding of the fundamental theories of dependable computing, circuit testing methods, error control coding, and dependability methods for distributed systems.
Students' course scores are based on reports (about four in total).
No prerequisites are necessary, but enrollment in related courses is desirable.