Resilience of Mutual Exclusion Algorithms to Transient Memory Faults

Background

PODC is a (if not the) major conference in the area of distributed algorithms. At this year's PODC 2011, to be held in San Jose, California, next month, a paper will be presented by Thomas Moscibroda (Microsoft) and Rotem Oshman (MIT) on Resilience of Mutual Exclusion Algorithms to Transient Memory Faults. The issue addressed in this paper is that existing mutual exclusion algorithms, such as the algorithms of Peterson and Dekker (that we discussed in the operating systems course), are not resilient to transient memory faults. Amongst others, the authors present a 2-process mutual exclusion algorithm which uses three binary registers and tolerates a single memory fault. Also, by replacing one read/write register with a test&set register, they claim to obtain a mutual exclusion algorithm that can tolerate runs in which one variable experiences unboundedly many faults. The authors claim they have verified the first algorithm using the nuSMV2 model checker.

Goal

The goal of this homework assignment is to study and understand the paper of Moscibroda and Oshman, and to check as many results as possible using nuSMV2. In particular, I want you to check Theorem 5.1 using nuSMV2. Below some other challenges are listed. Please note that only in exceptional cases I will give a grade larger than 10. It is ok to work on this assignment in groups of up to 3 students.

Problem 1 (6 points)

Check Theorem 5.1 using nuSMV2.

Problem 2 (2 points)

Check Theorem 7.1 using nuSMV2.

Problem 3 (2 points)

Prove Theorem 4.1 (by hand).

Problem 4 (2 points)

Use nuSMV2 to check instances of Corollary 1.

Problem 5 (2 points)

Use nuSMV2 to check instances of Corollary 2.

Problem 6 (bonus)

Suggests improvemements to the paper or solve one of the open problems.

Reflection

Did you find your efforts to formally model and reason about this protocol useful? Did it help you to gain additional insight in the protocol? If so, what did you learn? What was the most difficult part of this assignment? How sure are you, after your efforts, that the Moscibroda & Oshman algorithms are correct? If you have not yet reached the point where you are 100% sure about the correctness, what additional efforts would be needed to get there?

Products

Your report + nuSMV2 models should be sent in by June 14, 2011.