Authors: Sahil Dhoked, Wojciech Golab, Neeraj Mittal
Publisher: Springer
Year: 2023
Pages: 132
Language: English
Format: pdf (true), epub
Size: 20.8 MB
This book discusses recent research on designing efficient fault-tolerant synchronization mechanisms for concurrent processes using the relatively new persistent memory technology, which combines the low latency of DRAM with the persistence of magnetic disks. The authors include all of the major contributions published to date, and also convey some perspective on how the problem itself is evolving. The results are described at a high level so that readers can quickly gain a thorough understanding of the recoverable mutual exclusion (RME) problem and its nuances, of the solutions that have been designed for it under a variety of important conditions, and of how those solutions compare to each other.
Inspired by the new possibilities offered by multiprocessor architectures equipped with persistent memory, Golab and Ramaraju recently formalized a fault-tolerant variation of the classic mutual exclusion problem, called Recoverable Mutual Exclusion (RME). Their conceptual model abstracts away many of the low-level technicalities in earlier practitioner-oriented work on fault-tolerant locks, such as how failures are detected and how a stuck critical section is reclaimed, using a simple but powerful assumption: a process that crashes while accessing a recoverable lock must eventually recover and make another attempt to acquire and release the lock.
Generally, algorithms for mutual exclusion are designed with the assumption that failures do not occur at inopportune times, such as while a process is accessing a lock or a shared resource. However, such failures can occur in the real world. A power outage or network failure might create an unrecoverable situation causing processes to stall or enter an erroneous state. Traditional mutual exclusion algorithms, which are not designed to operate properly in the presence of failures, may fail to guarantee vital correctness properties under such adverse conditions. For example, deadlock could arise if a failure occurs while some process is in the critical section of a lock, leading to potentially disastrous consequences for users of a mission-critical system. This observation gives rise to the recoverable mutual exclusion (RME) problem. The RME problem involves designing an algorithm that ensures mutual exclusion, along with other necessary correctness properties, under the assumption that process failures may occur at any point during their execution, but the system is able to resurrect failed processes to facilitate recovery.