Deadlock Avoidance

2024 Nanjing University, Operating Systems: Design and Implementation

Deadlock: a "simple" class of concurrency bug

It has a clear specification

  • Under any "reasonably fair" schedule, no thread may stop making progress

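It even has well-defined necessary conditions

  1. Mutual-exclusion - one ball per pocket; you need the ball to proceed
  2. Wait-for - whoever holds a ball wants more balls
  3. No-preemption - a ball cannot be taken away from its holder
  4. Circular-chain - the waiting-for-a-ball relationships form a cycle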
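
The circular-chain condition can be made concrete as a cycle in a "wait-for" graph. The following is a minimal sketch, not something from the lecture: the function name `has_circular_wait` and the dictionary representation (thread name mapped to the set of threads it is waiting on) are assumptions chosen for illustration.

```python
def has_circular_wait(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps a thread name to the set of threads holding a lock
    that it is blocked on (a hypothetical representation for this sketch).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on the current DFS path / finished
    color = {}

    def visit(thread):
        color[thread] = GRAY
        for holder in waits_for.get(thread, ()):
            state = color.get(holder, WHITE)
            if state == GRAY:
                # Back edge: holder is already on the current path, so
                # the waiting relationships form a cycle.
                return True
            if state == WHITE and visit(holder):
                return True
        color[thread] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(waits_for))
```

For example, `has_circular_wait({"T1": {"T2"}, "T2": {"T1"}})` reports a cycle, while a graph in which T1 waits on T2 and T2 waits on nobody does not.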

Lock ordering: avoid circular waiting

  • Always acquire locks strictly in the order of their assigned numbers
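
A minimal sketch of lock ordering, assuming every lock carries a unique integer rank. The names `OrderedLock`, `acquire_in_order`, `release_all`, and `transfer` are hypothetical and exist only for this illustration. Because every thread acquires locks in ascending rank, no thread can hold a higher-ranked lock while waiting for a lower-ranked one, so a circular wait cannot form.

```python
import threading


class OrderedLock:
    """A lock tagged with a fixed rank (the "number" used for ordering)."""

    def __init__(self, rank):
        self.rank = rank  # assumed unique across all locks
        self._lock = threading.Lock()


def acquire_in_order(*locks):
    """Acquire the given locks in ascending rank; return them in that order."""
    # set() drops duplicates so the same lock is never acquired twice.
    ordered = sorted(set(locks), key=lambda l: l.rank)
    for l in ordered:
        l._lock.acquire()
    return ordered


def release_all(ordered):
    """Release locks in the reverse of their acquisition order."""
    for l in reversed(ordered):
        l._lock.release()


def transfer(balances, src, dst, amount, locks):
    """Move `amount` from src to dst while holding both account locks."""
    held = acquire_in_order(locks[src], locks[dst])
    try:
        balances[src] -= amount
        balances[dst] += amount
    finally:
        release_all(held)
```

Two threads calling `transfer(..., "a", "b", ...)` and `transfer(..., "b", "a", ...)` concurrently both take the lock for the lower-ranked account first, whichever direction the money moves.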

Lock ordering: in practice

Linux Kernel: mm/rmap.c

However...

Unreliable Guide to Locking: Textbooks will tell you that if you always lock in the same order, you will never get this kind of deadlock. Practice will tell you that this approach doesn't scale: when I create a new lock, I don't understand enough of the kernel to figure out where in the 5000 lock hierarchy it will fit.

The best locks are encapsulated: they never get exposed in headers, and are never held around calls to non-trivial functions outside the same file. You can read through this code and see that it will never deadlock, because it never tries to grab another lock while it has that one. People using your code don't even need to know you are using a lock.
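
A minimal sketch of an encapsulated lock in the sense of the quotation above. The class `Counter` and its methods are hypothetical names for illustration. The lock is a private attribute, no other lock is taken while it is held, and the caller-supplied callback runs only after the lock has been released.

```python
import threading


class Counter:
    """A thread-safe counter whose lock is a private implementation detail."""

    def __init__(self):
        self._lock = threading.Lock()  # never handed out to callers
        self._value = 0

    def increment(self):
        # The critical section touches only this object's own state and
        # calls no outside code, so it never grabs another lock here.
        with self._lock:
            self._value += 1
            new_value = self._value
        return new_value

    def increment_and_report(self, callback):
        new_value = self.increment()
        # The callback is arbitrary outside code; it runs with the lock
        # released, so it may safely call back into this Counter.
        callback(new_value)
        return new_value
```

Callers use `increment()` without knowing a lock exists, and a callback that itself calls `increment()` does not deadlock because the lock is not held at that point.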

Deadlock: a dead end

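On one side, complex systems; on the other, unreliable humans

  • What we want
    • To mark "doing one thing" as uninterruptible
  • What we actually get
    • "Doing one thing" has to be broken down into multiple steps
    • Each step has to take the correct locks (and as few of them as possible)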

LockDoc (EuroSys'19)

  • “Only 53 percent of the variables with a documented locking rule are actually consistently accessed with the required locks held.”