Dealing with Deadlock

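Recap: Necessary Conditions for Deadlock

System deadlocks (1971): the four necessary conditions for deadlock

Described in terms of "resources"
- State-machine view: a resource is a lock held in the current state (a campus card / a ball)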

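Mutual-exclusion - a campus card can be held by only one person at a time
Wait-for - a person waiting for another card does not release the cards they already hold
No-preemption - a card cannot be taken away from the person holding it
Circular-chain - the waits for cards form a cycle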
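
To make the circular-chain condition concrete, here is a minimal sketch that looks for a cycle in the "who waits for whose card" relation. The function name `find_wait_cycle` and the two dictionaries are hypothetical, chosen only for illustration. The sketch assumes each person waits for at most one card at a time.

```python
def find_wait_cycle(holder, waiting_for):
    """Return a list of people forming a circular wait, or None.

    holder:      card -> person currently holding it (mutual exclusion)
    waiting_for: person -> the single card that person is blocked on
    """
    for start in waiting_for:
        path, seen = [], set()
        person = start
        # Follow "person waits for card held by next person" until the
        # chain ends (no deadlock on this path) or revisits someone.
        while person is not None and person not in seen:
            seen.add(person)
            path.append(person)
            card = waiting_for.get(person)
            person = holder.get(card) if card is not None else None
        if person is not None:
            # We came back to someone already on the path: that is the cycle.
            return path[path.index(person):]
    return None


# Alice holds card_a and wants card_b; Bob holds card_b and wants card_a.
holder = {"card_a": "alice", "card_b": "bob"}
waiting_for = {"alice": "card_b", "bob": "card_a"}
cycle = find_wait_cycle(holder, waiting_for)
```

Here `cycle` contains both Alice and Bob. If Bob were not waiting for anything, the chain would end at Bob and the function would return `None`.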

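Dealing with Deadlock: Necessary Conditions for Deadlock (cont'd)

The textbook, giving advice that is easy to give and hard to follow:

"Once you understand the causes of deadlock, especially the four necessary conditions for it, you can avoid, prevent, and recover from deadlock as far as possible. Therefore, in system design, process scheduling, and so on, pay attention to how to keep these four necessary conditions from holding and how to choose a reasonable resource-allocation algorithm, so that processes do not occupy system resources permanently. In addition, processes should be prevented from holding resources while they are waiting. Resource allocation must therefore be planned sensibly."

This does not qualify as a reasonable argument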

For toy systems/models
- We can directly prove that the system is deadlock-free
For real, complex systems
- Bullshit

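How do we avoid deadlock in real systems?

Of the four conditions, the one that is easiest to break in practice:

Avoid circular waiting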

Lock ordering

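At any moment, the number of locks in the system is finite
Acquiring all locks strictly in one fixed order (Lock Ordering) eliminates circular waiting
- "At any moment, the thread holding the 'last' lock in the order can always make progress": every lock it may still request comes even later in the order, and no thread holds one of those
Example: fixing the dining philosophers problem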
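
A minimal sketch of the fix, assuming forks are numbered 0..n-1 and that number is used as the global lock order. The names `dine`, `forks`, and `meals` are made up for this example. Each philosopher picks up the lower-numbered fork first, so the "everyone holds their left fork and waits for the right one" cycle cannot form.

```python
import threading


def dine(n=5, rounds=200):
    """Run n philosophers (n >= 2) for `rounds` meals each; return meal counts."""
    forks = [threading.Lock() for _ in range(n)]
    meals = [0] * n  # meals[i] is written only by philosopher i

    def philosopher(i):
        left, right = i, (i + 1) % n
        # Lock ordering: always take the lower-numbered fork first.
        first, second = min(left, right), max(left, right)
        for _ in range(rounds):
            with forks[first]:
                with forks[second]:
                    meals[i] += 1  # "eat" while holding both forks

    threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # returns only because no circular wait can form
    return meals


result = dine(5, 100)
```

With the naive "left fork first" rule, philosopher n-1 would take fork n-1 and then fork 0. Under the ordering above that philosopher takes fork 0 first, which is what breaks the cycle.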

Lock Ordering: in practice (Linux Kernel: rmap.c)

Emmm...

Textbooks will tell you that if you always lock in the same order, you will never get this kind of deadlock. Practice will tell you that this approach doesn't scale: when I create a new lock, I don't understand enough of the kernel to figure out where in the 5000 lock hierarchy it will fit.

The best locks are encapsulated: they never get exposed in headers, and are never held around calls to non-trivial functions outside the same file. You can read through this code and see that it will never deadlock, because it never tries to grab another lock while it has that one. People using your code don't even need to know you are using a lock.

-- Unreliable Guide to Locking by Rusty Russell
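
The "encapsulated lock" advice can be sketched in a few lines. This is an illustration under assumptions, not kernel code: the class `HitCounter` and its methods are hypothetical. The lock is a private attribute, and it is never held while calling code from outside the class.

```python
import threading


class HitCounter:
    def __init__(self):
        self._lock = threading.Lock()  # private: callers never see or take it
        self._hits = {}

    def record(self, key, on_update=None):
        # Hold the lock only around our own data, never around foreign code.
        with self._lock:
            count = self._hits.get(key, 0) + 1
            self._hits[key] = count
        # The lock is already released here, so the callback may take any
        # lock it likes, or even call back into this object.
        if on_update is not None:
            on_update(key, count)
        return count

    def snapshot(self):
        with self._lock:
            return dict(self._hits)


counter = HitCounter()
seen = []
# The callback re-enters the counter. This would hang if record() still held
# its non-reentrant lock while calling the callback.
counter.record("a", lambda k, n: seen.append((k, n, counter.snapshot())))
```

Because no other lock is ever acquired while `_lock` is held, `_lock` cannot be part of a circular wait, wherever it would sit in a lock hierarchy.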

In the end, it is still people who make the mistakes