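Many people (your classmates, your parents) share one belief
But it should not be this way!
Understand "what an I/O device is"
Learn how I/O devices are abstracted in the operating system
The CPU is just a "merciless instruction-executing machine"
In plain words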
IBM PC/AT 8042 PS/2 (Keyboard) Controller
I/O ports 0x60 (data), 0x64 (status/command)

Command Byte | Use | Notes |
---|---|---|
0xED | LED control | ScrollLock/NumLock/CapsLock |
0xF3 | Set repeat rate | 30 Hz - 2 Hz; delay: 250 - 1000 ms |
0xF4/0xF5 | Enable/disable | N/A |
0xFE | Resend | N/A |
0xFF | RESET | N/A |
See the keyboard part of the AbstractMachine implementation
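As a rough illustration of how the 0xED row of the table might be used, the sketch below builds the two-byte sequence a driver would send to the data port: the command byte, then an LED bitmask (bit 0 ScrollLock, bit 1 NumLock, bit 2 CapsLock). The helper name `kbd_led_sequence` is hypothetical and no real port I/O is performed; it only shows the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KBD_CMD_SET_LEDS 0xED  // "LED control" row of the command table

// Bits of the data byte that follows 0xED.
enum { LED_SCROLL = 1 << 0, LED_NUM = 1 << 1, LED_CAPS = 1 << 2 };

// Hypothetical helper: fill `seq` with the bytes a driver would write to
// port 0x60 one at a time (waiting for the keyboard's ACK in between).
// Returns the number of bytes produced.
size_t kbd_led_sequence(uint8_t leds, uint8_t seq[2]) {
  assert((leds & ~(LED_SCROLL | LED_NUM | LED_CAPS)) == 0);
  seq[0] = KBD_CMD_SET_LEDS;  // command byte
  seq[1] = leds;              // which LEDs should be lit
  return 2;
}
```

Turning on NumLock and CapsLock would then mean sending 0xED followed by 0x06.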
ATA (Advanced Technology Attachment)
I/O ports: primary 0x1f0 - 0x1f7; secondary 0x170 - 0x177
void readsect(void *dst, int sect) {
  waitdisk();                            // wait until the disk is ready
  out_byte(0x1f2, 1);                    // sector count (1)
  out_byte(0x1f3, sect);                 // sector (LBA bits 0-7)
  out_byte(0x1f4, sect >> 8);            // cylinder low (LBA bits 8-15)
  out_byte(0x1f5, sect >> 16);           // cylinder high (LBA bits 16-23)
  out_byte(0x1f6, (sect >> 24) | 0xe0);  // drive (LBA mode, LBA bits 24-27)
  out_byte(0x1f7, 0x20);                 // command (0x20 = read sectors)
  waitdisk();                            // wait until the data is ready
  for (int i = 0; i < SECTSIZE / 4; i ++)
    ((uint32_t *)dst)[i] = in_long(0x1f0); // data
}
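To make the register layout in `readsect` easier to check, here is a small sketch that computes the values written to ports 0x1f2 - 0x1f7 for a 28-bit LBA sector number, without touching any hardware. The struct and function names (`ata_regs_t`, `ata_read_regs`) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

// Values for the ATA registers of a one-sector PIO read (28-bit LBA).
typedef struct {
  uint8_t count;     // port 0x1f2: number of sectors
  uint8_t lba_low;   // port 0x1f3: LBA bits 0-7
  uint8_t lba_mid;   // port 0x1f4: LBA bits 8-15
  uint8_t lba_high;  // port 0x1f5: LBA bits 16-23
  uint8_t drive;     // port 0x1f6: 0xe0 (LBA mode, drive 0) | LBA bits 24-27
  uint8_t command;   // port 0x1f7: 0x20 = READ SECTORS
} ata_regs_t;

// Hypothetical helper mirroring the out_byte() sequence of readsect().
ata_regs_t ata_read_regs(uint32_t sect) {
  assert(sect < (1u << 28));  // only 28 address bits fit in these registers
  ata_regs_t r;
  r.count    = 1;
  r.lba_low  = (uint8_t)(sect & 0xff);
  r.lba_mid  = (uint8_t)((sect >> 8) & 0xff);
  r.lba_high = (uint8_t)((sect >> 16) & 0xff);
  r.drive    = (uint8_t)(((sect >> 24) & 0x0f) | 0xe0);
  r.command  = 0x20;
  return r;
}
```

For sector 0x01234567 this yields 0x67, 0x45, 0x23 for the three address registers and 0xe1 for the drive register.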
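The CPU has an interrupt pin
Other devices in the system can be wired to the interrupt controller
If you are only building "one computer"
But what if you want to leave some room for the future?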
Provides
lspci -tv and lsusb -tv: list the devices on the system's buses
pci-probe.c (AbstractMachine, x86-64/i386)
The -soundhw ac97 run option
for (int bus = 0; bus < 256; bus++)
  for (int slot = 0; slot < 32; slot++) {
    uint32_t info = pciconf_read(bus, slot, 0, 0);
    uint16_t id = info >> 16, vendor = info & 0xffff;
    if (vendor != 0xffff) {  // 0xffff: no device in this slot
      printf("%02d:%02d device %x by vendor %x", bus, slot, id, vendor);
      if (id == 0x100e && vendor == 0x8086) {
        printf(" <-- This is an Intel e1000 NIC card!");
      }
      printf("\n");
    }
  }
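The probe loop relies on `pciconf_read`, which is not shown here. On x86, one common way to implement such a read is PCI configuration mechanism #1: write an address word to port 0xCF8, then read the data from port 0xCFC. The sketch below only computes that address word and decodes the ID register; that AbstractMachine's helper works this way is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

// Address word for PCI configuration mechanism #1 (written to port 0xCF8).
// Bit 31 enables the access; the register offset is dword-aligned.
uint32_t pci_config_address(uint32_t bus, uint32_t slot,
                            uint32_t func, uint32_t offset) {
  assert(bus < 256 && slot < 32 && func < 8 && offset < 256);
  return 0x80000000u | (bus << 16) | (slot << 11) | (func << 8)
       | (offset & 0xfc);
}

typedef struct { uint16_t vendor, device; } pci_id_t;

// Split the dword at offset 0 the same way the probe loop does.
pci_id_t pci_decode_id(uint32_t info) {
  pci_id_t id = { (uint16_t)(info & 0xffff), (uint16_t)(info >> 16) };
  return id;
}
```

With this encoding, reading offset 0 of bus 0, slot 3, function 0 uses the address 0x80001800, and an ID word of 0x100e8086 decodes to the e1000 device and vendor IDs tested in the loop.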
Suppose a program wants to write 1 GB of data to disk
for (int i = 0; i < (1 << 30) / 4; i++) {  // 1 GB, 4 bytes at a time
  outl(PORT, ((u32 *)buf)[i]);
}
Can the CPU be freed from executing this loop?
memcpy_to_port(ATA0, buf, length);
DMA: a CPU dedicated to running the "memcpy" program
The kinds of memcpy it supports
The PCI bus supports DMA
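The "CPU that only runs memcpy" idea can be sketched as a toy model: the CPU fills in a small descriptor and moves on, and a separate engine performs the copy and marks it done. Everything here (`dma_desc_t`, `dma_submit`, `dma_controller_step`) is hypothetical. A real controller runs concurrently with the CPU and usually signals completion with an interrupt, whereas this model is an ordinary function call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical transfer descriptor filled in by the CPU.
typedef struct {
  const void *src;
  void *dst;
  size_t len;
  volatile int done;  // set by the "controller" when the copy finishes
} dma_desc_t;

// CPU side: describing the transfer costs the same for 1 byte or 1 GB.
void dma_submit(dma_desc_t *d, void *dst, const void *src, size_t len) {
  assert(d != NULL && dst != NULL && src != NULL);
  d->src = src;
  d->dst = dst;
  d->len = len;
  d->done = 0;
}

// "Controller" side: the only program it can run is memcpy.
void dma_controller_step(dma_desc_t *d) {
  memcpy(d->dst, d->src, d->len);
  d->done = 1;  // in hardware: raise an interrupt
}
```

Between `dma_submit` and the completion flag being set, the CPU is free to execute other instructions, which is the whole point of the design.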
Everything is computable
for (int i = 1; i <= H; i++) {
  for (int j = 1; j <= W; j++)
    putchar(j <= i ? '*' : ' ');
  putchar('\n');
}
The hard part is performance
76543210
||||||||
||||||++- Palette
|||+++--- Unimplemented
||+------ Priority
|+------- Flip horizontally
+-------- Flip vertically
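The bit diagram above can be read as a tiny hardware "instruction" format. As a sketch, the following decodes and re-encodes such an attribute byte exactly as the diagram lays it out (bits 0-1 palette, bit 5 priority, bit 6 horizontal flip, bit 7 vertical flip, bits 2-4 ignored). The type and function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  uint8_t palette;   // bits 0-1
  uint8_t priority;  // bit 5
  uint8_t flip_h;    // bit 6
  uint8_t flip_v;    // bit 7
} sprite_attr_t;

sprite_attr_t sprite_attr_decode(uint8_t attr) {
  sprite_attr_t a;
  a.palette  = attr & 0x03;
  a.priority = (attr >> 5) & 1;
  a.flip_h   = (attr >> 6) & 1;
  a.flip_v   = (attr >> 7) & 1;
  return a;
}

// Bits 2-4 are unimplemented and always encoded as zero.
uint8_t sprite_attr_encode(sprite_attr_t a) {
  assert(a.palette < 4);
  return (uint8_t)(a.palette
                 | (a.priority ? 1 << 5 : 0)
                 | (a.flip_h   ? 1 << 6 : 0)
                 | (a.flip_v   ? 1 << 7 : 0));
}
```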
The CPU only
A complete many-core multiprocessor system
import torch
from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, 10), nn.ReLU(), )
    ...

model = NeuralNetwork().to('cuda')
The basic data of "alchemy" (model training): tensor (stride, sparse, ...)
tensor = torch.rand(128, 96)
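Whatever the I/O device, it is a byte sequence (a stream or an array) that can be read and written.
Operating system: device = an object (file) that supports the following three operations
Translates read/write/ioctl system calls on a device into sequences of device register commands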
Example: objects in /dev/
/dev/pts/[x] - pseudo terminal
/dev/zero - the "zero" device
/dev/null - the "null" device
/dev/random, /dev/urandom - random number generators
head -c 512 [device] | xxd
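The same devices can be reached from a program with the ordinary file system calls. The sketch below assumes a POSIX system where /dev/zero and /dev/null exist. The wrappers `dev_read` and `dev_write` are hypothetical helpers, not part of any library.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

// Read up to `count` bytes from a device file.
// Returns the number of bytes read, or -1 on error.
ssize_t dev_read(const char *path, void *buf, size_t count) {
  assert(path != NULL && buf != NULL);
  int fd = open(path, O_RDONLY);
  if (fd < 0) return -1;
  ssize_t n = read(fd, buf, count);
  close(fd);
  return n;
}

// Write `count` bytes to a device file.
// Returns the number of bytes written, or -1 on error.
ssize_t dev_write(const char *path, const void *buf, size_t count) {
  assert(path != NULL && buf != NULL);
  int fd = open(path, O_WRONLY);
  if (fd < 0) return -1;
  ssize_t n = write(fd, buf, count);
  close(fd);
  return n;
}
```

Reading from /dev/zero fills the buffer with zero bytes, reading from /dev/null reports end of file immediately, and writing to /dev/null succeeds while discarding the data.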
Makes many unreasonable simplifying assumptions
typedef struct devops {
  int (*init)(device_t *dev);
  int (*read) (device_t *dev, int offset, void *buf, int count);
  int (*write)(device_t *dev, int offset, void *buf, int count);
} devops_t;
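To show how a driver might plug into `devops_t`, here is a minimal sketch of a "zero" device whose reads return zero bytes and whose writes are discarded. The layout of `device_t` is not given in this section, so the one below (name, ops pointer, private data pointer) is an assumption made only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct device device_t;

typedef struct devops {
  int (*init)(device_t *dev);
  int (*read) (device_t *dev, int offset, void *buf, int count);
  int (*write)(device_t *dev, int offset, void *buf, int count);
} devops_t;

// Assumed layout: a name, the operations table, and driver-private data.
struct device {
  const char *name;
  devops_t *ops;
  void *ptr;
};

static int zero_init(device_t *dev) { (void)dev; return 0; }

// Every read yields `count` zero bytes, whatever the offset.
static int zero_read(device_t *dev, int offset, void *buf, int count) {
  (void)dev; (void)offset;
  assert(count >= 0);
  memset(buf, 0, (size_t)count);
  return count;
}

// Writes are accepted and thrown away.
static int zero_write(device_t *dev, int offset, void *buf, int count) {
  (void)dev; (void)offset; (void)buf;
  return count;
}

static devops_t zero_ops = {
  .init = zero_init, .read = zero_read, .write = zero_write,
};

device_t dev_zero = { .name = "zero", .ops = &zero_ops, .ptr = NULL };
```

Callers never need to know which driver they are talking to: `dev->ops->read(dev, ...)` works the same way for every device object.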
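Device driver: abstracts a device as an object plus its operations
/dev/null, /dev/urandom, ...
Isn't it just translating read/write/ioctl into commands the device understands?
The hard parts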
struct file_operations {
  struct module *owner;
  loff_t (*llseek) (struct file *, loff_t, int);
  ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
  ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
  int (*mmap) (struct file *, struct vm_area_struct *);
  unsigned long mmap_supported_flags;
  int (*open) (struct inode *, struct file *);
  int (*release) (struct inode *, struct file *);
  int (*flush) (struct file *, fl_owner_t id);
  int (*fsync) (struct file *, loff_t, loff_t, int datasync);
  int (*lock) (struct file *, int, struct file_lock *);
  ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
  long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
  long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
  int (*flock) (struct file *, int, struct file_lock *);
  ...
};
long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
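unlocked_ioctl: a legacy of the BKL (Big Kernel Lock) era
ioctl held the BKL by default while it executed
unlocked_ioctl avoids the lock
ioctl was removed from struct file_operations
compat_ioctl: compatibility across machine word sizes
Access characteristics of disks (storage devices)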
Compare a terminal with a GPU: they really are very different devices
The interface between the file system and the disk device
Many storage devices, ... come with volatile write back caches
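We could of course provide an ioctl
| REQ_PREFLUSH: the request starts only after previously written data has reached the disk
| REQ_FUA (force unit access): the request returns only after its data has reached the disk
A different abstraction over the disk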
File system: builds a data structure on top of bread/bwrite/bflush
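Why a flush operation and a force-unit-access flag matter can be sketched with a toy in-memory disk that has a volatile write-back cache. This is only a model: the `toy_` functions, their signatures, and the block sizes are invented for the example and are not the interface of any real kernel.

```c
#include <assert.h>
#include <string.h>

#define NBLK  8    // blocks on the toy disk (arbitrary)
#define BLKSZ 16   // bytes per block (arbitrary)

typedef struct {
  char media[NBLK][BLKSZ];  // persistent: survives power loss
  char cache[NBLK][BLKSZ];  // volatile write-back cache
  int  dirty[NBLK];         // cache[i] is newer than media[i]
} toy_disk_t;

// With fua set, the write goes straight to the media (like REQ_FUA);
// otherwise it only lands in the volatile cache.
void toy_bwrite(toy_disk_t *d, int blk, const void *buf, int fua) {
  assert(0 <= blk && blk < NBLK);
  if (fua) {
    memcpy(d->media[blk], buf, BLKSZ);
    d->dirty[blk] = 0;
  } else {
    memcpy(d->cache[blk], buf, BLKSZ);
    d->dirty[blk] = 1;
  }
}

// Reads see the newest copy, cached or not.
void toy_bread(toy_disk_t *d, int blk, void *buf) {
  assert(0 <= blk && blk < NBLK);
  memcpy(buf, d->dirty[blk] ? d->cache[blk] : d->media[blk], BLKSZ);
}

// Write every dirty cached block back to the media.
void toy_bflush(toy_disk_t *d) {
  for (int i = 0; i < NBLK; i++)
    if (d->dirty[i]) {
      memcpy(d->media[i], d->cache[i], BLKSZ);
      d->dirty[i] = 0;
    }
}

// Power loss: whatever was only in the cache is gone.
void toy_power_loss(toy_disk_t *d) {
  memset(d->dirty, 0, sizeof d->dirty);
}
```

In this model, a plain write followed by a power loss disappears, while data that was flushed or written with the FUA flag is still readable afterwards. A file system has to order its writes around exactly this behavior.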
Contents and goals of this lecture
Takeaway messages