Yanyan Jiang
Static analysis
Dynamic analysis
Computer system = state machine of (memory, registers) whose running is driven by instructions.
(Because computer systems are simply circuits.)
This model works for
(syscall is a special non-deterministic instruction.)
A program that monitors and alters program execution to produce useful results.
That is, a function $f(\tau)$ to produce useful results given the execution trace $\tau$ of a state machine (program/computer system).
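The state-machine view and the trace function $f(\tau)$ can be made concrete with a toy machine. This is only a sketch: the three-instruction ISA (`li`, `add`, `store`), the `State` class, and the chosen analysis `f` are illustrative assumptions, not part of the original slides.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    pc: int = 0
    regs: dict = field(default_factory=lambda: {"a": 0, "b": 0})
    mem: dict = field(default_factory=dict)

def step(prog, s):
    """Execute one instruction and return the next state (old state untouched)."""
    op, *args = prog[s.pc]
    regs, mem = dict(s.regs), dict(s.mem)
    if op == "li":        # li r, imm    : load immediate into register r
        regs[args[0]] = args[1]
    elif op == "add":     # add rd, rs   : rd += rs
        regs[args[0]] += regs[args[1]]
    elif op == "store":   # store addr, r: mem[addr] = r
        mem[args[0]] = regs[args[1]]
    return State(s.pc + 1, regs, mem)

def run(prog):
    """Return the execution trace tau: the list of states visited."""
    s = State()
    trace = [s]
    while s.pc < len(prog):
        s = step(prog, s)
        trace.append(s)
    return trace

def f(trace):
    """An example analysis f(tau): the addresses ever written to memory."""
    return sorted({addr for s in trace for addr in s.mem})

PROG = [("li", "a", 2), ("li", "b", 3), ("add", "a", "b"), ("store", 100, "a")]
```

Calling `f(run(PROG))` evaluates the analysis over the whole trace; a practical tool would compute `f` online instead of storing every state.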
GDB, the GNU Project debugger, allows you to see what is going on “inside” another program while it executes -- or what another program was doing at the moment it crashed.
Lots of commands
r, c, f, s, si, ...
b, hb, wa, ...
p, x, i, bt, ...
set, ...
rc, rn, rsi, ...
Suffices for anything
The fundamental problem: how to pause program execution at an instruction (address) or statement?
Dynamic program instrumentation
int $3 (0xcc for x86) or ebreak (for RISC-V)
Any practical dynamic analysis is a “simplified” (and more efficient) debugger.
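Pausing at an address by patching in a trap instruction can be sketched on a toy machine. The `Machine` class, the `addi` instruction, and the `BRK` marker are hypothetical stand-ins; a real debugger would overwrite a byte in the target process and receive a trap signal from the OS.

```python
BRK = ("brk",)  # stand-in for the trap encoding (int $3 / ebreak)

class Machine:
    def __init__(self, prog):
        self.prog = list(prog)   # "code memory": patchable, like a text segment
        self.pc = 0
        self.acc = 0
        self.saved = {}          # breakpoint address -> original instruction

    def set_breakpoint(self, addr):
        self.saved[addr] = self.prog[addr]
        self.prog[addr] = BRK    # overwrite the instruction with a trap

    def exec_one(self, insn):
        op, *args = insn
        if op == "addi":         # addi imm: acc += imm
            self.acc += args[0]
        self.pc += 1

    def cont(self):
        """Run until a trap is hit or the program ends; return the stop address."""
        while self.pc < len(self.prog):
            insn = self.prog[self.pc]
            if insn == BRK:
                return self.pc   # control returns to the "debugger"
            self.exec_one(insn)
        return None

    def step_over_breakpoint(self):
        """Execute the saved original instruction at a trapped pc; the trap stays."""
        self.exec_one(self.saved[self.pc])
```

Real debuggers typically restore the original byte, single-step, and re-insert the trap; this sketch shortcuts that by executing the saved instruction directly.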
With a debugger, we can make virtually any observation or perturbation.
info inferiors; thread 1; info registers; x/i $rip
set var = value
But single-step execution incurs high overhead.
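To see what single-stepping involves, here is a sketch using Python's `sys.settrace`, which calls a handler on every executed line. It is an analogy at the source-line level, not instruction-level stepping; `single_step` and `sum_to` are illustrative names.

```python
import sys

def single_step(func, *args):
    """Run func under a line-level tracer; return (result, number_of_trace_events)."""
    events = 0

    def tracer(frame, event, arg):
        nonlocal events
        events += 1          # a debugger would inspect frame.f_locals here
        return tracer        # keep tracing inside this frame

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(old)    # always restore the previous tracer
    return result, events

def sum_to(n):
    total = 0
    for i in range(n):
        total += i
    return total
```

The handler runs at least twice per loop iteration, so the number of handler invocations grows with the length of the execution, which is why a targeted analysis tries to observe far fewer events.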
How to implement lightweight logging and efficient analysis for a specific SE research task?
Problem space
Design space
We don't need a memory/register snapshot at every instruction for deterministic replay.
We only need to record the outcomes of non-deterministic events.
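A minimal sketch of this idea, assuming the program's only source of non-determinism is a single `rand()` callable (standing in for a syscall). `Recorder`, `Replayer`, and `program` are hypothetical names for illustration.

```python
import random

class Recorder:
    """Wraps a non-deterministic source and logs every outcome it returns."""
    def __init__(self, source):
        self.source, self.log = source, []

    def __call__(self):
        value = self.source()
        self.log.append(value)
        return value

class Replayer:
    """Feeds back logged outcomes in order instead of consulting the real source."""
    def __init__(self, log):
        self.it = iter(log)

    def __call__(self):
        return next(self.it)

def program(rand):
    """Deterministic except for its calls to rand()."""
    x = 0
    for _ in range(5):
        x = x * 31 + rand()
    return x

rec = Recorder(lambda: random.randrange(100))
original = program(rec)                  # record run: logs 5 outcomes
replayed = program(Replayer(rec.log))    # replay run: same outcomes, same result
```

The log holds five integers, not a state snapshot per step; everything else is recomputed deterministically during replay.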
Record even less to see which parts took the most time.
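A sketch of a profiler built on Python's `sys.setprofile`, recording only call and return times per function. Note that this is an instrumenting profiler; a sampling profiler would record even less. The workload functions `fast`, `slow`, and `work` are made up for the example.

```python
import sys
import time
from collections import defaultdict

def profile(func, *args):
    """Return (result, {function_name: [calls, inclusive_seconds]})."""
    stats = defaultdict(lambda: [0, 0.0])
    starts = []   # stack of entry timestamps for Python-level calls

    def hook(frame, event, arg):
        if event == "call":
            starts.append(time.perf_counter())
        elif event == "return" and starts:
            entry = stats[frame.f_code.co_name]
            entry[0] += 1
            entry[1] += time.perf_counter() - starts.pop()
        # c_call / c_return events are ignored in this sketch

    old = sys.getprofile()
    sys.setprofile(hook)
    try:
        result = func(*args)
    finally:
        sys.setprofile(old)
    return result, dict(stats)

def fast():
    return sum(range(10))

def slow():
    return sum(range(200_000))

def work():
    return fast() + slow()
```

`profile(work)` returns per-function call counts and inclusive times, which is usually enough to decide where optimization effort is worth spending.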
Premature optimization is the root of all evil (D. E. Knuth)
Invariant Mining
Useful in many scenarios!
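Invariant mining can be sketched as: observe variable values at a program point over many executions, then keep the candidate properties that were never violated. The template set and the binary-search subject below are illustrative assumptions, not a description of any particular tool.

```python
import operator

# Candidate invariants over pairs of variables (an illustrative template set).
TEMPLATES = {"==": operator.eq, "<=": operator.le, "<": operator.lt, "!=": operator.ne}

def mine_invariants(observations):
    """observations: list of dicts (variable -> value) sampled at one program point.
    Returns the set of 'x OP y' strings that hold in every observation."""
    names = sorted(observations[0])
    candidates = {(x, op, y) for x in names for y in names if x != y for op in TEMPLATES}
    for obs in observations:
        # Drop every candidate falsified by this observation.
        candidates = {(x, op, y) for (x, op, y) in candidates
                      if TEMPLATES[op](obs[x], obs[y])}
    return {f"{x} {op} {y}" for (x, op, y) in candidates}

def binary_search_observations(arr, key):
    """Run a binary search, recording (lo, hi, n) at the loop head."""
    obs, lo, hi = [], 0, len(arr)
    while lo < hi:
        obs.append({"lo": lo, "hi": hi, "n": len(arr)})
        mid = (lo + hi) // 2
        if arr[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return obs
```

The mined properties are only likely invariants: they hold on the observed runs, and more runs can falsify them.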
Online monitoring of predefined bug patterns
Signed integer overflow is undefined behavior
Add a check on each signed integer operation
foo(i++, j++) + 1
→ ADD(foo(INC(i), INC(j)), 1)
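The rewriting above can be modeled in Python, where integers never overflow, by checking each result against an assumed 32-bit `int` range. `Var`, `check`, `OverflowDetected`, and the body of `foo` are hypothetical; a real tool would insert equivalent checks into the C program itself.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1   # assuming a 32-bit C int

class OverflowDetected(Exception):
    pass

def check(value, what):
    """Raise if value does not fit in the assumed int range; otherwise pass it on."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowDetected(f"signed overflow in {what}: {value}")
    return value

def ADD(a, b):
    return check(a + b, f"{a} + {b}")

class Var:
    """A mutable cell standing in for a C lvalue."""
    def __init__(self, value):
        self.value = value

def INC(var):
    """Checked post-increment: returns the old value, like C's var++."""
    old = var.value
    var.value = check(old + 1, f"{old}++")
    return old

def foo(x, y):
    return x * y   # placeholder body; this multiplication is left unchecked here

def instrumented(i, j):
    # foo(i++, j++) + 1  ->  ADD(foo(INC(i), INC(j)), 1)
    return ADD(foo(INC(i), INC(j)), 1)
```

With `i = Var(INT_MAX)`, the call `instrumented(i, Var(0))` raises `OverflowDetected` at the increment, which is the point where the C program would hit undefined behavior.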
Dynamic analyses can also
(For SE tasks.)
Implementations
Read (a lot of) papers.