lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

commit e90b27ffbbd6175653ba0f6de364f01a5b1d806f
parent d664fc7b2fd074de215a4629131d2edf65dd92a3
Author: Alex Balgavy <alex@balgavy.eu>
Date:   Thu, 13 May 2021 14:12:10 +0200

Update bama notes

Diffstat:
Mcontent/binary-malware-analysis-notes/_index.md | 1+
Acontent/binary-malware-analysis-notes/symbolic-execution/index.md | 49+++++++++++++++++++++++++++++++++++++++++++++++++
Acontent/binary-malware-analysis-notes/symbolic-execution/symbolic-execution-example.png | 0
3 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/content/binary-malware-analysis-notes/_index.md b/content/binary-malware-analysis-notes/_index.md @@ -17,3 +17,4 @@ title = 'Binary and Malware Analysis' 12. [Tracking control flow](tracking-control-flow) 13. [Mitigating code reuse attacks (TypeArmor)](mitigating-code-reuse-attacks) 14. [Parser identification](parser-identification) +15. [Symbolic execution](symbolic-execution) diff --git a/content/binary-malware-analysis-notes/symbolic-execution/index.md b/content/binary-malware-analysis-notes/symbolic-execution/index.md @@ -0,0 +1,49 @@ ++++ +title = 'Symbolic execution' ++++ + +# Symbolic execution +Execute programs with symbols instead of values +- program state (memory, registers, etc.) is symbolic and bound by constraints + - have set of possible values + - subject to constraints, i.e. mathematical restrictions on allowed values + - symbolic variables/expressions are immutable +- perform all arithmetic & boolean operations symbolically + - output symbolic expressions encoding operation performed +- fork state when branch depends on symbol + - unconditional branch -- direct control flow + - concrete conditional branch -- follow condition + - symbolic conditional branch -- take both paths + - fork program state and set IP to either path + - add appropriate branch condition as path constraint + - eventually explores all possible paths + + ![Symbolic execution example](symbolic-execution-example.png) + +- solve constraints to get values when needed ('concretization') + +Problems with symbolic execution: +- state explosion: number of possible paths in a program is huge, increases with number of branches +- dealing with I/O: do nothing? output? +- modelling environment and hardware (and simulated hardware isn't the same) +- constraint solving: not all constraints are linear, NP-complete + +Basic optimisations: +- expression simplification: use algebra to bring down to minimal form +- state merging: mitigate state explosion when little difference between states + +Concolic (concrete + symbolic) execution: +- repeat: + - dynamic analysis to capture trace for concrete run + - symbolic execution on traced instructions, yielding path constraints + - dynamic analysis to capture trace for concrete run + - symbolic execution on traced instructions, yielding path constraints + - adjust constraints by flipping last encountered symbolic branch + - solve constraints of flipped branch, yielding inputs for new path +- advantages: no state explosion so much faster, symbolize only what you need +- drawbacks: need good initial input, can only flip branches controlled by symbolic variables, constraint solving is still hard + +SMT solvers: Z3, CVC4, STP + +SymbEx engines: KLEE, S2E, Angr, Triton + diff --git a/content/binary-malware-analysis-notes/symbolic-execution/symbolic-execution-example.png b/content/binary-malware-analysis-notes/symbolic-execution/symbolic-execution-example.png Binary files differ.