commit e90b27ffbbd6175653ba0f6de364f01a5b1d806f
parent d664fc7b2fd074de215a4629131d2edf65dd92a3
Author: Alex Balgavy <alex@balgavy.eu>
Date: Thu, 13 May 2021 14:12:10 +0200
Update bama notes
Diffstat:
3 files changed, 50 insertions(+), 0 deletions(-)
diff --git a/content/binary-malware-analysis-notes/_index.md b/content/binary-malware-analysis-notes/_index.md
@@ -17,3 +17,4 @@ title = 'Binary and Malware Analysis'
12. [Tracking control flow](tracking-control-flow)
13. [Mitigating code reuse attacks (TypeArmor)](mitigating-code-reuse-attacks)
14. [Parser identification](parser-identification)
+15. [Symbolic execution](symbolic-execution)
diff --git a/content/binary-malware-analysis-notes/symbolic-execution/index.md b/content/binary-malware-analysis-notes/symbolic-execution/index.md
@@ -0,0 +1,49 @@
++++
+title = 'Symbolic execution'
++++
+
+# Symbolic execution
+Execute programs with symbols instead of values
+- program state (memory, registers, etc.) is symbolic and bound by constraints
+ - have set of possible values
+ - subject to constraints, i.e. mathematical restrictions on allowed values
+ - symbolic variables/expressions are immutable
+- perform all arithmetic & boolean operations symbolically
+ - output symbolic expressions encoding operation performed
+- fork state when branch depends on symbol
+ - unconditional branch -- direct control flow
+ - concrete conditional branch -- follow condition
+ - symbolic conditional branch -- take both paths
+ - fork program state and set IP to either path
+ - add appropriate branch condition as path constraint
+ - eventually explores all possible paths
+
+ ![Symbolic execution example](symbolic-execution-example.png)
+
+- solve constraints to get values when needed ('concretization')
+
+Problems with symbolic execution:
+- state explosion: number of possible paths in a program is huge, increases with number of branches
+- dealing with I/O: do nothing? output?
+- modelling environment and hardware (and simulated hardware isn't the same)
+- constraint solving: not all constraints are linear, NP-complete
+
+Basic optimisations:
+- expression simplification: use algebra to bring down to minimal form
+- state merging: mitigate state explosion when little difference between states
+
+Concolic (concrete + symbolic) execution:
+- repeat:
+ - dynamic analysis to capture trace for concrete run
+ - symbolic execution on traced instructions, yielding path constraints
+ - dynamic analysis to capture trace for concrete run
+ - symbolic execution on traced instructions, yielding path constraints
+ - adjust constraints by flipping last encountered symbolic branch
+ - solve constraints of flipped branch, yielding inputs for new path
+- advantages: no state explosion so much faster, symbolize only what you need
+- drawbacks: need good initial input, can only flip branches controlled by symbolic variables, constraint solving is still hard
+
+SMT solvers: Z3, CVC4, STP
+
+SymbEx engines: KLEE, S2E, Angr, Triton
+
diff --git a/content/binary-malware-analysis-notes/symbolic-execution/symbolic-execution-example.png b/content/binary-malware-analysis-notes/symbolic-execution/symbolic-execution-example.png
Binary files differ.