commit 8ff598305663210e76f15927c190ecf55859d164
parent dd31f0b32bc7c550a195f94815e55674380614cd
Author: Alex Balgavy <alex@balgavy.eu>
Date: Thu, 8 Apr 2021 11:10:40 +0200
Update BAMA notes
Diffstat:
3 files changed, 55 insertions(+), 0 deletions(-)
diff --git a/content/binary-malware-analysis-notes/_index.md b/content/binary-malware-analysis-notes/_index.md
@@ -10,3 +10,4 @@ title = 'Binary and Malware Analysis'
5. [Anti-analysis](anti-analysis)
6. [Disassembly tools](disassembly-tools)
7. [Packers](packers)
+8. [Dynamic Binary Instrumentation & Intel Pin](dynamic-binary-instrumentation-and-intel-pin)
diff --git a/content/binary-malware-analysis-notes/dynamic-binary-instrumentation-and-intel-pin/index.md b/content/binary-malware-analysis-notes/dynamic-binary-instrumentation-and-intel-pin/index.md
@@ -0,0 +1,54 @@
++++
+title = 'Dynamic Binary Instrumentation & Intel Pin'
++++
+
+# Dynamic Binary Instrumentation & Intel Pin
+Full system emulation is powerful (full system visibility) but invasive (full system runs emulated).
+
+Dynamic binary instrumentation (DBI) gives you binary-level visibility, and is efficient and self-contained.
+
+Instrumentation: a technique that injects code into a binary to collect runtime information
+- executes as part of normal instruction stream
+- doesn't modify semantics of program
+
+Instrumentation is useful for:
+- optimisation/profiling: instruction profiling, basic block count
+- bug detection/exploit generation: find references to uninitialized addresses, inspect arguments at a particular function call, inspect function pointers and return addresses, record & replay
+- architectural research: processor and cache simulation, trace collection
+
+Two classes:
+- static: instrument before runtime (source code, IR, binary)
+- dynamic: at runtime (just in time, e.g. Pin, Valgrind, QEMU)
+ - no need to recompile or relink, discover code at runtime, handles generated code, attaches to running processes
+ - but: higher performance overhead, and it needs a framework, which malware can detect
+
+Why binary instrumentation:
+- libraries are a pain for source/IR instrumentation (e.g. proprietary)
+- easily handles multilingual programs
+- with malware you rarely get source code
+
+## Intel Pin ([website](http://pintool.intel.com/))
+A DBI framework that can insert arbitrary code at arbitrary places in an executable.
+
+Can examine any type of instruction, track function calls (including library calls and syscalls), track application threads, etc.
+
+![Pin architecture diagram](pin-diagram.png)
+
+Instrumentation vs analysis:
+- instrumentation routines: define where instrumentation is inserted; invoked when an instruction is being JIT-compiled
+- analysis routines: define what to do when instrumentation is activated; invoked every time the instruction executes
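+
+A minimal sketch of how the two routine types fit together, assuming the Pin kit's `pin.H` header is on the include path. It follows the shape of the instruction-counting example distributed with Pin; the names `count_ins`, `instrument_ins` and `report` are illustrative, not part of the Pin API:
+
+```cpp
+#include <iostream>
+#include "pin.H"  // assumption: provided by the Pin kit
+
+static UINT64 ins_count = 0;
+
+// Analysis routine: runs every time an instrumented instruction executes.
+VOID count_ins() { ins_count++; }
+
+// Instrumentation routine: runs when Pin JIT-compiles an instruction
+// and decides where the analysis call is inserted.
+VOID instrument_ins(INS ins, VOID *v) {
+    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)count_ins, IARG_END);
+}
+
+// Called once when the application exits.
+VOID report(INT32 code, VOID *v) {
+    std::cerr << "instructions executed: " << ins_count << std::endl;
+}
+
+int main(int argc, char *argv[]) {
+    if (PIN_Init(argc, argv)) return 1;  // non-zero means a bad command line
+    INS_AddInstrumentFunction(instrument_ins, 0);
+    PIN_AddFiniFunction(report, 0);
+    PIN_StartProgram();  // hands control to the application; never returns
+    return 0;
+}
+```
+
+The shared counter is not thread-safe, so a multithreaded target would need per-thread counters or locking.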
+
+Using it:
+1. Build the Pin tool: `export PIN_HOME=/path/to/pin/directory && make`
+2. Run the target binary under Pin with the tool loaded: `pin -t /path/to/pin/code.so -- /path/to/binary`
+
+Reducing Pin overhead:
+- shift computation from analysis routines to instrumentation routines when possible
+- instrument at largest granularity whenever possible (e.g. one call per basic block or trace)
+- reduce number of arguments to analysis routines
+- keep analysis routines simple enough for Pin to inline them (run `pin -log_inline` and check the inlining decisions in pin.log)
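+
+A sketch of the first two points, again assuming `pin.H` from the Pin kit: instead of one analysis call per instruction, it inserts one call per basic block and computes the block's instruction count at instrumentation time, so the analysis routine only does an addition. `add_count` and `instrument_trace` are hypothetical names:
+
+```cpp
+#include "pin.H"  // assumption: provided by the Pin kit
+
+static UINT64 ins_count = 0;
+
+// Analysis routine: called once per executed basic block.
+VOID add_count(UINT32 n) { ins_count += n; }
+
+// Instrumentation routine: BBL_NumIns is evaluated once, when the trace
+// is JIT-compiled, and passed to the analysis routine as a constant.
+VOID instrument_trace(TRACE trace, VOID *v) {
+    for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl)) {
+        BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)add_count,
+                       IARG_UINT32, BBL_NumIns(bbl), IARG_END);
+    }
+}
+
+// In main(), register it with TRACE_AddInstrumentFunction(instrument_trace, 0)
+// in place of a per-instruction instrumentation routine.
+```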
+
+Debugging Pin:
+1. Run Pin with `-appdebug`; it pauses the application and prints a port number
+2. Start GDB and run `target remote :<port printed by Pin>`
+3. Use GDB normally
diff --git a/content/binary-malware-analysis-notes/dynamic-binary-instrumentation-and-intel-pin/pin-diagram.png b/content/binary-malware-analysis-notes/dynamic-binary-instrumentation-and-intel-pin/pin-diagram.png
Binary files differ.