commit b43873fd1ab8d58426db34985e5098db1c79f7b8
parent 24728b61f7cb0b965b1839741225fa0c808e79c6
Author: Alex Balgavy <alex@balgavy.eu>
Date: Tue, 13 Apr 2021 10:54:35 +0200
Update BAMA notes
Diffstat:
2 files changed, 74 insertions(+), 0 deletions(-)
diff --git a/content/binary-malware-analysis-notes/_index.md b/content/binary-malware-analysis-notes/_index.md
@@ -11,3 +11,4 @@ title = 'Binary and Malware Analysis'
6. [Disassembly tools](disassembly-tools)
7. [Packers](packers)
8. [Dynamic Binary Instrumentation & Intel Pin](dynamic-binary-instrumentation-and-intel-pin)
+9. [Dynamic taint analysis](dynamic-taint-analysis)
diff --git a/content/binary-malware-analysis-notes/dynamic-taint-analysis.md b/content/binary-malware-analysis-notes/dynamic-taint-analysis.md
@@ -0,0 +1,73 @@
++++
+title = 'Dynamic taint analysis'
++++
+
+# Dynamic taint analysis
+"Tracking interesting things"
+
+Idea:
+- label info with tags (trusted/untrusted, interesting/boring, public/secret)
+- control how data and labels propagate:
+ - when copying data, also copy its tag
+ - clean a tag when you know the associated data is no longer "untrusted"
+ - policies to check for interesting/unsafe usage of tainted data
+
+Access policies:
+- Preventing leakage of classified data
+ - Bell-LaPadula: no read up, no write down
+- Preserve integrity
+ - Biba: no read down, no write up
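The two access policies above can be written as four comparisons. This is a minimal sketch, assuming security levels are plain integers where a higher number means "more secret" (Bell-LaPadula) or "more trustworthy" (Biba); the function names are made up for illustration.

```python
# Levels are hypothetical integers: a higher number means "more secret"
# for Bell-LaPadula and "more trustworthy" for Biba.


def blp_may_read(subject_level: int, object_level: int) -> bool:
    """Bell-LaPadula: no read up."""
    return subject_level >= object_level


def blp_may_write(subject_level: int, object_level: int) -> bool:
    """Bell-LaPadula: no write down."""
    return subject_level <= object_level


def biba_may_read(subject_level: int, object_level: int) -> bool:
    """Biba: no read down."""
    return subject_level <= object_level


def biba_may_write(subject_level: int, object_level: int) -> bool:
    """Biba: no write up."""
    return subject_level >= object_level
```

Note that the Biba checks are the Bell-LaPadula checks with the comparisons flipped: one model protects confidentiality, the other integrity.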
+
+Tainting to detect attacks
+- mark all data from the network as tainted
+- check whether tainted values influence control flow
+ - raise alert when a return instruction is executed with a tainted return address
+ - also raise alert on
+ - other calls/jumps made with tainted addresses
+ - calls, rets, jumps that are made to tainted instructions
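As a rough illustration of the return-address check, the sketch below keeps a set of tainted addresses next to a toy byte-addressed memory. `ToyMachine`, `TaintAlert`, and the addresses in the usage note are all invented for this example; a real tool would hook actual instructions instead.

```python
class TaintAlert(Exception):
    """Raised when tainted data is about to steer control flow."""


class ToyMachine:
    def __init__(self):
        self.mem = {}         # address -> byte value
        self.tainted = set()  # addresses whose byte is tainted

    def store_word(self, addr, value, width=8):
        """The program writes its own value: the bytes become untainted."""
        for i, byte in enumerate(value.to_bytes(width, "little")):
            self.mem[addr + i] = byte
            self.tainted.discard(addr + i)

    def recv(self, addr, data):
        """Network input: every received byte is tainted."""
        for i, byte in enumerate(data):
            self.mem[addr + i] = byte
            self.tainted.add(addr + i)

    def ret(self, stack_addr, width=8):
        """Return via the address stored at stack_addr; alert if it is tainted."""
        slots = range(stack_addr, stack_addr + width)
        if any(a in self.tainted for a in slots):
            raise TaintAlert(f"tainted return address at {stack_addr:#x}")
        return int.from_bytes(bytes(self.mem[a] for a in slots), "little")
```

With a saved return address at `0x100` and a 16-byte buffer at `0xF0`, `recv(0xF0, b"A" * 16)` leaves `ret(0x100)` untouched, while `recv(0xF0, b"A" * 24)` spills tainted bytes over the return address and makes `ret(0x100)` raise `TaintAlert`.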
+
+For exploits:
+- let's say you have an arbitrary write
+- taint all data in memory, then observe whether tainted data reaches the arguments of calls like execve
+
+Questions for tainting:
+1. what to taint?
+2. how to propagate taint, and how to clean it?
+3. how to use taint?
+4. what granularity to track (bits, bytes, words, blocks), and in a single color or multiple colors?
+5. tainting boundaries -- only registers, or also memory? what about disk?
+
+## What to taint
+For control of information flow, taint everything.
+
+For attack detection, taint everything from untrusted sources, and see if it ends up where it shouldn't.
+
+For binary analysis, taint any input you can, like data typed by the user and config files.
+
+For privacy breach detection, taint privacy-sensitive data, like passwords and credit card numbers.
+
+For vulnerability detection, taint everything that the attacker can control.
+
+## Taint propagation
+Generally, these rules hold:
+- untainted + untainted = untainted
+- untainted + tainted = tainted
+- tainted + tainted = tainted
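These three rules amount to a logical OR over the operands' tags. A minimal sketch with a single taint bit per value (the `Tagged` class is hypothetical, not part of any tool):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    value: int
    tainted: bool = False

    def __add__(self, other: "Tagged") -> "Tagged":
        # The result is tainted if either operand is tainted.
        return Tagged(self.value + other.value, self.tainted or other.tainted)
```

With multiple colors, the boolean would become a set of labels and the `or` a set union.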
+
+Cleaning the taint:
+- when storing a constant in a destination
+- maybe with MMX or floating point instructions
+
+Propagating:
+- on direct moves
+- maybe on arithmetic operations
+- what about implicit flows (a variable that's determined by a tainted value, but not set directly from it)?
+ - in attack detection, mostly not, which works OK
+ - in leakage detection, also not, but this is not fine because malware can launder taint and escape detection
+- what about pointer tainting? E.g. if you use a tainted value to index a toupper() table, the result is tainted, but for some other table it might not be
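The cleaning and propagation choices above can be made concrete with a toy register file that carries one taint bit per register. Everything here (`ShadowRegs`, its method names, the `propagate_pointers` switch) is an assumption made for illustration, not how any particular tool does it.

```python
class ShadowRegs:
    """Toy registers with one taint bit each."""

    def __init__(self, propagate_pointers=False):
        self.val = {}
        self.taint = {}
        self.propagate_pointers = propagate_pointers

    def taint_source(self, dst, value):
        # A value arriving from a taint source.
        self.val[dst], self.taint[dst] = value, True

    def mov_const(self, dst, constant):
        # Storing a constant cleans the destination.
        self.val[dst], self.taint[dst] = constant, False

    def mov(self, dst, src):
        # Direct move: copy the data together with its tag.
        self.val[dst], self.taint[dst] = self.val[src], self.taint[src]

    def load(self, dst, table, index_reg):
        # Table lookup: the table itself is untainted, so the index's tag
        # reaches the result only when pointer tainting is switched on.
        self.val[dst] = table[self.val[index_reg]]
        self.taint[dst] = self.propagate_pointers and self.taint[index_reg]

    def copy_via_branch(self, dst, src):
        # Implicit flow: dst ends up equal to the low bit of src, but it is
        # only ever assigned constants, so no taint reaches it.
        if self.val[src] & 1:
            self.mov_const(dst, 1)
        else:
            self.mov_const(dst, 0)
```

`copy_via_branch` is the laundering trick from the leakage case: the secret bit is copied exactly, yet the copy is untainted. `load` shows the pointer-tainting dilemma: without propagation the toupper() result loses its taint, with propagation every lookup through a tainted index becomes tainted.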
+
+## Using the taint
+- check that the target address of a ret/jump/call is not tainted (cannot deal with attacks that do not divert control flow)
+- mark secret data as tainted, keep track of it to check that it doesn't leave the app (but do we propagate on pointers and implicit flows?)
+- to reverse engineer the structure of a config/input file: mark the file as tainted, monitor arguments of syscalls for file names and IPs, and monitor arguments of strcmp
+- taint all writable data in memory, observe whether tainted data makes it to arguments of syscalls like execve
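The last use can be sketched as a policy hook at the sink. The example assumes byte-granularity taint and a made-up `check_execve` function that only inspects the path argument; it never executes anything.

```python
class TaintedSinkError(Exception):
    """Raised when tainted bytes reach a monitored sink."""


class Shadow:
    """Byte string with one taint flag per byte."""

    def __init__(self, data: bytes, tainted: bool = False):
        self.data = data
        self.flags = [tainted] * len(data)

    def __add__(self, other: "Shadow") -> "Shadow":
        joined = Shadow(self.data + other.data)
        joined.flags = self.flags + other.flags  # flags travel with their bytes
        return joined


def check_execve(path: Shadow) -> bytes:
    """Policy hook for a hypothetical execve wrapper; nothing is executed."""
    offsets = [i for i, flag in enumerate(path.flags) if flag]
    if offsets:
        raise TaintedSinkError(f"execve path tainted at byte offsets {offsets}")
    return path.data
```

For example, `check_execve(Shadow(b"/bin/ls"))` passes, while `Shadow(b"/bin/") + Shadow(b"sh", tainted=True)` is rejected because the last two bytes of the path came from tainted memory.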