+++
title = 'Dynamic taint analysis'
+++

# Dynamic taint analysis
"Tracking interesting things"

Idea:
- label info with tags (trusted/untrusted, interesting/boring, public/secret)
- control how data and labels propagate:
    - when copying data, also copy the tag
    - clean a tag when you know the associated data is no longer "untrusted"
- policies to check for interesting/unsafe usage of tainted data

Access policies:
- Preventing leakage of classified data
    - Bell-LaPadula: no read up, no write down
- Preserve integrity
    - Biba: no read down, no write up

Tainting to detect attacks:
- mark all data from the network as tainted
- check whether tainted values influence control flow
    - raise alert when a return instruction is executed with a tainted address
    - also raise alert on
        - other calls/jumps made with tainted addresses
        - calls, rets, jumps that are made to tainted instructions

For exploits:
- let's say you have arbitrary write
- taint all data in memory, then observe whether tainted data makes it to an argument of something like execve

Questions for tainting:
1. what to taint?
2. how to propagate taint, and how to clean it?
3. how to use taint?
4. track bits, bytes, words, blocks...in single color or multiple colors?
5. tainting boundaries -- only registers, or also memory? what about disk?

## What to taint
For control of information flow, taint everything.

For attack detection, taint everything from an untrusted source, and see if it ends up where it shouldn't.

For binary analysis, taint anything possible, like data typed by the user and config files.

For privacy breaches, taint privacy-sensitive data, like passwords and credit card numbers.

For vulnerability detection, taint everything that an attacker can control.
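
To make the "what to taint" and granularity questions concrete, here is a minimal sketch of a shadow memory that tracks taint per byte with multiple colors (one label per source). The class `ShadowMemory`, the labels `"network"` and `"config"`, and the addresses are all hypothetical names chosen for illustration; a real tool would hook the actual input routines instead of calling `taint` by hand.

```python
class ShadowMemory:
    """Byte-granular taint map: address -> set of labels ("colors")."""

    def __init__(self):
        self._labels = {}

    def taint(self, addr, length, label):
        # Mark `length` bytes starting at `addr` with a source label.
        for a in range(addr, addr + length):
            self._labels.setdefault(a, set()).add(label)

    def clean(self, addr, length):
        # Drop all labels, e.g. after the bytes were overwritten by a constant.
        for a in range(addr, addr + length):
            self._labels.pop(a, None)

    def labels(self, addr, length=1):
        # Union of the labels on every byte in the range.
        out = set()
        for a in range(addr, addr + length):
            out |= self._labels.get(a, set())
        return out


shadow = ShadowMemory()
shadow.taint(0x1000, 16, "network")  # assumed: buffer filled from a socket
shadow.taint(0x1008, 4, "config")    # assumed: bytes also derived from a config file
shadow.clean(0x1000, 4)              # first 4 bytes overwritten with a constant

print(shadow.labels(0x1000))             # set()
print(sorted(shadow.labels(0x1008, 4)))  # ['config', 'network']
```

A single-color variant would store one bit per byte instead of a set, which is cheaper but cannot tell sources apart.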

## Taint propagation
Generally, these rules hold:
- untainted + untainted = untainted
- untainted + tainted = tainted
- tainted + tainted = tainted

Cleaning the taint:
- when storing a constant in a destination
- maybe with MMX or floating point instructions

Propagating:
- on direct moves
- maybe on arithmetic operations
- what about implicit flows (a variable that's determined by a tainted value, but not set directly from it)?
    - in attack detection, mostly not, which works OK
    - in leakage detection, also not, but this is not fine because malware can launder taint and escape detection
- what about pointer tainting -- e.g. if you use a tainted value to index a toupper() table, the result should be tainted, but for some other table it might not need to be

## Using the taint
- check that the address used by a ret/jump/call is not tainted (cannot deal with attacks that do not divert control flow)
- mark secret data as tainted, and track it to check that it does not leave the app (but do we propagate on pointers and implicit flows?)
- reverse engineer the structure of a config/input file: mark it as tainted, monitor arguments of syscalls for file names and IPs, and monitor arguments of strcmp
- taint all writable data in memory, observe whether tainted data makes it to arguments of syscalls like execve
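
The propagation rules and the control-flow check can be sketched with a toy register machine. This is an illustration under stated assumptions, not how any particular tool works: `ToyMachine`, `TaintAlert`, `copy_via_branch`, and the register names are hypothetical, taint is a single color, and only registers are tracked.

```python
class TaintAlert(Exception):
    """Raised when a tainted value is used as a jump target."""


class ToyMachine:
    def __init__(self):
        self.regs = {}   # register name -> value
        self.taint = {}  # register name -> bool (single color)

    def load_input(self, dst, value):
        # Source: data from an untrusted origin starts out tainted.
        self.regs[dst] = value
        self.taint[dst] = True

    def mov_const(self, dst, value):
        # Storing a constant cleans the destination.
        self.regs[dst] = value
        self.taint[dst] = False

    def mov(self, dst, src):
        # Direct move: copy the data and its tag.
        self.regs[dst] = self.regs[src]
        self.taint[dst] = self.taint[src]

    def add(self, dst, a, b):
        # Arithmetic: the result is tainted if either operand is.
        self.regs[dst] = self.regs[a] + self.regs[b]
        self.taint[dst] = self.taint[a] or self.taint[b]

    def jump(self, target_reg):
        # Sink: refuse to transfer control to a tainted address.
        if self.taint[target_reg]:
            raise TaintAlert(f"jump to tainted address in {target_reg}")
        return self.regs[target_reg]


def copy_via_branch(m, dst, src):
    # Implicit flow: dst depends on src but is only ever assigned constants,
    # so explicit propagation rules leave it untainted.
    m.mov_const(dst, 0)
    if m.regs[src] == 1:
        m.mov_const(dst, 1)


m = ToyMachine()
m.load_input("r0", 0x41414141)  # attacker-controlled value
m.mov_const("r1", 8)
m.add("r2", "r0", "r1")         # untainted + tainted = tainted
try:
    m.jump("r2")
except TaintAlert as alert:
    print("alert:", alert)

m.load_input("r3", 1)
copy_via_branch(m, "r4", "r3")
print(m.regs["r4"], m.taint["r4"])  # 1 False
```

The last two lines show the laundering problem from the implicit-flow bullet: `r4` ends up holding the tainted bit's value while carrying no taint.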