lectures.alex.balgavy.eu

Lecture notes from university.

dynamic-taint-analysis.md (3053B)


+++
title = 'Dynamic taint analysis'
+++

# Dynamic taint analysis
"Tracking interesting things"

Idea:
- label info with tags (trusted/untrusted, interesting/boring, public/secret)
- control how data and labels propagate:
    - when copying data, also copy its tag
    - clear a tag when you know the associated data is no longer "untrusted"
    - policies to check for interesting/unsafe usage of tainted data
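
The three points above (copy the tag with the data, clear it after validation, enforce a policy at a sink) can be sketched roughly as follows. This is only a toy model: `Tagged`, `copy`, `sanitize` and `run_command` are hypothetical names made up for illustration, and the "validation" is a placeholder that keeps only ASCII letters and digits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """A value paired with a taint tag (hypothetical helper)."""
    value: bytes
    tainted: bool


def copy(src: Tagged) -> Tagged:
    # Copying data also copies its tag.
    return Tagged(src.value, src.tainted)


def sanitize(src: Tagged) -> Tagged:
    # Placeholder validation: keep only ASCII letters and digits,
    # then clear the tag because the data is no longer "untrusted".
    cleaned = bytes(b for b in src.value if b < 128 and chr(b).isalnum())
    return Tagged(cleaned, False)


def run_command(arg: Tagged) -> str:
    # Policy check at a sensitive sink: tainted data must not arrive here.
    if arg.tainted:
        raise PermissionError("tainted data reached a sensitive sink")
    return "ran " + arg.value.decode()
```

Calling `run_command(copy(Tagged(b"ls; rm", True)))` raises, while the sanitized copy is accepted.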

Access policies:
- Preventing leakage of classified data
    - Bell-LaPadula: no read up, no write down
- Preserving integrity
    - Biba: no read down, no write up
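
As a rough illustration of how the two models mirror each other, here is a minimal sketch. It assumes levels are plain integers where a higher number means more confidential (Bell-LaPadula) or more trustworthy (Biba); the function names are hypothetical.

```python
def blp_allows(subject_level: int, object_level: int, op: str) -> bool:
    """Bell-LaPadula (confidentiality): no read up, no write down."""
    if op == "read":
        return object_level <= subject_level
    if op == "write":
        return object_level >= subject_level
    raise ValueError(f"unknown operation: {op}")


def biba_allows(subject_level: int, object_level: int, op: str) -> bool:
    """Biba (integrity): no read down, no write up."""
    if op == "read":
        return object_level >= subject_level
    if op == "write":
        return object_level <= subject_level
    raise ValueError(f"unknown operation: {op}")
```

For a subject at level 1, Bell-LaPadula rejects reading a level-2 object and writing a level-0 object, while Biba rejects exactly the opposite pair.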

Tainting to detect attacks
- mark all data from the network as tainted
- check whether tainted values influence control flow
    - raise an alert when a return instruction is executed with a tainted address
    - also raise an alert on
        - other calls/jumps made with tainted addresses
        - calls, rets, jumps that transfer control to tainted instructions (i.e. tainted bytes being executed)
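
The return-address check can be sketched with a toy stack frame. Everything here is an assumption for illustration: `ToyFrame` models a frame as two buffer words followed by the return address, `recv` stands in for an unchecked copy of network data, and the shadow list holds one taint bit per word.

```python
class TaintAlert(Exception):
    pass


class ToyFrame:
    """Hypothetical frame layout: [buf0, buf1, return_address]."""
    RET_SLOT = 2

    def __init__(self, return_address: int):
        self.words = [0, 0, return_address]
        self.taint = [False, False, False]  # shadow state, one bit per word

    def recv(self, data):
        # Models an unchecked copy of network data into the buffer:
        # every word written is marked tainted.
        for i, word in enumerate(data):
            self.words[i] = word
            self.taint[i] = True

    def ret(self) -> int:
        # Policy: never return to an address that came from the network.
        target = self.words[self.RET_SLOT]
        if self.taint[self.RET_SLOT]:
            raise TaintAlert(f"return to tainted address {target:#x}")
        return target
```

A two-word `recv` stays inside the buffer and `ret` succeeds; a three-word `recv` overwrites the return address and `ret` raises `TaintAlert`.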

For exploits:
- let's say you have an arbitrary write
- taint all writable data in memory, then observe whether tainted data makes it into the arguments of functions like execve

Questions for tainting:
1. what to taint?
2. how to propagate taint, and how to clean it?
3. how to use taint?
4. track bits, bytes, words, blocks... in a single color or multiple colors?
5. tainting boundaries -- only registers, or also memory? what about disk?

## What to taint
For control of information flow, taint everything.

For attack detection, taint everything from untrusted sources, and see if it ends up where it shouldn't.

For binary analysis, taint anything possible, like data typed by the user and config files.

For privacy breaches, taint privacy-sensitive data, like passwords and credit card numbers.

For vulnerability detection, taint everything that the attacker can control.

## Taint propagation
Generally, these rules hold:
- untainted + untainted = untainted
- untainted + tainted = tainted
- tainted + tainted = tainted
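
These rules amount to a logical OR over the operands' taint. A minimal sketch, assuming taint is stored as a bitmask so that the same rule also covers multiple "colors" (the `NETWORK` and `FILE` labels are hypothetical):

```python
UNTAINTED = 0b00
NETWORK = 0b01  # hypothetical color: data read from the network
FILE = 0b10     # hypothetical color: data read from a file


def propagate(taint_a: int, taint_b: int) -> int:
    """Taint of a result computed from two operands: the union of both."""
    return taint_a | taint_b
```

With a single color this reduces to the three rules above; with several colors, the result carries every color present in either operand.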

Cleaning the taint:
- when storing a constant in a destination
- maybe with MMX or floating point instructions
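
A small sketch of the first rule, assuming a tracker that keeps one taint bit per register. The register names and the helper functions are hypothetical, and the direct-move rule is included only to have something to clean.

```python
# Hypothetical shadow state: one taint bit per register.
reg_taint = {"eax": False, "ebx": False}


def taint_input(dst: str) -> None:
    """Mark a register as holding data from an untrusted source."""
    reg_taint[dst] = True


def mov_reg(dst: str, src: str) -> None:
    """mov dst, src: a direct move copies the source's taint."""
    reg_taint[dst] = reg_taint[src]


def mov_imm(dst: str) -> None:
    """mov dst, <constant>: the old value is overwritten, so taint is cleared."""
    reg_taint[dst] = False
```

After `taint_input("eax")` and `mov_reg("ebx", "eax")` both registers are tainted; `mov_imm("ebx")` cleans `ebx` and leaves `eax` untouched.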

Propagating:
- on direct moves
- maybe on arithmetic operations
- what about implicit flows (a variable whose value is determined by a tainted value, but is not set from it directly)?
    - in attack detection, mostly not propagated, which works OK
    - in leakage detection, also not propagated, but this is not fine because malware can launder taint and escape detection
- what about pointer tainting? E.g. if you use a tainted value to index a toupper() table, the result should be tainted, but for some other table it might not need to be
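
Both problems can be shown in a few lines. This is a sketch under stated assumptions: `copy_implicit` models what a tracker *without* implicit-flow propagation would record, and `lookup` takes a hypothetical `propagate_pointers` switch to model the two possible pointer-tainting policies.

```python
# Table mapping every byte to its ASCII upper-case equivalent.
TOUPPER = bytes(range(256)).upper()


def copy_implicit(secret_byte: int, tainted: bool):
    """Copy a byte bit by bit through branches instead of a direct move."""
    result = 0
    for bit in range(8):
        if secret_byte & (1 << bit):  # branch condition depends on tainted data
            result |= 1 << bit        # but only an untainted constant is stored
    # A tracker that ignores implicit flows sees only constants being stored,
    # so it records the result as untainted regardless of the input's taint.
    return result, False


def lookup(table: bytes, index: int, index_tainted: bool, propagate_pointers: bool):
    """Table lookup with a (possibly tainted) index."""
    return table[index], index_tainted and propagate_pointers
```

`copy_implicit(0x41, True)` returns the same byte with the taint gone, which is how taint can be laundered. With `propagate_pointers=False`, `lookup(TOUPPER, ord("a"), True, False)` yields an untainted `ord("A")` even though it is fully determined by tainted input.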

## Using the taint
- check that the target address of a ret/jump/call is not tainted (this cannot detect attacks that do not divert control flow)
- mark secret data as tainted and track it to check that it does not leave the app (but do we propagate on pointers and implicit flows?)
- reverse engineer the structure of a config/input file: mark it as tainted, monitor arguments of syscalls for file names and IPs, and monitor arguments of strcmp
- taint all writable data in memory, observe whether tainted data makes it into the arguments of syscalls like execve
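
The last point can be sketched with byte-granular shadow memory. This is a toy model, not a real tracer: `ShadowMemory` and `check_execve` are hypothetical, and `check_execve` only inspects the path argument of an imagined execve call.

```python
class TaintAlert(Exception):
    pass


class ShadowMemory:
    """Toy memory with one taint byte per data byte."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.taint = bytearray(size)

    def write(self, addr: int, payload: bytes, tainted: bool) -> None:
        self.data[addr:addr + len(payload)] = payload
        flag = 1 if tainted else 0
        self.taint[addr:addr + len(payload)] = bytes([flag]) * len(payload)

    def is_tainted(self, addr: int, length: int) -> bool:
        return any(self.taint[addr:addr + length])


def check_execve(mem: ShadowMemory, path_addr: int) -> bytes:
    """Sink check: refuse an execve whose path contains any tainted byte."""
    end = mem.data.index(0, path_addr)  # the path is NUL-terminated
    if mem.is_tainted(path_addr, end - path_addr):
        raise TaintAlert("tainted data in execve path")
    return bytes(mem.data[path_addr:end])
```

Writing `/bin/ls` untainted passes the check; overwriting two bytes with tainted data to form `/bin/sh` makes the same call raise `TaintAlert`.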