+++
title = 'Parser identification in embedded systems'
+++

# Parser identification in embedded systems
Data flow graph (DFG): a graph representing the data dependencies between operations
- directed graph: nodes are operations, edges carry data from one operation to the next
- a node fires when its input data are ready: it consumes data from its input ports and produces data on its output ports

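The firing rule can be illustrated with a minimal sketch. The graph below (for `(a + b) * (a - b)`), the names `GRAPH` and `run_dfg`, and the dictionary encoding are all assumptions made for illustration, not something from the lecture. Nodes are deliberately listed out of order to show that execution is driven by data availability, not by listing order.

```python
from collections import deque

# Hypothetical toy DFG for (a + b) * (a - b).
# Each node maps to (operation, names of the nodes/inputs feeding its input ports).
GRAPH = {
    "mul": (lambda x, y: x * y, ["add", "sub"]),
    "add": (lambda x, y: x + y, ["a", "b"]),
    "sub": (lambda x, y: x - y, ["a", "b"]),
}

def run_dfg(graph, inputs):
    """Fire each node once all of its input values are available."""
    values = dict(inputs)   # data produced so far, keyed by producer name
    pending = deque(graph)  # nodes that have not fired yet
    fired = []              # order in which nodes fired
    stalled = 0             # consecutive nodes examined without any firing
    while pending:
        name = pending.popleft()
        op, sources = graph[name]
        if all(s in values for s in sources):
            # Inputs are ready: consume them and produce an output value.
            values[name] = op(*(values[s] for s in sources))
            fired.append(name)
            stalled = 0
        else:
            # Not ready yet: try again after other nodes have fired.
            pending.append(name)
            stalled += 1
            if stalled >= len(pending):
                raise ValueError("inputs never become ready (cycle or missing input)")
    return values, fired

values, fired = run_dfg(GRAPH, {"a": 5, "b": 3})
print(values["mul"], fired)
```

With `a = 5` and `b = 3`, `mul` can only fire after both `add` and `sub` have produced their results, so it fires last and yields `(5 + 3) * (5 - 3) = 16`.
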
Creating a DFG: transform the code to SSA (static single assignment) form, then draw the graph from that. The result is a partial ordering of the operations.

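As a rough sketch of that construction, the code below renames straight-line code into SSA form and then derives the DFG edges from the definitions and uses. It assumes a hypothetical tuple encoding `(target, op, operands)` and handles only a single basic block, so no phi nodes are needed; `to_ssa`, `dfg_edges`, and `PROGRAM` are illustrative names.

```python
def to_ssa(statements):
    """Rename straight-line three-address code into SSA form.

    `statements` is a list of (target, op, [operand names]) tuples.
    There are no branches, so this sketch needs no phi nodes.
    """
    version = {}  # variable -> number of definitions seen so far
    current = {}  # variable -> its current SSA name
    ssa = []
    for target, op, operands in statements:
        # Uses refer to the latest definition; unseen names are inputs or constants.
        uses = [current.get(v, v) for v in operands]
        version[target] = version.get(target, 0) + 1
        current[target] = f"{target}_{version[target]}"
        ssa.append((current[target], op, uses))
    return ssa

def dfg_edges(ssa):
    """One (producer, consumer) edge per use of a value defined in the block."""
    defined = {target for target, _, _ in ssa}
    return [(u, target) for target, _, uses in ssa for u in uses if u in defined]

# x = a + b; y = a - b; x = x * y; z = x + 1
PROGRAM = [
    ("x", "+", ["a", "b"]),
    ("y", "-", ["a", "b"]),
    ("x", "*", ["x", "y"]),
    ("z", "+", ["x", "1"]),
]

ssa = to_ssa(PROGRAM)
for target, op, uses in ssa:
    print(target, "=", f" {op} ".join(uses))
print(dfg_edges(ssa))
```

Here `x_1` and `y_1` have no edge between them, so they may execute in either order, while `x_2` must wait for both and `z_1` must wait for `x_2`. That is the partial ordering: only operations connected by a path in the graph are ordered relative to each other.
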
Solution:
1. Transform the binary code into another representation that is easier to reason about
    - LLVM intermediate representation (IR): in SSA form, data flow tracking is possible
    - recursive disassembler
    - data flow normalization: model memory as an array, replace QEMU loads/stores with LLVM loads/stores, detect accesses to the stack and convert them to SSA
2. Use heuristics to select the best candidates for parser functions
    - compute a score for each feature
    - combine the scores (weighted sum) into a single score per function
    - heuristics: looped switch statement, data flow analysis on conditional statements, and others
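
The scoring in step 2 could look roughly like the sketch below. The feature names, weights, function names, and per-function scores are all made up for illustration; the notes do not give concrete values or say how each feature is measured.

```python
# Hypothetical feature names and weights; the notes do not specify them.
WEIGHTS = {
    "looped_switch": 0.5,
    "input_dependent_branches": 0.3,
    "other": 0.2,
}

def combined_score(feature_scores, weights=WEIGHTS):
    """Weighted sum of per-feature scores; a missing feature counts as 0."""
    return sum(w * feature_scores.get(name, 0.0) for name, w in weights.items())

def rank_candidates(functions, weights=WEIGHTS):
    """Return (function name, score) pairs, best parser candidate first."""
    scored = [(name, combined_score(scores, weights))
              for name, scores in functions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up per-function feature scores in [0, 1], purely for illustration.
FUNCTIONS = {
    "init_hw": {},
    "memcpy_like": {"looped_switch": 0.0, "input_dependent_branches": 0.25},
    "handle_packet": {"looped_switch": 1.0, "input_dependent_branches": 0.5},
}

for name, score in rank_candidates(FUNCTIONS):
    print(f"{name}: {score:.3f}")
```

A weighted sum keeps each heuristic independent: a new feature can be added by giving it a weight, and the top of the ranking is the list of functions worth inspecting first as parser candidates.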