+++
title = 'Parser identification in embedded systems'
+++

# Parser identification in embedded systems
Data flow graph (DFG): a graph representing data dependencies between operations
- directed graph
- a node fires when its input data are ready; it consumes data from its input ports and produces data on its output ports

Creating a DFG: transform the code to SSA (static single assignment) form, then draw the graph from that. The result is a partial ordering of operations.

Solution:
1. Transform the binary code to another representation for better reasoning
    - LLVM intermediate language: data flow tracking is possible, SSA form
    - recursive disassembler
    - data flow normalization: model memory as an array, replace QEMU load/store with LLVM load/store, detect accesses to the stack and make them SSA
2. Use heuristics to select the best candidates for parser functions
    - compute a score for each feature
    - combine the scores (weighted sum) into a single score per function
    - heuristics: looped switch statement, data flow analysis on conditional statements, and others
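The scoring step can be sketched roughly as follows. This is only an illustration of combining per-feature scores with a weighted sum: the feature names, the weights, the score range of 0 to 1, and the example functions are all assumptions made for the sketch, not values from these notes.

```python
from dataclasses import dataclass, field

# Hypothetical weights per heuristic; the notes do not specify actual values.
WEIGHTS = {
    "looped_switch": 0.5,             # switch statement inside a loop
    "input_dependent_branches": 0.3,  # conditionals whose data flow depends on input
    "other": 0.2,                     # placeholder for the remaining heuristics
}


@dataclass
class FunctionFeatures:
    name: str
    # Feature name -> score, assumed here to lie in [0, 1].
    scores: dict = field(default_factory=dict)


def combined_score(func, weights=WEIGHTS):
    """Weighted sum of the per-feature scores; missing features count as 0."""
    return sum(w * func.scores.get(feature, 0.0) for feature, w in weights.items())


def rank_candidates(functions, weights=WEIGHTS):
    """Sort functions so the most parser-like candidate comes first."""
    return sorted(functions, key=lambda f: combined_score(f, weights), reverse=True)


# Made-up example functions with made-up feature scores.
candidates = [
    FunctionFeatures("init_board"),
    FunctionFeatures("memcpy_like", {"input_dependent_branches": 0.1}),
    FunctionFeatures("handle_packet", {"looped_switch": 1.0,
                                       "input_dependent_branches": 0.8}),
]

ranked = rank_candidates(candidates)
for func in ranked:
    print(func.name, round(combined_score(func), 2))
```

With these made-up numbers, `handle_packet` ranks first because it scores highly on the two most heavily weighted features. In practice the weights would have to be tuned against functions already known to be parsers.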