+++
title = 'Firmware analysis & rehosting'
+++
# Firmware analysis & rehosting
Why is firmware analysis hard?
- platform variety: different ISAs, file formats, and peripherals, often with insufficient documentation
- fault detection: finding a vulnerability often means finding a crash; on desktop you get segfaults and stack canaries, but embedded systems may give no such signal
- scalability: embedded devices are slow, while testing may need high execution speed and fast resets
- instrumentation: source-based instrumentation is often infeasible, binary-based instrumentation may assume a specific ISA and OS, and the flash may be too small to store the modified firmware

## Rehosting
Rehosting: migrating firmware from its original hardware environment into a virtual environment

Tools:
- Unicorn: QEMU-based CPU emulation framework with Python bindings
- avatar2: multi-target (emulators/tools) orchestration system, focused on firmware analysis and rehosting

Hardware-in-the-loop (HITL) rehosting: forward hardware interaction to the real device (usually requires debugging ports or stubs)
- Avatar: an early HITL framework; requires JTAG or an injectable debug stub. Separates execution and memory, and forwards peripheral interaction to the device.
- Surrogates: high-performance HITL; eliminates the performance issues of Avatar, but requires very specialized hardware
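
The split between emulated execution and forwarded peripheral access can be pictured as an address-based dispatcher: RAM accesses are served inside the emulator, while accesses to a peripheral range go to the physical device. The following is a minimal sketch and not Avatar's actual API. `DeviceProxy`, `HitlMemory`, and the address ranges are hypothetical, and the proxy fakes a device in-process where a real setup would talk over JTAG or a debug stub.

```python
class DeviceProxy:
    """Hypothetical stand-in for a debug link (e.g. JTAG) to the real device."""

    def __init__(self):
        self.registers = {}  # fake device state, keyed by address
        self.log = []        # every forwarded access, kept for inspection

    def read(self, addr):
        self.log.append(("read", addr))
        return self.registers.get(addr, 0)

    def write(self, addr, value):
        self.log.append(("write", addr, value))
        self.registers[addr] = value


class HitlMemory:
    """Serves RAM locally and forwards the peripheral range to the device."""

    def __init__(self, proxy, mmio_start, mmio_end):
        self.proxy = proxy
        self.mmio_start = mmio_start
        self.mmio_end = mmio_end
        self.ram = {}  # emulator-side memory, keyed by address

    def _is_mmio(self, addr):
        return self.mmio_start <= addr < self.mmio_end

    def read(self, addr):
        if self._is_mmio(addr):
            return self.proxy.read(addr)
        return self.ram.get(addr, 0)

    def write(self, addr, value):
        if self._is_mmio(addr):
            self.proxy.write(addr, value)
        else:
            self.ram[addr] = value


proxy = DeviceProxy()
mem = HitlMemory(proxy, mmio_start=0x4000_0000, mmio_end=0x6000_0000)
mem.write(0x2000_0000, 0x11)  # RAM: stays in the emulator
mem.write(0x4000_0004, 0x22)  # peripheral: forwarded to the device
```

Every forwarded access costs a round trip over the debug link, which is one way to see why plain HITL setups are slow.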

Hardware-less rehosting: semi-automatically create hardware models and completely eliminate the device dependency (at the cost of a higher likelihood of inaccuracies)
- Pretender: based on avatar2; uses HITL rehosting only during learning, and the learned hardware models afterwards
- PartEMU: rehosts TrustZone operating systems; based on QEMU, with manual mapping of registers to pattern-based models
- P2IM: for Type-2/3 firmware; uses heuristics to identify register types and instantiates a model based on the register type
- HALucinator: identifies HAL functions and models them instead of the hardware, since hardware accesses typically go through a hardware abstraction layer (HAL)
- Fuzzware: treats every MMIO access as fuzzing input, and uses dynamic symbolic execution to reduce the input space
- Firmadyne: unpacks Linux-based firmware, extracts the file system, and runs it in QEMU with a custom kernel
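
The idea of treating every MMIO access as fuzzing input can be illustrated with a toy model in which each MMIO read consumes the next bytes of a fuzzer-provided buffer. This is a simplified sketch and not Fuzzware's implementation. `FuzzMmioModel`, `InputExhausted`, and `firmware_poll_status` are hypothetical names, and the symbolic-execution step that shrinks the input space is left out.

```python
class InputExhausted(Exception):
    """Raised when the fuzz input is used up, which ends the emulation run."""


class FuzzMmioModel:
    def __init__(self, fuzz_input, width=4):
        self.data = fuzz_input
        self.width = width  # bytes consumed per register read
        self.pos = 0

    def read(self, addr):
        # The address is ignored: any register read gets the next input chunk.
        chunk = self.data[self.pos:self.pos + self.width]
        if len(chunk) < self.width:
            raise InputExhausted(f"no input left for read at {addr:#x}")
        self.pos += self.width
        return int.from_bytes(chunk, "little")

    def write(self, addr, value):
        # Writes have no modelled effect in this sketch.
        pass


def firmware_poll_status(mmio):
    """Toy firmware: poll a status register until bit 0 is set, then read data."""
    polls = 0
    while mmio.read(0x4000_0000) & 1 == 0:
        polls += 1
    return polls, mmio.read(0x4000_0004)


# Three little-endian words: status=0 (not ready), status=1 (ready), data word.
model = FuzzMmioModel(bytes.fromhex("00000000" "01000000" "efbeadde"))
result = firmware_poll_status(model)
```

Note how the first word is spent on a status poll that only delays the firmware; reducing this kind of wasted input is the motivation for modelling registers more precisely.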

## Large-scale analysis
Retrieve many firmware samples and try to find vulnerabilities in some of them.

Identified problems:
- hardcoded passwords
- private certs included in firmware updates
- backdoors in plain sight
- known exploits are often reusable, even across firmware from different vendors, because of code reuse
- insecure usage of crypto libs is common
- many bugs in Linux-based firmware stem from multi-binary interactions
- insecure usage of Bluetooth Low Energy is very common in bare-metal IoT devices
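
Some of these findings, such as hardcoded password hashes and bundled private keys, can be spotted with simple pattern matching over unpacked firmware contents. A minimal sketch, assuming the image has already been unpacked into raw bytes; the patterns and the `scan_blob` helper are illustrative and not taken from any of the tools in these notes.

```python
import re

# PEM header of an (optionally typed) private key.
PRIVATE_KEY = re.compile(rb"-----BEGIN (?:RSA |EC |DSA )?PRIVATE KEY-----")
# passwd/shadow-style line whose second field is a crypt hash of the form $id$...
SHADOW_ENTRY = re.compile(
    rb"^([a-z_][a-z0-9_-]*):(\$[0-9a-z]+\$[^:\n]+):", re.MULTILINE
)


def scan_blob(blob):
    """Return (kind, detail) tuples for suspicious content in a bytes blob."""
    findings = []
    for m in PRIVATE_KEY.finditer(blob):
        findings.append(("private-key", m.start()))  # detail: byte offset
    for m in SHADOW_ENTRY.finditer(blob):
        findings.append(("password-hash", m.group(1).decode()))  # detail: user
    return findings


# Made-up sample data standing in for an unpacked file system.
sample = (
    b"\x00\x01padding"
    b"\nroot:$1$abcdefgh$0123456789abcdefghijkl:0:0:root:/root:/bin/sh\n"
    b"-----BEGIN RSA PRIVATE KEY-----\n(key material)\n"
    b"-----END RSA PRIVATE KEY-----\n"
)
findings = scan_blob(sample)
```

A scan like this only flags candidates; whether a hash is crackable or a key is actually in use still needs manual or further automated analysis.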

Approaches:
- '14: unpacking based on Binary Analysis Toolkit, static analysis only
- Firmadyne: unpacking based on binwalk, emulation/rehosting approach
- CryptoREX: unpacking based on binwalk; tries to discover violations of cryptographic 'rules' using taint analysis with angr
- Karonte: multi-binary data-flow analysis; identifies "border binaries" and tracks data flow to unsafe functions
- FirmXRay: extracts firmware from APKs, automated base address recognition, vulnerability detection via policies
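
Automated base address recognition, mentioned in the last item above, matters because a raw bare-metal image does not record where it is loaded in memory. One common heuristic scores each candidate base by how many words in the image would then point at the start of a string. The sketch below assumes a little-endian 32-bit image and is not claimed to be the exact algorithm of any tool listed here; `string_offsets`, `guess_base`, and the toy image are hypothetical.

```python
import struct


def string_offsets(blob, min_len=4):
    """Offsets where a run of at least min_len printable ASCII bytes starts."""
    offsets, start = set(), None
    for i, b in enumerate(blob + b"\x00"):  # sentinel byte closes a final run
        if 0x20 <= b < 0x7F:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                offsets.add(start)
            start = None
    return offsets


def guess_base(blob, candidates):
    """Pick the candidate under which most aligned words point at a string."""
    words = [
        struct.unpack_from("<I", blob, off)[0]
        for off in range(0, len(blob) - 3, 4)
    ]
    targets = string_offsets(blob)

    def score(base):
        return sum(1 for w in words if w - base in targets)

    return max(candidates, key=score)


# Toy image: a table of three absolute pointers (plus one zero word),
# followed by the NUL-terminated strings they point to.
TRUE_BASE = 0x0800_0000
strings = [b"boot\x00", b"uart ready\x00", b"error\x00"]
offsets, pos = [], 16
for s in strings:
    offsets.append(pos)
    pos += len(s)
image = struct.pack("<4I", *(TRUE_BASE + o for o in offsets), 0) + b"".join(strings)

guess = guess_base(image, candidates=[0x0, 0x0800_0000, 0x2000_0000])
```

With a fixed candidate list the search is trivial; real images need a much larger candidate space and more evidence than string pointers alone.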