+++
title = 'Firmware analysis & rehosting'
+++
# Firmware analysis & rehosting
Why is firmware analysis hard?
- platform variety: different ISAs, file formats, and peripherals, often with little documentation
- fault detection: finding vulnerabilities often means finding a crash. On desktop you get segfaults and stack canaries, but embedded systems may not have them
- scalability: embedded devices are slow, while testing may need high execution speed and fast resets
- instrumentation: source-based instrumentation is often infeasible, binary-based instrumentation may assume a specific ISA and OS, and flash may be too small to store the modified firmware

## Rehosting
Rehosting: migrating firmware from its original hardware environment into a virtual environment

Tools:
- Unicorn: QEMU-based CPU emulation framework with bindings for Python
- avatar2: multi-target (emulators/tools) orchestration system with a focus on firmware analysis and rehosting

Hardware-in-the-loop (HITL) rehosting: forward hardware interaction to the real device (usually requires debugging ports or stubs)
- Avatar: an early HITL framework, requires JTAG or an injectable debug stub. Separates execution and memory, forwards peripheral interaction.
- Surrogates: high-performance HITL, eliminates the performance issues of Avatar, but requires very specialized hardware

Hardware-less rehosting: semi-automatically create hardware models and completely eliminate the device dependency (at the cost of a higher likelihood of inaccuracies)
- Pretender: based on avatar2, uses HITL rehosting only during learning and hardware models afterwards
- PartEMU: rehosts TrustZone operating systems, based on QEMU, manual mapping of registers to pattern-based models
- P2IM: for Type-2/3 firmware, uses heuristics to identify register types and instantiates a model based on the register type
- HALucinator: identifies HAL functions -- hardware accesses typically go through hardware abstraction layers (HAL)
- Fuzzware: treats every MMIO access as fuzzing input, uses dynamic symbolic execution to reduce the input space
- Firmadyne: unpacks Linux-based firmware, extracts the file system, and runs it in QEMU with a custom kernel

## Large-scale analysis
Retrieve many firmware samples, try to find vulnerabilities in some of them.

Identified problems:
- hardcoded passwords
- private certs included in firmware updates
- backdoors in plain sight
- known exploits are often reusable, even across firmware from different vendors -> code-reuse
- insecure usage of crypto libs is common
- many bugs in Linux-based firmware arise from multi-binary interactions
- insecure usage of Bluetooth Low Energy is very common for bare-metal IoT devices

Approaches:
- '14: unpacking based on Binary Analysis Toolkit, only static analysis
- Firmadyne: unpacking based on binwalk, emulation/rehosting approach
- CryptoREX: unpacking based on binwalk, tries to discover violations of cryptographic 'rules', taint analysis with angr
- Karonte: multi-binary data-flow analysis -- identify "border binaries", track dataflow to unsafe functions
- FirmXRay: extracts firmware from APKs, automated base address recognition, vulnerability detection via policies
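
Two of the identified problems (hardcoded passwords and private keys shipped in update images) can be checked statically once an image is unpacked. The following is a minimal sketch of such a check, not the method of any tool listed here. It assumes the firmware has already been extracted to a directory, for example with binwalk. The function name `find_embedded_secrets`, the PEM pattern, and the set of "locked account" markers are illustrative assumptions.

```python
import os
import re

# Assumed indicator: a PEM header that marks an embedded private key.
PEM_PRIVATE_KEY = re.compile(
    rb"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
)
# Assumed account databases, in passwd/shadow "name:hash:..." format.
ACCOUNT_FILES = {"passwd", "shadow"}
# Assumed markers meaning "no usable hash stored in this field".
NO_HASH = {b"", b"*", b"!", b"!!", b"x"}


def find_embedded_secrets(root):
    """Walk an unpacked firmware tree and return sorted (path, finding) pairs."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; unpacked images contain many.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    data = fh.read()  # whole-file read: fine for a sketch
            except OSError:
                continue
            rel = os.path.relpath(path, root)
            if PEM_PRIVATE_KEY.search(data):
                findings.append((rel, "private key"))
            if name in ACCOUNT_FILES:
                for line in data.splitlines():
                    fields = line.split(b":")
                    # A real hash in field 2 means credentials ship in the image.
                    if len(fields) >= 2 and fields[1] not in NO_HASH:
                        user = fields[0].decode(errors="replace")
                        findings.append((rel, "password hash for " + user))
    return sorted(findings)
```

Calling `find_embedded_secrets("/path/to/extracted/rootfs")` on each unpacked sample gives a list that can be aggregated across vendors. This only catches the plain cases: keys stored in DER form, credentials compiled into binaries, and backdoors need the heavier analyses listed above.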