commit d664fc7b2fd074de215a4629131d2edf65dd92a3
parent b91a8b48614aec0f2a0324584378a6ee9d16de6d
Author: Alex Balgavy <alex@balgavy.eu>
Date: Mon, 10 May 2021 14:48:06 +0200
Bama: new lecture notes
Diffstat:
3 files changed, 45 insertions(+), 0 deletions(-)
diff --git a/content/binary-malware-analysis-notes/_index.md b/content/binary-malware-analysis-notes/_index.md
@@ -15,3 +15,5 @@ title = 'Binary and Malware Analysis'
10. [Taint analysis in practice](taint-analysis-in-practice)
11. [Dynamic data excavation](dynamic-data-excavation)
12. [Tracking control flow](tracking-control-flow)
+13. [Mitigating code reuse attacks (TypeArmor)](mitigating-code-reuse-attacks)
+14. [Parser identification](parser-identification)
diff --git a/content/binary-malware-analysis-notes/mitigating-code-reuse-attacks.md b/content/binary-malware-analysis-notes/mitigating-code-reuse-attacks.md
@@ -0,0 +1,19 @@
++++
+title = 'Mitigating code-reuse attacks at the binary level'
++++
+
+# Mitigating code-reuse attacks at the binary level
+Control-flow integrity:
+- promising way to stop code-reuse attacks (attacks that repurpose code already present in the program to do what the attacker wants)
+- hard to enforce in practice
+- existing binary-level CFI can't prevent function-reuse attacks
+
+An indirect call through a function pointer gives the attacker a controllable gadget, especially if the call sits in a loop.
+- source-level CFI: enforce class hierarchy, match function argument types
+- TypeArmor: approximates source-level accuracy at the binary level
+
+The idea of TypeArmor:
+- function signature matching: extract the argument count at each callsite and the argument usage in each callee, then allow only targets whose function type matches
+- in a callee, look for argument registers that are read before they are written
+- in a callsite, see which argument registers are set: on a function entry point continue, on a return edge stop
+- at runtime, check that the number of arguments matches
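
The signature-matching idea in the TypeArmor notes above can be illustrated with a minimal sketch. This is not TypeArmor's implementation: the toy instruction format (a pair of read and written register sets), the straight-line-only analysis, and the helper names `callee_arg_count`, `callsite_arg_count`, and `call_allowed` are all assumptions made for illustration. The register order is the System V x86-64 integer argument order. The policy shown (a callee may not consume more arguments than the callsite prepares) is one conservative reading of "matching function types".

```python
# System V x86-64 integer argument registers, in argument order.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def callee_arg_count(instructions):
    """Estimate how many arguments a callee consumes.

    `instructions` is a hypothetical toy IR: a list of (reads, writes)
    pairs, each a set of register names, in execution order.
    An argument register counts as consumed if it is read before
    anything in the function has written it.
    """
    written = set()
    consumed = 0
    for reads, writes in instructions:
        for reg in reads:
            if reg in ARG_REGS and reg not in written:
                consumed = max(consumed, ARG_REGS.index(reg) + 1)
        written |= set(writes)
    return consumed


def callsite_arg_count(instructions_before_call):
    """Estimate how many arguments a callsite prepares.

    Simplification (assumption): take the highest-numbered argument
    register written in the straight-line code before the call.
    """
    prepared = 0
    for _reads, writes in instructions_before_call:
        for reg in writes:
            if reg in ARG_REGS:
                prepared = max(prepared, ARG_REGS.index(reg) + 1)
    return prepared


def call_allowed(prepared, consumed):
    """Allow the indirect call only if the callee does not consume
    more arguments than the callsite prepared."""
    return consumed <= prepared


# Toy callee: reads rdi, overwrites rdx, then reads rdx and rsi.
callee = [
    ({"rdi"}, {"rax"}),
    (set(), {"rdx"}),
    ({"rdx", "rsi"}, {"rax"}),
]
# Toy callsite: sets rdi, rsi, rdx before the indirect call.
callsite = [
    (set(), {"rdi"}),
    (set(), {"rsi"}),
    ({"rax"}, {"rdx"}),
]
```

In this toy input `rdx` is written inside the callee before it is read, so it does not count as a consumed argument, while `rdi` and `rsi` do.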
diff --git a/content/binary-malware-analysis-notes/parser-identification.md b/content/binary-malware-analysis-notes/parser-identification.md
@@ -0,0 +1,24 @@
++++
+title = 'Parser identification in embedded systems'
++++
+
+# Parser identification in embedded systems
+Data flow graph: graph representing data dependencies between operations
+- directed graph
+- nodes fire when their input data are ready: they consume data from input ports and produce data on output ports
+
+Create DFG: transform code to SSA (static single assignment), then draw the graph from that. You get a partial ordering of operations.
+
+Solution:
+1. Transform binary code to another representation for better reasoning
+ - LLVM intermediate representation (IR): data flow tracking is possible, SSA form
+ - recursive disassembler
+ - data flow normalization: model memory as an array, replace QEMU load/store with LLVM load/store, detect stack accesses and convert them to SSA
+2. Use heuristics to select best candidates for parser functions
+ - compute score according to each feature
+ - combine the scores (weighted sum) into a single score per function
+ - heuristics: looped switch statement, data flow analysis on conditional statements, and others
+
+
+
+
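
The DFG construction described in the parser-identification notes (convert to SSA, then draw the graph) can be sketched roughly as follows. The tiny SSA program, its tuple format, and the helper names `build_dfg` and `schedule` are hypothetical and chosen only for illustration. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (3.9+) to produce one valid execution order out of the partial ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical SSA program: (defined value, operation, operand values).
# In SSA every value is defined exactly once.
SSA_PROGRAM = [
    ("a1", "load", []),
    ("b1", "load", []),
    ("c1", "add", ["a1", "b1"]),
    ("d1", "mul", ["c1", "a1"]),
    ("e1", "neg", ["b1"]),
]


def build_dfg(program):
    """Build a data flow graph: each value maps to the set of values
    it depends on (its incoming edges)."""
    dfg = {}
    for name, _op, operands in program:
        if name in dfg:
            raise ValueError(f"{name} defined twice: input is not in SSA form")
        dfg[name] = set(operands)
    return dfg


def schedule(dfg):
    """Return one order in which every operation comes after the
    operations that produce its inputs."""
    return list(TopologicalSorter(dfg).static_order())


dfg = build_dfg(SSA_PROGRAM)
order = schedule(dfg)
```

The ordering is only partial: `d1` and `e1` do not depend on each other, so either may run first, whereas `c1` must always follow `a1` and `b1`.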