index.md - lectures.alex.balgavy.eu - Lecture notes from university.

index.md (9828B)
      1 +++
      2 title = 'Defenses and bypassing them'
      3 +++
      4 # Defenses and bypassing them
      5 
      6 Techniques that can make attacks harder:
      7 - protect sensitive data from leakage/corruption
      8 - make using corruption for code execution harder
      9 - detect undefined behavior
     10 
     11 Static analysis: can we prove code never results in undefined behavior?
     12 - incomplete due to halting problem
     13 
     14 Dynamic instrumentation: runtime checks for undefined behavior
     15 - negative performance impact
     16 - false positives mean crashing a correct program
     17 
     18 ![Defenses overview](defenses-overview.png)
     19 
     20 ## Stack canaries
     21 Value between local vars and return address, compiler initializes with random value in function prologue.
     22 On return, check whether value is still the same, and crash if it isn't.
     23 
     24 Enable with `-fstack-protector`.
     25 `%fs:40` is randomized at thread creation, canary is never left in a register, and is moved to a position in the stack, as a local variable.
     26 
     27 Can still exploit, if we can jump over the canary. E.g. in this example, overwrite the len variable so memcpy writes to it:
     28 
     29 ```c
     30 void echo(int fd) {
     31     long len;
     32     char name[64], reply[128];
     33     /* ... */
     34     read(fd, name, 128);
     35     memcpy(reply + len, name, 64);
     36     /* ... */
     37 }
     38 ```
     39 
     40 Three approaches:
     41 - jump over canary: corrupt pointer/index used to write memory, write target address past canary. need two writes - either two vulnerabilities or a loop.
     42 - leaking canaries: leak canary, overwrite with same value. need leak vulnerability, ability to provide more input after leak and buffer overflow with new input.
     43 - brute force canaries: guessing randomly out of 2⁶⁴ possibilities is impossible. but can brute force if we can corrupt part of canary, program restarts with same canary, and get feedback about crashes.
     44     - overflow into canary by one byte, try all values and find which is correct for first byte (doesn't crash). Try next bytes until full canary found.
     45     - max 8×2⁸ attempts needed
     46 
     47 ## Data execution prevention (DEP)
     48 OS marks data pages as non-executable (no-execute bit in page tables).
     49 - attempt to execute data (like shellcode) gives segfault
     50 
     51 W⊕X: no memory is both writable and executable.
     52 
     53 We can still exploit this, but no longer by jumping to own shellcode - need to reuse existing code.
     54 
     55 Can use shared library functions (return-to-libc), or chain together parts of code (return-oriented programming).
     56 
     57 ### Ret2libc
     58 On x86\_32, params pass on the stack, so params to function can be easily set.
     59 
     60 In 64-bit, parameters are passed in registers, so only works if parameter is in %rdi.
     61 Maybe reuse additional code to load desired %rdi value (e.g. with a piece of code that pops to %rdi and returns).
     62 
     63 ### Return oriented programming (ROP)
     64 Any sequence of instructions ending in RET is a 'gadget', which can be chained together.
     65 Return oriented programming is a chain of such gadgets.
     66 
     67 Might want some useful ones:
     68 - load value to register (pop)
     69 - read from memory (mov (reg), reg)
     70 - write to memory (mov reg, (reg))
     71 - computation (add, sub, etc.)
     72 - syscall
     73 
     74 Given the right gadgets, can do arbitrary computation without any code.
     75 Tools help generate ROP chains, e.g. [Ropper](https://github.com/sashs/Ropper)
     76 
     77 If we run out of writable stack, build a new stack anywhere in memory (such as attacker-controlled heap memory).
     78 Function epilogue can load stack pointer (by using it twice).
     79 This is a "stack pivot".
     80 
     81 If we can't find the right gadget, use libraries, especially libc.
     82 Can jump into middle of an instruction.
     83 
     84 ## Address space layout randomization (ASLR)
     85 Randomizes memory addresses of code, data, heap, and stack.
     86 Prevents attacker from finding code pointer to overwrite, or knowing what to overwrite it with.
     87 
     88 Stack and heap randomized by OS alone:
     89 - stack: determined by %rsp, set in `execve()`
     90 - heap: brk() and mmap() return values
     91 
     92 Code and data require compiler support
     93 - absolute code/data refs would break with randomized addresses
     94 - position independent code (PIC) forces all pointers to be relative to instruction pointer
     95     - libraries used PIC to prevent clashes in address space
     96     - executables can also use PIC (`-fPIE`)
     97 
     98 Can still be attacked:
     99 - relative pointers don't change
    100 - if one pointer is leaked, all others can be computed
    101 - so leak pointers from stack/heap, or use side channels to recover complete address space
    102 
    103 ### Information hiding
    104 ASLR: randomness limited to base:
    105 - first shared object is loaded at random position
    106 - next object located right below (lower addresses) the last object
    107 - ⇒ all libraries located side by side at single random place
    108 
    109 Shadow stack:
    110 - move sensitive data (e.g. return address) on separate stack
    111 - make sure shadow stack is protected
    112 - real hiding:
    113     - no pointer in memory should refer to secret location of shadow stack
    114     - only dedicated register (e.g. `%gs:108`) points to shadow stack
    115 - how to find it:
    116     - address space typically has some holes
    117     - don't look for hidden region, look for holes -- one larger and one smaller -- surrounding the region
    118     - even if we remove all pointers, we still have the size of the hole left that indicates the hidden region
    119     - repeatedly allocate large chunks of memory until we find the "right size": if allocation succeeds, hole it is that size or bigger. if fails, hole is smaller.
    120     - ephemeral allocation: allocate and free within the request
    121     - persistent allocation: allocate 'permanently' (can also use ephemeral and just not complete the request)
    122     - steps:
    123         1. determine large hole using ephemeral allocation
    124         2. allocate large hole using persistent allocation
    125         3. run ephemeral allocation algorithm again, giving us small hole
    126 
    127 ## Control-flow integrity
    128 Prevent turing completeness with ROP chains.
    129 
    130 idea:
    131 - only allow "legitimate branches and calls"
    132 - i.e. those that follow control flow graph
    133 - give valid targets a label, and check that we don't branch anywhere else
    134 
    135 May be combined with runtime shadow stack.
    136 
    137 in practice
    138 - requires precise CFG (from source code or debug info)
    139 - large performance overhead
    140 - so, coarse-grained CFI:
    141     - one common label for all call sires
    142     - one common label for all entry points
    143 
    144 gadgets left after this:
    145 - entry point gadget - jump to entry point and go up to next indirect call/jump
    146 - call site gadget - jump to call site, go to the next return
    147 
    148 ### Data oriented programming
    149 Perfect CFI means CFG is never violated.
    150 But, data guides code through CFG, so by manipulating data we change control flow.
    151 
    152 Attacker can overwrite data e.g. using buffer overflow, and overwritten data drives a dispatching data
    153 - loop executes as often as attacker wants
    154 - loop performs several operations, providing gadgets
    155 - attacker controls values and pointers used in the operations
    156 
    157 ![Data oriented programming example](dop-example.png)
    158 
    159 Approach:
    160 - find gadget dispatcher (attacker-controlled loop)
    161 - identify and classify gadgets
    162 - convert workload into a sequence of gadget operations
    163 - build a sequence of buffers to trigger those operations
    164 - send the buffers to the target machine
    165 
    166 ## Sanitizers
    167 Fully detect or mostly detect particular vulnerabilities.
    168 Big performance overhead, so not really used in production settings, but useful for debugging and testing.
    169 
    170 Sanitizers in GCC/LLVM
    171 - ASan: buffer overflow, memory leak, use-after-free
    172 - Leak sanitizer: memory leaks
    173 - TSan: race conditions
    174 - UBSan: undefined arithmetic
    175 - MSan: uninitialized read
    176 
    177 Address sanitizer (ASan):
    178 - detects buffer overflow and use-after-free
    179 - shadow memory tracks allocation status:
    180     - add check before memory access
    181     - shadow addr = (addr >> 3) + offset
    182     - red zones between allocations
    183     - deallocated memory in quarantine
    184 - drawbacks: compatibility (if program depends on memory layout), performance, incomplete (no overflows in structs, misses overflows that jump over read zone, use-after-free still possible with memory massaging)
    185 
    186 Memory sanitizer (MSan):
    187 - detects uninitialized reads
    188 - loading values allowed
    189     - uninitialized status tracked in shadow memory
    190     - 1-y-1 per-bit shadow mapping
    191     - allocation: memory poisoned
    192     - computation: poison propagated based on operation
    193 - poison checked at: conditional branches, syscalls, pointer derefs
    194 
    195 ### Research topics
    196 Delta pointers: fast buffer overflow detection
    197 - tagged pointers use some pointer bits for metadata
    198 - checks are implicit using MMu
    199 - cannot detect all cases (like underflow), but much better performance
    200 
    201 SafeInit: automatically initialize to zero
    202 - compiler: every local variable
    203 - allocator: every heap allocation
    204 - optimizations:
    205     - only close to first use
    206     - only one byte for strings
    207     - dead store elimination: prevent initializations that are later overwritten in all cases
    208     - rely on OS zeroing for large heap allocations
    209 
    210 DangSan: prevents use after free
    211 - invalidates dangling pointers
    212 - keep list of pointers to each object
    213 - on pointer assignment: keep track of new pointer to object
    214 - free: set most significant bit of remaining pointers
    215 - complications:
    216     - which object does pointer point to? use shadow memory
    217     - what if multiple threads copy pointers to same object? use lock-free data structure (it's a write-mostly workload, so use per-thread append-only log)
    218 
    219 Type-after-type
    220 - allow dangling pointers, but only to same type
    221 - separate heap and stack for each type
    222     - never reuse memory used for one type for another
    223     - dangling pointer keeps pointing to same type
    224 - challenge: type inference
    225     - use static analysis to guess from context
    226     - explicit type on stack and with new operator, trace result to pointer cast, or sizeof in malloc size
    227 
    228 TypeSan
    229 - type confusion mitigation requires knowing runtime type of object on static_cast
    230 - use shadow memory, translate each pointer to set of allowable casts (which is determined at compile time)
	lectures.alex.balgavy.eu Lecture notes from university.
	git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
	Log \| Files \| Refs \| Submodules