lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git

packers.md (4840B)


+++
title = 'Packers'
+++

# Packers
## Binary packers
A packer takes a binary program P and produces a new program that contains an unpacker and a packed version of P.
- the loader loads the new binary (the unpacker), which then unpacks and loads the original program

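The idea can be sketched with a toy packer for Python source instead of native code. This is only an analogy under stated assumptions: `pack` and `STUB` are names made up for this example, and a real packer works on PE/ELF binaries with a machine-code unpacker.

```python
import base64
import zlib

# Template for the generated program: an unpacker stub plus the packed payload.
STUB = """import base64, zlib
PACKED = {packed!r}
exec(zlib.decompress(base64.b64decode(PACKED)).decode())
"""


def pack(program: str) -> str:
    """Return a new program that unpacks and runs `program` (toy example)."""
    packed = base64.b64encode(zlib.compress(program.encode())).decode()
    return STUB.format(packed=packed)


original = "result = sum(range(10))"
packed_program = pack(original)

# The original source is no longer visible in the packed program...
assert original not in packed_program

# ...but running the packed program still runs the original code.
namespace = {}
exec(packed_program, namespace)
print(namespace["result"])
```

Static inspection of `packed_program` only shows the stub and an opaque blob, which is the property malware authors are after.
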
## What's a binary?
A binary is code in a binary format (PE on Windows, ELF on Linux, Mach-O on macOS).

The format
- defines what the file looks like on disk and in memory
- contains info about the machine to run it on, whether it is an executable or a library, the entry point, and the sections

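Each format starts with a recognisable magic number, which is roughly what `file` looks at first. A minimal sketch: the ELF magic is the one given in these notes, while the PE (`MZ`) and little-endian Mach-O values are added here for illustration, and `detect_format` is a hypothetical helper.

```python
# Leading bytes of each format. Only the ELF magic comes from the notes;
# the PE and Mach-O values are added for illustration.
MAGICS = {
    b"\x7fELF": "ELF",
    b"MZ": "PE",                        # DOS header that precedes the PE header
    b"\xce\xfa\xed\xfe": "Mach-O",      # 32-bit, little-endian
    b"\xcf\xfa\xed\xfe": "Mach-O",      # 64-bit, little-endian
}


def detect_format(data: bytes) -> str:
    """Guess the binary format from the first bytes of a file."""
    for magic, name in MAGICS.items():
        if data.startswith(magic):
            return name
    return "unknown"


print(detect_format(b"\x7fELF\x02\x01\x01"))
print(detect_format(b"MZ\x90\x00"))
print(detect_format(b"#!/bin/sh\n"))
```
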
ELF format:
- used for executables, libraries, and others, on many architectures and OSes
- dual nature
    - view on logical sections (`.data`, `.text`, `.bss`, etc.): described by the section header table
    - view on structure in memory (which segments are executable and which are read/write data, and how large they are): described by the program header table
- structure
    - ELF header at the beginning: magic number `7F 45 4C 46`, file type, architecture, entry point, offsets of the program and section header tables, index of the section name string table
    - program headers divide the data into segments, providing an easy mapping from file data to memory
        - array of structures holding the segment type, position in the ELF file, address in memory, physical address, size on disk, size in memory, flags for r/w/x, alignment in memory
    - section headers define sections
        - one entry for each section: name (as an index into the string table), what kind of info it holds, flags for write/alloc/exec, base address in memory, location in the ELF file, some other info
- ELF program headers contain everything the kernel needs to load the file
- sections:
    - examples:
        - `.text`: code
        - `.data`: initialised data
        - `.bss`: uninitialised data
        - `.got`/`.plt`: for dynamic linking
        - `.ctors`/`.dtors`: constructors/destructors
    - used at link time
    - do not have a predefined structure themselves, but are described by section headers that do
- symbol tables:
    - SYMTAB: contains all symbols needed to link/debug files, not needed for running
    - DYNSYM: contains symbols for dynamic linking; loaded into memory at runtime, so kept as small as possible

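The ELF header fields listed above can be read with `struct`. A minimal sketch, assuming a little-endian ELF64 file; `ElfHeader` and `parse_elf_header` are names chosen for this example, and the sample header is synthetic so the block runs anywhere.

```python
import struct
from typing import NamedTuple

# Layout of Elf64_Ehdr for little-endian files (64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


class ElfHeader(NamedTuple):
    ident: bytes      # magic number, class, endianness, ...
    file_type: int    # e.g. 2 = executable, 3 = shared object / PIE
    machine: int      # architecture, e.g. 62 = x86-64
    version: int
    entry: int        # entry point address
    phoff: int        # offset of the program header table
    shoff: int        # offset of the section header table
    flags: int
    ehsize: int
    phentsize: int
    phnum: int
    shentsize: int
    shnum: int
    shstrndx: int     # index of the section name string table


def parse_elf_header(data: bytes) -> ElfHeader:
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    return ElfHeader(*ELF64_HEADER.unpack_from(data))


# Synthetic header for demonstration (not taken from a real binary).
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
sample = ELF64_HEADER.pack(ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0)
header = parse_elf_header(sample)
print(hex(header.entry), header.phoff, header.phnum)
```

On a Linux machine the same function should work on a real file, e.g. `parse_elf_header(open("/bin/ls", "rb").read())`, and can be cross-checked against `readelf -h`.
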
### Stripped binaries
The symbol table can be removed with `strip -s <program>`
- the dynamic symbol table has to be preserved for functions imported from shared libraries
- all other names of functions and variables are gone

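One way to check for a stripped binary is to look for a section of type `SHT_SYMTAB` in the section header table. A rough sketch, assuming little-endian ELF64; `section_types`, `is_stripped`, and the `fake_elf` demo helper are hypothetical names, and the test images contain only headers.

```python
import struct

SHT_SYMTAB = 2    # full symbol table (.symtab)
SHT_DYNSYM = 11   # dynamic symbol table (.dynsym)


def section_types(data: bytes) -> list[int]:
    """Return sh_type of every section header (little-endian ELF64 only)."""
    (shoff,) = struct.unpack_from("<Q", data, 0x28)            # e_shoff
    shentsize, shnum = struct.unpack_from("<HH", data, 0x3A)   # e_shentsize, e_shnum
    return [
        struct.unpack_from("<I", data, shoff + i * shentsize + 4)[0]
        for i in range(shnum)
    ]


def is_stripped(data: bytes) -> bool:
    return SHT_SYMTAB not in section_types(data)


def fake_elf(types: list[int]) -> bytes:
    """Demo helper: ELF header followed by empty sections of the given types."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    header = struct.pack(
        "<16sHHIQQQIHHHHHH",
        ident, 2, 62, 1, 0, 0, 64, 0, 64, 56, 0, 64, len(types), 0,
    )
    sections = b"".join(
        struct.pack("<IIQQQQIIQQ", 0, t, 0, 0, 0, 0, 0, 0, 0, 0) for t in types
    )
    return header + sections


print(is_stripped(fake_elf([0, SHT_SYMTAB, SHT_DYNSYM])))  # has .symtab
print(is_stripped(fake_elf([0, SHT_DYNSYM])))              # only .dynsym left
```
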
### Functions and global symbols
Addresses of global symbols imported from external libraries are computed when the binary is loaded into memory
- two options: load-time relocation, or PIC (position-independent code: freely relocatable, adds a level of indirection via the global offset table and procedure linkage table)
- every time code has to reference a global symbol, it uses the Global Offset Table (GOT, `.got`) in the data section
- at runtime, GOT entries are modified by the dynamic linker to point to the intended data

If code needs to call a function in a different module, the call goes through an array of read-only jump stubs: the Procedure Linkage Table (PLT, `.plt`)
- stubs use GOT entries to call the right function
- lazy binding: GOT entries initially point to the resolver via their `.plt` counterpart, and are patched on the first call
- relocation is confined to `.got` and `.got.plt` rather than `.text`

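Lazy binding can be illustrated with a toy model in which a dictionary plays the role of the GOT and closures play the role of PLT stubs. This is a simulation only: `LazyBinder` is a made-up class, and a real dynamic linker patches addresses in memory instead of swapping Python callables.

```python
import math


class LazyBinder:
    """Toy model of PLT/GOT lazy binding (not how a real linker is written)."""

    def __init__(self, libraries):
        self.libraries = libraries   # stands in for the loaded shared libraries
        self.got = {}                # symbol name -> callable, like GOT slots
        self.resolutions = 0         # how often the resolver had to run

    def _resolve(self, name):
        for lib in self.libraries:
            if hasattr(lib, name):
                self.resolutions += 1
                return getattr(lib, name)
        raise LookupError(name)

    def plt(self, name):
        """Return the 'PLT stub' for `name`: a jump through its GOT slot."""

        def resolver(*args):
            self.got[name] = self._resolve(name)   # patch the GOT entry
            return self.got[name](*args)

        self.got.setdefault(name, resolver)        # slot starts at the resolver

        def stub(*args):
            return self.got[name](*args)           # indirect call via the GOT

        return stub


binder = LazyBinder([math])
sqrt = binder.plt("sqrt")
print(sqrt(16.0), sqrt(9.0), binder.resolutions)
```

The stub itself never changes; only the GOT slot does, which mirrors why relocation can stay out of `.text`.
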
### Process creation in Linux
- the kernel loads the segments defined by the program headers into memory
    - if an interpreter (e.g. the dynamic linker) is defined, it is loaded too
- the kernel sets up the stack and starts execution at the interpreter's entry point
    - if there is no interpreter, it uses the program's own entry point

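Whether an interpreter is defined can be read from the program headers: a `PT_INTERP` entry points at the interpreter's path inside the file. A minimal sketch, assuming little-endian ELF64; `program_headers` and `interpreter` are hypothetical helpers, and the image below is synthetic with an example loader path.

```python
import struct

PT_LOAD, PT_INTERP = 1, 3
# Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align
PHDR = struct.Struct("<IIQQQQQQ")


def program_headers(data: bytes) -> list[tuple]:
    (phoff,) = struct.unpack_from("<Q", data, 0x20)            # e_phoff
    phentsize, phnum = struct.unpack_from("<HH", data, 0x36)   # e_phentsize, e_phnum
    return [PHDR.unpack_from(data, phoff + i * phentsize) for i in range(phnum)]


def interpreter(data: bytes):
    """Return the interpreter path, or None if no PT_INTERP segment exists."""
    for p_type, _flags, offset, _vaddr, _paddr, filesz, _memsz, _align in program_headers(data):
        if p_type == PT_INTERP:
            return data[offset:offset + filesz].rstrip(b"\0").decode()
    return None


# Synthetic image: ELF header (64 bytes) + one program header (56) + path.
path = b"/lib64/ld-linux-x86-64.so.2\0"
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
header = struct.pack("<16sHHIQQQIHHHHHH", ident, 3, 62, 1, 0x1040, 64, 0, 0, 64, 56, 1, 64, 0, 0)
phdr = PHDR.pack(PT_INTERP, 4, 120, 120, 120, len(path), len(path), 1)
image = header + phdr + path

print(interpreter(image))
```

If `interpreter` returns `None`, execution would start at the program's own entry point instead.
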
### ELF auxiliary vectors
Mechanism to transfer kernel-level info to user processes (such as a pointer to the system call entry point in memory).

ELF loader:
- parses the ELF file
- maps the various program segments into memory
- sets up the entry point, initialises the process stack
- puts the ELF auxiliary vectors on the process stack, along with argc, argv, envp

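On Linux the auxiliary vector of a process is also exposed as `/proc/<pid>/auxv`, a sequence of (type, value) pairs ending with `AT_NULL`. A minimal sketch of a parser, assuming a 64-bit little-endian process; `parse_auxv` is a hypothetical helper and the sample data is synthetic.

```python
import struct

# A few auxiliary vector types from <elf.h>.
AT_NULL = 0           # end of vector
AT_PHDR = 3           # address of the program headers in memory
AT_PAGESZ = 6         # system page size
AT_ENTRY = 9          # entry point of the program
AT_SYSINFO_EHDR = 33  # address of the vDSO


def parse_auxv(raw: bytes) -> dict[int, int]:
    """Parse raw auxv bytes (assumes 64-bit little-endian entries)."""
    entries = {}
    for a_type, a_val in struct.iter_unpack("<QQ", raw):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries


# Synthetic vector for demonstration.
raw = struct.pack("<6Q", AT_PAGESZ, 4096, AT_ENTRY, 0x401000, AT_NULL, 0)
auxv = parse_auxv(raw)
print(auxv[AT_PAGESZ], hex(auxv[AT_ENTRY]))
```

On a 64-bit Linux system, `parse_auxv(open("/proc/self/auxv", "rb").read())` should give the real values for the running interpreter.
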
## Packers
Initially meant for compression, but convenient for malware to evade antivirus; many packers also include anti-debugging techniques.

We want to run the malware, let it unpack itself, and dump memory at the right moment (when it's completely unpacked).
- the right moment is when you see "normal behavior"
- check system calls, e.g. using `strace`
- you can dump memory in gdb: `dump binary memory dump_name start end`

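The `start` and `end` addresses for the gdb command can be taken from `/proc/<pid>/maps`. A small sketch that turns a maps listing into `dump binary memory` commands; dumping only executable regions is a choice made for this example, `dump_commands` is a hypothetical helper, and the sample listing is made up.

```python
def dump_commands(maps_text: str, prefix: str = "dump") -> list[str]:
    """Build one gdb dump command per executable region in a maps listing."""
    commands = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        addr_range, perms = fields[0], fields[1]
        if "x" not in perms:          # assumption: only executable regions matter
            continue
        start, end = addr_range.split("-")
        commands.append(f"dump binary memory {prefix}_{start}.bin 0x{start} 0x{end}")
    return commands


# Made-up maps listing for demonstration.
sample = """\
00400000-00401000 r-xp 00000000 08:01 131 /tmp/packed
00600000-00601000 rw-p 00000000 08:01 131 /tmp/packed
7f0000000000-7f0000021000 rwxp 00000000 00:00 0
"""
for command in dump_commands(sample):
    print(command)
```

An anonymous region that is both writable and executable, like the last one in the sample, is a plausible place to look for unpacked code.
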
## Analysing a binary
### Static
`file`: determine file type

`readelf`: display information about the contents of ELF files
- `-h`: file header
- `-l`: program headers
- `-S`: section headers
- `-s`: symbol table

`ldd`: print shared library dependencies

`nm`: list symbols from object files
- `-D`: list dynamic symbols

`strings`: print strings of printable characters

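What `strings` does can be approximated in a few lines: find runs of printable ASCII of a minimum length. This is a simplified sketch and not the tool's actual implementation (the real tool has more options and encodings); the function name and sample bytes are made up.

```python
import re


def strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up bytes: two long-enough strings and one that is too short.
sample = b"\x00\x01libc.so.6\x00\x7f\x02puts\x00ab\x00"
print(strings(sample))
```

On a packed binary this kind of scan typically finds far fewer meaningful strings than on the unpacked original.
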
### Dynamic methods
`/proc/<pid>`: general information about the process with ID `<pid>`
- `/cmdline`: command line
- `/environ`: environment
- `/maps`: memory map

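`cmdline` and `environ` are stored as NUL-separated byte strings, so they need a little parsing. A minimal sketch with hypothetical helper names and synthetic sample data; note that it drops empty entries, which would lose empty arguments.

```python
def split_nul(raw: bytes) -> list[str]:
    """Split NUL-separated data as found in /proc/<pid>/cmdline."""
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def parse_environ(raw: bytes) -> dict[str, str]:
    """Turn /proc/<pid>/environ content into a name -> value mapping."""
    return dict(item.split("=", 1) for item in split_nul(raw) if "=" in item)


# Synthetic samples for demonstration.
cmdline = b"./packed\0--verbose\0"
environ = b"HOME=/root\0PATH=/usr/bin:/bin\0"
print(split_nul(cmdline))
print(parse_environ(environ))
```

On Linux, `split_nul(open("/proc/self/cmdline", "rb").read())` should show the command line of the running interpreter.
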
`strace`: traces system calls performed by a process
- can also follow child processes, show signals, decode syscall arguments
- `-i`: print instruction pointer at time of syscall

`ltrace`: traces dynamically linked library calls