packers.md (4840B)
+++
title = 'Packers'
+++

# Packers
## Binary packers
A packer takes a binary program P and produces a new program that contains an unpacker and a packed version of P.
- the loader loads the new binary (the unpacker); the unpacker then unpacks and loads the original program

## What's a binary?
A binary is code in a binary format (PE for Windows, ELF for Linux, Mach-O for macOS).

The format
- defines what the file looks like on disk and in memory
- contains info about the machine to run it on, whether it is an executable or a library, the entry point, and the sections

ELF format:
- used for executables, libraries, and other object files, on many architectures and OSes
- dual nature
    - view on logical sections: described by the section header table (`.data`, `.text`, `.bss`, etc.)
    - view on structure in memory: which segments are executable and which are read/write (data), and how large they are; described by the program header table
- structure
    - ELF header at the beginning: magic number `7F 45 4C 46`, file type, architecture, entry point, offsets of the program and section header tables, index of the section-name string table
    - program headers divide the data into segments, providing an easy mapping from file data to memory
        - array of structures holding the type of segment, position in the ELF file, address in memory, physical address, size on disk, size in memory, flags for r/w/x, alignment in memory
    - section headers define sections
        - one entry for each section: index of its name in the string table, what kind of info it holds, flags for write/alloc/exec, base address in memory, location in the ELF file, some other info
    - the ELF program headers contain everything the kernel needs to load the file
- sections:
    - examples:
        - `.text`: code
        - `.data`: initialised data
        - `.bss`: uninitialised data
        - `.got`/`.plt`: for dynamic linking
        - `.ctors`/`.dtors`: constructors/destructors
    - used at link time
    - do not have a predefined structure themselves, but are described by section headers that do
- symbol tables:
    - SYMTAB: contains all symbols needed to link/debug files, not needed for running
    - DYNSYM: contains symbols for dynamic linking, loaded in memory at runtime so kept as small as possible

### Stripped binaries
The symbol table can be removed with `strip -s <program>`
- the dynamic symbol table has to be preserved for functions imported from shared libraries
- all other names of functions and variables are gone

### Functions and global symbols
The addresses of global symbols imported from external libraries are computed when the binary is loaded in memory
- the code can either be relocated at load time or be position-independent code (PIC): freely relocatable code that adds a level of indirection via the global offset table and the procedure linkage table
- every time code has to reference a global symbol, it uses the Global Offset Table (GOT, `.got`) in the data section
- at runtime, GOT entries are modified by the dynamic linker to point to the intended data

If code needs to call a function in a different module, it goes through an array of read-only jump stubs created by the linker: the Procedure Linkage Table (PLT, `.plt`)
- stubs use GOT entries to call the right function
- lazy binding: the GOT entries initially point back into their `.plt` counterpart, which calls the resolver on first use
- relocation is confined to `.got` and `.got.plt` rather than `.text`

### Process creation in Linux
- kernel loads the segments defined by the program headers into memory
- if an interpreter (`PT_INTERP`, normally the dynamic linker) is defined, the kernel loads it too
- kernel sets up the stack and starts at the interpreter's entry point
- if there is no interpreter, it starts at the program's entry point

### ELF auxiliary vectors
Mechanism to transfer kernel-level info to user processes (such as a pointer to the system call entry point in memory).

ELF Loader:
- parses the ELF file
- maps the various program segments in memory
- sets up the entry point, initializes the process stack
- puts the ELF auxiliary vectors on the process stack, along with argc, argv, envp

## Packers
Packers were initially built for compression, but they are also convenient for malware that wants to evade antivirus detection, and many packers include anti-debugging techniques.
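
To see why compression alone already hides content from a naive byte-pattern scan, here is a minimal sketch. It models "packing" as `zlib` compression of a byte string; this is an assumption for illustration only, since a real packer works on an executable format and ships an unpacker stub with the packed data. The names `pack`, `unpack`, and the signature string are hypothetical.

```python
import zlib


def pack(program: bytes) -> bytes:
    """Return a compressed copy of the program bytes (the 'packed version of P')."""
    return zlib.compress(program, 9)


def unpack(packed: bytes) -> bytes:
    """Model of what an unpacker does at runtime: restore the original bytes."""
    return zlib.decompress(packed)


# Stand-in for a real binary: bytes containing a string a scanner might search for.
signature = b"connect_to_c2_server"  # hypothetical byte pattern
original = b"\x7fELF" + signature * 8

packed = pack(original)

found_in_original = signature in original   # pattern is visible in the plain bytes
found_in_packed = signature in packed       # pattern is not visible once compressed
restored_ok = unpack(packed) == original    # unpacking restores the exact bytes

print(found_in_original, found_in_packed, restored_ok)
```

The pattern only reappears after `unpack` runs, which is why the approach described next waits for the unpacker to finish before dumping memory.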

We want to run the malware, let it unpack itself, and dump memory at the right moment (when it's completely unpacked).
- the right moment is when you see "normal behavior"
    - check system calls, e.g. using `strace`
- you can dump memory in gdb: `dump binary memory dump_name start end`

## Analysing a binary
### Static
`file`: determine file type

`readelf`: display information about the contents of ELF files
- `-h`: file header
- `-l`: program headers
- `-S`: section headers
- `-s`: symbol table

`ldd`: print shared library dependencies

`nm`: list symbols from object files
- `-D`: list dynamic symbols

`strings`: print sequences of printable characters found in a file

### Dynamic methods
`/proc/<pid>`: general information about the process with `<pid>`
- `/cmdline`: command line
- `/environ`: environment
- `/maps`: memory map

`strace`: traces system calls performed by a process
- can also follow child processes, show signals, decode syscall arguments
- `-i`: print instruction pointer at time of syscall

`ltrace`: traces dynamically linked library calls
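
The `start` and `end` addresses for gdb's `dump binary memory` can be read from `/proc/<pid>/maps`. The sketch below parses that file's line format (`start-end perms offset dev inode [pathname]`) and keeps the executable regions. The helper names and the sample text are hypothetical; the sample stands in for a real process so the code runs anywhere.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/<pid>/maps into a Mapping."""
    # Fields: start-end perms offset dev inode [pathname]
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    # Anonymous mappings have no pathname field.
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)


def executable_regions(maps_text: str) -> List[Mapping]:
    """Return the mappings whose permission string includes 'x'."""
    mappings = [parse_maps_line(l) for l in maps_text.splitlines() if l.strip()]
    return [m for m in mappings if "x" in m.perms]


# Made-up sample in the /proc/<pid>/maps format.
sample = """\
00400000-00401000 r-xp 00000000 08:01 123456 /tmp/sample
00600000-00601000 rw-p 00000000 08:01 123456 /tmp/sample
7f0000000000-7f0000021000 rwxp 00000000 00:00 0
"""

regions = executable_regions(sample)
for m in regions:
    print(f"{m.start:#x}-{m.end:#x} {m.perms} {m.path or '[anonymous]'}")
```

For a live process, pass the contents of `/proc/<pid>/maps` instead of `sample`. An anonymous region that is both writable and executable may be a candidate for where unpacked code ended up, though that is a heuristic and not a guarantee.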