commit dd31f0b32bc7c550a195f94815e55674380614cd
parent eeb0c56f7066780433945829620fd8d1cf425a61
Author: Alex Balgavy <alex@balgavy.eu>
Date: Sun, 4 Apr 2021 13:36:44 +0200
Update BAMA notes
Diffstat:
2 files changed, 109 insertions(+), 0 deletions(-)
diff --git a/content/binary-malware-analysis-notes/_index.md b/content/binary-malware-analysis-notes/_index.md
@@ -9,3 +9,4 @@ title = 'Binary and Malware Analysis'
4. [GDB](gdb)
5. [Anti-analysis](anti-analysis)
6. [Disassembly tools](disassembly-tools)
+7. [Packers](packers)
diff --git a/content/binary-malware-analysis-notes/packers.md b/content/binary-malware-analysis-notes/packers.md
@@ -0,0 +1,108 @@
++++
+title = 'Packers'
++++
+
+# Packers
+## Binary packers
+A packer takes a binary program P and produces a new program that contains an unpacker and a packed version of P.
+- the loader loads the new binary (unpacker), the unpacker unpacks and loads original program
+
+## What's a binary?
+A binary is compiled code stored in an executable file format (PE for Windows, ELF for Linux, Mach-O for macOS).
+
+The format
+- defines what the file looks like on disk and in memory
+- contains info about machine to run it on, executable/library, entry point, sections
+
+ELF format:
+- used for executables, libraries, and others, on many architectures and OSes
+- dual nature
+ - view on logical sections: described by section header table (`.data`, `.text`, `.bss`, etc.)
+ - view on structure in memory: what segments are executable and which are read/write (data), how large they are -- described by program header table
+- structure
+ - ELF header at beginning: magic number `7F 45 4C 46`, file type, architecture, entry point, program and section header table offsets, index of the section name string table
+ - program headers divide data in segments, providing easy mapping from data to memory
+ - array of structures for type of segment, position in ELF file, address in memory, physical address, size on disk, size in memory, flags for r/w/x, alignment in memory
+ - section headers define sections
+ - one entry for each section: name (as an index into the string table), what kind of info it has, flags for write/alloc/exec, base address in memory, location in ELF file, some other info
+- ELF program headers have everything that the kernel needs to load the file
+- sections:
+ - examples:
+ - `.text`: code
+ - `.data`: initialised data
+ - `.bss`: uninitialized data
+ - `.got`/`.plt`: for dynamic linking
+ - `.ctors`/`.dtors`: constructors/destructors
+ - used at link time
+ - do not have predefined structure, but described by section headers that do
+- symbol tables:
+ - SYMTAB: contains all symbols needed to link/debug files, not needed for running
+ - DYNSYM: contains symbols for dynamic linking, loaded in memory at runtime so as small as possible
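+
+As a rough illustration of the header fields listed above, the following sketch unpacks an ELF header with Python's `struct` module. It assumes a 64-bit little-endian ELF; the function name `parse_elf64_header` and the synthetic sample values are made up for this example and do not come from a real binary.
+
```python
import struct

# ELF64 little-endian file header: 16-byte e_ident followed by the fixed fields.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data):
    """Return selected ELF header fields from the first 64 bytes of a file."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short for an ELF64 header")
    if data[:4] != b"\x7fELF":  # magic number 7F 45 4C 46
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the byte order (1 = little-endian).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(data)
    return {
        "type": e_type,            # file type, e.g. 2 = executable
        "machine": e_machine,      # architecture, e.g. 62 = x86-64
        "entry": e_entry,          # entry point address
        "phoff": e_phoff,          # program header table offset
        "phentsize": e_phentsize,
        "phnum": e_phnum,
        "shoff": e_shoff,          # section header table offset
        "shentsize": e_shentsize,
        "shnum": e_shnum,
        "shstrndx": e_shstrndx,    # index of the section name string table
    }


# Synthetic header for demonstration only (hypothetical values).
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
sample = ELF64_HEADER.pack(ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0)
header = parse_elf64_header(sample)
print(hex(header["entry"]), header["phnum"])
```
+
+On a real file, pass the first 64 bytes (`open(path, "rb").read(64)`) and compare the result with the output of `readelf -h`.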
+
+### Stripped binaries
+The symbol table (SYMTAB) can be removed with `strip -s <program>`
+- the dynamic symbol table (DYNSYM) has to be preserved for functions imported from shared libraries
+- all other names of functions and variables are gone
+
+### Functions and global symbols
+Address of global symbols imported from external libraries computed when binary loaded in memory
+- either load-time relocation or position-independent code (PIC: code freely relocatable, adds a level of indirection via the Global Offset Table and Procedure Linkage Table)
+- every time code has to reference global symbol, uses Global Offset Table (GOT, `.got`) in data section
+- at runtime, GOT entries modified by dynamic linker to point to intended data
+
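+If code needs to call a function in a different module, the linker creates an array of read-only jump stubs: the Procedure Linkage Table (PLT, `.plt`)
+- stubs use GOT entries to call the right function
+- lazy binding: `.got.plt` entries initially point back into the `.plt` stub, which calls the dynamic linker's resolver; after the first call the entry holds the function's real address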
+- relocation confined to `.got` and `.got.plt` rather than `.text`
+
+### Process creation in Linux
+- kernel loads segments defined by program headers into memory
+ - if interpreter defined, load it too
+- kernel sets up stack and starts at interpreter's entry point
+ - if no interpreter, use the program's entry point
+
+### ELF auxiliary vectors
+Mechanism to transfer kernel level info to user processes (such as pointer to system call entry point in memory).
+
+ELF Loader:
+- parses ELF file
+- maps various program segments in memory
+- sets up entry point, initializes process stack
+- puts ELF auxiliary vectors on process stack, along with argc, argv, envp
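+
+The auxiliary vector is a sequence of (type, value) pairs terminated by an `AT_NULL` entry. A minimal sketch of decoding it follows, assuming a 64-bit little-endian process, so that each pair is two 8-byte words; `parse_auxv` and the sample bytes are hypothetical, and only a handful of type constants are named.
+
```python
import struct

# A few auxiliary vector types from <elf.h>; unknown types are shown as numbers.
AT_NAMES = {
    0: "AT_NULL", 3: "AT_PHDR", 4: "AT_PHENT", 5: "AT_PHNUM",
    6: "AT_PAGESZ", 7: "AT_BASE", 9: "AT_ENTRY",
    32: "AT_SYSINFO", 33: "AT_SYSINFO_EHDR",
}


def parse_auxv(raw):
    """Decode raw auxv bytes into (name, value) pairs, stopping at AT_NULL."""
    entries = []
    usable = len(raw) - len(raw) % 16  # ignore a trailing partial pair
    for a_type, a_val in struct.iter_unpack("<QQ", raw[:usable]):
        if a_type == 0:  # AT_NULL terminates the vector
            break
        entries.append((AT_NAMES.get(a_type, str(a_type)), a_val))
    return entries


# Synthetic vector for demonstration: page size, entry point, terminator.
sample = struct.pack("<6Q", 6, 4096, 9, 0x401000, 0, 0)
auxv = parse_auxv(sample)
for name, value in auxv:
    print(name, hex(value))
```
+
+On Linux, the same function can be applied to a live process by reading `/proc/<pid>/auxv` in binary mode.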
+
+## Packers
+Packers were initially used for compression, but they are also convenient for malware to evade antivirus detection, and many packers include anti-debugging techniques.
+
+We want to run the malware, let it unpack itself, and dump memory at the right moment (when it's completely unpacked).
+- the right moment is when you have "normal behavior"
+- check system calls, e.g. using `strace`
+- you can dump memory in gdb: `dump binary memory dump_name start end`
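+
+To pick `start` and `end` for the gdb command, one option is to read the process memory map. The sketch below turns the text of `/proc/<pid>/maps` into `dump binary memory` commands. Restricting it to executable regions is an assumption made for this example (the unpacked code has to live in executable memory), and the helper names and output file naming are hypothetical.
+
```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, path)."""
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return int(start_hex, 16), int(end_hex, 16), fields[1], path


def gdb_dump_commands(maps_text, prefix="dump"):
    """Build one gdb 'dump binary memory' command per executable region."""
    commands = []
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, end, perms, _path = parse_maps_line(line)
        if "x" in perms:  # assumption: only executable regions are of interest
            commands.append(
                f"dump binary memory {prefix}_{start:x}.bin {start:#x} {end:#x}"
            )
    return commands


# Synthetic memory map for demonstration (hypothetical addresses and paths).
sample_maps = (
    "00400000-00401000 r--p 00000000 08:01 1234 /tmp/sample\n"
    "00401000-00402000 r-xp 00001000 08:01 1234 /tmp/sample\n"
    "7ffd1000-7ffd3000 rw-p 00000000 00:00 0 [stack]\n"
)
commands = gdb_dump_commands(sample_maps)
print("\n".join(commands))
```
+
+The printed commands can be pasted into a gdb session attached to the process once it shows normal behavior.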
+
+## Analysing a binary
+### Static
+`file`: determine file type
+
+`readelf`: display information about contents of ELF files
+- `-h`: file header
+- `-l`: program headers
+- `-S`: section headers
+- `-s`: symbol table
+
+`ldd`: print shared libraries
+
+`nm`: list symbols from object files
+- `-D`: list dynamic symbols
+
+`strings`: print strings of printable characters
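+
+The idea behind `strings` can be sketched in a few lines: scan the bytes for runs of printable characters of some minimum length. This is a simplified approximation, not the tool's actual implementation; it only considers printable ASCII, and the function name `find_strings` is made up for this example.
+
```python
import re


def find_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Synthetic input for demonstration: two long runs and one that is too short.
sample = b"\x00\x01/bin/sh\x00ab\x00hello world\xff"
found = find_strings(sample)
print(found)
```
+
+For a real file, pass `open(path, "rb").read()` as `data`.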
+
+### Dynamic methods
+`/proc/<pid>`: general information about process with `<pid>`
+- `/cmdline`: command line
+- `/environ`: environment
+- `/maps`: memory map
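+
+The `cmdline` and `environ` files hold NUL-separated entries rather than lines. A minimal sketch of decoding them follows; the helper names `split_nul` and `parse_environ` are hypothetical and the sample bytes are synthetic.
+
```python
def split_nul(raw):
    """Split NUL-separated bytes into a list of strings.

    Empty entries are dropped, so an empty argument would be lost here.
    """
    return [part.decode("utf-8", "replace") for part in raw.split(b"\0") if part]


def parse_environ(raw):
    """Turn the contents of an environ file into a dict of variables."""
    env = {}
    for item in split_nul(raw):
        key, _, value = item.partition("=")
        env[key] = value
    return env


# Synthetic file contents for demonstration.
args = split_nul(b"./sample\0--flag\0")
env = parse_environ(b"HOME=/root\0LD_PRELOAD=\0")
print(args, env)
```
+
+On Linux, read `/proc/<pid>/cmdline` or `/proc/<pid>/environ` in binary mode and pass the bytes to these helpers.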
+
+`strace`: tracks system calls performed by process
+- can also follow child process, show signals, decode syscall arguments
+- `-i`: print instruction pointer at time of syscall
+
+`ltrace`: tracks dynamically linked library calls