commit f0a084e9d4cbe161618dc2d01d299f530efe6190
parent 9ff9872df66ba4b8de062a223dc30340ab5bad37
Author: Alex Balgavy <alex@balgavy.eu>
Date: Thu, 21 Oct 2021 20:43:56 +0200
Update softsec notes
Diffstat:
6 files changed, 268 insertions(+), 0 deletions(-)
diff --git a/content/softsec-notes/_index.md b/content/softsec-notes/_index.md
@@ -13,3 +13,6 @@ title = 'Software Security'
9. [Temporal errors](temporal-errors)
10. [Type confusion (C++)](type-confusion-cpp)
11. [Defenses and bypassing them](defenses-and-bypassing-them)
+12. [Web Security](web-security)
+13. [AEG pipeline](aeg-pipeline)
+14. [Fuzzing](fuzzing)
diff --git a/content/softsec-notes/aeg-pipeline.md b/content/softsec-notes/aeg-pipeline.md
@@ -0,0 +1,14 @@
++++
+title = 'AEG pipeline'
++++
+# AEG pipeline
+AEG: automated exploit generation
+
+Multiple stages:
+1. find bugs: 0-day, n-day
+2. vulnerability specification: what kind? Inputs? depends on something?
+3. primitives generator: read, write, massage...
+4. surface explorer: power of primitive, constraints, reach (which bytes end up in interesting places)
+5. side effects: any bad effects?
+6. compiler for gadgets in binary
+7. resulting exploit
diff --git a/content/softsec-notes/aslr-brop.md b/content/softsec-notes/aslr-brop.md
@@ -0,0 +1,42 @@
++++
+title = 'ASLR BROP'
++++
+# ASLR BROP
+## Fine-grained ASLR
+Randomize even relative addresses
+- shuffle around (parts of) functions.
+- rewrite functions: change registers, replace instructions, add random NOPs
+- only possible at compile time, sharing (e.g. libraries) becomes difficult
+
+Breaking with JIT-ROP:
+- suppose attacker can leak single code pointer
+- then:
+ 1. Recursively
+ - use code pointers to read target code page (this is safe)
+ - identify gadgets on that code page
+ - leak code pointers on that page
+ 2. "Compile" ROP payload on the fly (Just In Time)
+
+Side channel: crash/no-crash
+- requirements: stack vulnerability and knowing how to trigger it, server process that respawns after crash
+
+Blind Return-Oriented Programming (BROP):
+1. Break ASLR
+ - stack reading:
+ - overwrite single byte with value X
+ - no crash: stack had value X
+ - crash: guess X was correct
+ - three types of gadgets:
+ - stop gadget: never crashes (always e.g. hangs)
+ - crash gadgets: always crashes
+ - useful gadget: crash depends on return
+2. Leak binary:
+ - remotely find enough gadgets to call write()
+ - might be a BROP gadget: pop rbx, rbp, r12, r13, r14, r15, ret
+ - at offset 0x7, yields pop rsi
+ - at offset 0x9, yields pop rdi
+ - finding it: a pop gadget will skip a crash gadget. so you can put 6 crash gadgets and a stop gadget on the stack, and the BROP gadget will not crash
+ - `pop rdx; ret` is rare, look instead for strcmp, which sets rdx to length of string
+ - find write and strcmp in PLT -- the jump table to external functions
+
+ - write() binary from memory to network to disassemble and find more gadgets to finish the exploit
diff --git a/content/softsec-notes/crypto.md b/content/softsec-notes/crypto.md
@@ -0,0 +1,163 @@
++++
+title = 'Crypto'
++++
+# Crypto
+Allows secure communication between two or more parties in presence of an attacker
+
+Goals:
+- confidentiality: Eve can't read Alice's message to Bob
+- integrity: Eve can't alter Alice's message without Bob finding out
+- authentication: Eve can't convince Bob she is Alice
+- non-repudiation: Even can't deny she sent a message to Bob
+
+Terms:
+- plaintext: readable text
+- ciphertext: unreadable text actually sent
+- encryption: convert plaintext to ciphertext
+- decryption: convert ciphertext to plaintext
+
+Kerckhoff's principle:
+- separate algorithm fro mkey
+- assume attacker knows algorithm
+- keep key secret
+
+Can't make the plaintext space larger ⇒ use a larger key space
+- instead of shifting by fixed amount, generate random permutation, mapping each letter to an arbitrary other one → 26! possible keys
+
+Attacks:
+- brute force
+- frequency analysis: some letters much more common than others, such as 'e' and 'a' in English
+
+End-to-end encryption: only sender and recipient have key
+
+Cryptographic hashes:
+- hash function maps message (arbitrary length) to hash (fixed length)
+- computationally hard to reverse -- "one-way function"
+- pre-image resistance: hard to find message m such that hash(m)=h
+- second pre-image resistance: given message m1, hard to find another message m2 such that hash(m1)=hash(m2)
+- collision resistance: hard to find any two messages m1 and m2 that have same hash
+
+Well-known hashes:
+- MD5: collision resistance long broken
+- SHA-1: collision resistance recently broken
+- SHA-2: collision resistance probably secure for some time
+- SHA-3: eliminates some theoretical weakness of earlier hashes
+
+Random numbers
+- CPUs are deterministic, so can't generate truly random numbers without special hardware
+- pseudo-random (PRNG): compute from seed, seed updated for every generated number
+- sequence is often predictable
+
+Blockchains:
+- each block of transactions contains hash of previous block
+ - ⇒ can't change past transactions without changing later ones
+- real chain is what above 50% of users (by compute power) agree on
+
+## Symmetric crypto
+Uses single key to encrypt and decrypt.
+
+One-time pad:
+- n-bit message, n-bit key
+- XOR key with message to encrypt/decrypt
+
+Perfect encryption
+- never reuse key
+- key is perfectly random
+- e.g.: one-time pad
+
+Stream ciphers:
+- use PRNG to generate key
+- initial seed is now real key
+
+Block ciphers:
+- divide data in blocks
+- compute using key to map each plain block into cipher block
+- e.g. DES, AES
+
+Cipher block chaining (CBC)
+- ECB still reveals repetitions, allows ordering blocks
+- CBC: make each block depend on previous one
+- first block: use initialization vector (IV) instead
+ - doesn't need to be secret
+ - shouldn't be reused
+
+Padding blocks:
+- message sizes often not multiple of block size
+- must be padded with extra bytes containing padding length
+- add block of padding even if multiple of key size
+
+Padding oracle attack:
+- assumptions: attacker can send encrypted message, server decrypts message and return error if padding incorrect, server uses block cipher with CBC
+- under these conditions, attacker can decrypt last block
+ - take: Dk decryption function with key k, n number of ciphertext blocks, Pi and Ci the plaintext and ciphertext block i
+ - $P_{n-1} = D_{k} (C_{n-1}) \oplus C_{n-2}$
+ - if we modify penultimate block of ciphertext ($C_{n-2}$), we change plaintext that server sees for last block ($P_{n-1}$)
+ - example:
+ 1. Write bytes of ciphertext c0-c15, plaintext p0-p15
+ 2. send c0..c7 ⨁ b...c15, for each b ∈ [0, 256)
+ - because block size 8, c7 is used to compute p15
+ 3. find that only b=0 (unchanged) and b=2 work
+ - so p15 XOR 2 = 1 (length 1 padding)
+ - so p15 = 1 XOR 2 = 3
+ - so plaintext has 3 bytes of padding
+ - so p13 = p14 = p15 = 3
+ 4. Now send c0...c4 XOR b, c5 XOR 7, c6 XOR 7, c7 XOR 7...c15
+ - 7 because 3 (plaintext) XOR 4 (new padding length) == 7
+ - c4 used to compute p12, because block size 8
+ 5. Find thatonly b=0x36 works
+ - so p12 XOR 0x36 = 4 (legnth 4 padding)
+ - so p12 = 0x36 XOR 4 = 0x32 = ASCII "2"
+ 6. Now send c0...c3 XOR b, c4 XOR 0x37, c5 XOR 6, c6 XOR 6, c7 XOR 6...c15
+ - 6 because 3 (plaintext) XOR 5 (padding) == 6
+ - 0x37 because 0x32 (plaintext) XOR 5 (padding) == 0x37
+ - c3 used to compute p11 (because block size 8)
+ - only b=0x31 works
+ - so p11 XOR 0x31 = 5 (length 5 padding)
+ - so p11 = 0x31 XOR 5 = 0x34 = ASCII "4"
+ 7. etc.
+
+Symmetric signatures/message authentication code
+- if message altered, no longer valid
+- key needed to generate valid
+- same key needed to validate
+
+
+## Asymmetric crypto
+Two keys:
+- anyone can encrypt with Bob's public key
+- only Bob knows private key, needed to decrypt
+
+Example is RSA
+
+## Key management
+Cannot trust Bob to tell Alice public key online:
+- either provide public key through separate trusted channel
+- or get trusted third party to authenticate keys
+
+Key distribution center (KDC):
+- trusted party
+- shares symmetric keys with all others
+- verifies identities of key owners
+
+Certification authority (CA)
+- asymmetric crypto alternative to KDC
+- issues certificates to verify identities
+- keeps revocation list of certificates that may be compromised
+- we need to trust these companies
+- X5.09 certificates contain identity, domain name(s), public key, expiration date, signature from CA
+
+Symmetric encryption key establishment
+- agree on key e.g. via Diffie-Hellman Key Exchange
+
+SSL and HTTPS
+- SSL (also known as TLS) is layer on top of TCP
+- any TCP protocol can be modified to run on top of SSL (e.g. HTTP ⇒ HTTPS)
+- end-to-end encryption: Diffie-Hellman to create shared encryption key
+- authenticate server using X5.09 certificate
+- typically does not authenticate client (server can do this in application layer)
+
+Password storage
+- hash the password
+- to make brute forcing harder:
+ - salt the hash (concatenate password and salt before hashing, store salt with the hash)
+ - use a slow hash, which slows down brute forcing but not normal use
diff --git a/content/softsec-notes/fuzzing/index.md b/content/softsec-notes/fuzzing/index.md
@@ -0,0 +1,46 @@
++++
+title = 'Fuzzing'
++++
+# Fuzzing
+Looking for bugs: code review, static analysis, testing
+- these are hard to scale
+
+Fuzzing/fuzz testing:
+- find vulnerabilities by repeatedly testing program with modified inputs
+- different kinds of fuzzers
+
+Different types of fuzzers:
+- input:
+ - mutational
+ - mutate seed inputs to create new test inputs
+ - common mutations:
+ - bit flip: flip nth bit of input (e.g. AFL fuzzer)
+ - byte flip: flip nth byte of input
+ - arithmetic: add n, subtract n, multiply by n, etc.
+ - insert/delete n bytes in input
+ - cross-over: combine two inputs (e.g. Vuzzer)
+ - ideally want to select fitness: discard unsuccessful inputs
+ - fitness e.g. does input execute new code, does input get us close to some target code
+ - problem: many invalid inputs because lack of input format
+ - generative (know full input grammar)
+ - learn/create format/model on input, use it to generate new inputs
+ - problem: hard to get right input format, might miss bugs related to invalid inputs
+- strategy: feedback, no feedback
+- app transparency: black-box, grey-box, white-box
+ - black-box: only interface known. difficult to access effectiveness
+ - white-box: all is known. better understanding of effect of input, but slow and harder to scale
+ - grey-box: use some knowledge
+- objective: guided/targeted, coverage-based
+ - coverage-based: the more code we cover, the more bugs we find (e.g. AFL, VUzzer)
+ - target: try to reach specific set of targets (e.g. Parmesan)
+ - hybrid: aim for targets while increasing coverage
+
+Improving traditional fuzzing:
+- apply heuristics to:
+ - mutate better
+ - learn good inputs
+- apply analysis to understand application behavior
+
+VUzzer:
+
+![Vuzzer diagram](vuzzer-diagram.png)
diff --git a/content/softsec-notes/fuzzing/vuzzer-diagram.png b/content/softsec-notes/fuzzing/vuzzer-diagram.png
Binary files differ.