hashing.md (1823B)
1 +++ 2 title = "Hashing" 3 template = "page-math.html" 4 +++ 5 6 # Hashing 7 8 - hash tables are good for dictionaries 9 - keys starting at 0 10 - direct address table: 11 12 $\text{keys} \in {0,…,m-1}$ 13 14 $T[k] = \begin{cases} \text{nil} & \text{if no item with key k}\\\\ 15 \text{pointer to element with key k} & \text{otherwise}\end{cases}$ 16 17 - insert: set value at key 18 - delete: set value to nil 19 - drawbacks: lots of memory, keys must be int 20 - identity function key —> index 21 - for other purposes, change the mapping function 22 - simple example: 23 - keys are first names, value phone number 24 - hash function is name length mod 5 25 - choosing a hash function 26 - division: k ➝ (k mod m) 27 - easy but not good for all values 28 - so better to take prime not close to power of 2 29 - multiplication 30 - problem of collision: different keys hashed to same slot 31 - for p items, hash table size m, so mp possibilities for hash function 32 - $\frac{m!}{(m-p)!}$ possibilities for hashing without collision 33 - solving collisions 34 - chaining: put items that hash to same value in a linked list 35 - insertion in O(1) 36 - deletion in O(1) for doubly-linked (you wouldn’t use singly linked) 37 - searching in O(n) with n dictionary size, average in ϴ(1+α) 38 - α is load factor 39 - for n keys, m slots, 40 41 $\alpha = \frac{n}{m}$ 42 43 - open addressing 44 - make probe sequence: h: U × { 0…m-1} ➝ {0…m-1} 45 - linear probing 46 - for next probe, next address mod m 47 - h(k,i) = h’(k) + i mod m 48 - keep increasing i until empty place found 49 - but you get clustering, and removal hurts on a physical level 50 - double hashing, use two functions when the first function gives a collision