index.md (1408B)
1 +++ 2 title = "Huffman codes" 3 +++ 4 5 # Huffman codes 6 7 lossless encoding of characters, the ones that occur more frequently have a shorter encoding 8 9 e.g. encode “mississippi” 10 11 to generate a tree: 12 1. Build a table of values with frequencies: 13 14 <table> 15 <tr><td> m </td><td> i </td><td> s </td><td> p </td></tr> 16 <tr><td> 1 </td><td> 4 </td><td> 4 </td><td> 2 </td></tr> 17 </table> 18 19 1. Repeat until tree is complete: take two smallest nodes from list, create a new node that is sum of two smallest, add the two smallest as children (right > left), then add the parent node back into the list. 20 21 ![screenshot.png](b135949e77632135c4030b62d87c7b90.png) ![](f2219781d767bfb5463d415c5c4db2c5.png) ![](17f4bc951d880b7d2ef5d08a4480d254.png) 22 ![](a5ed3f890b17603f71c18a197523fbae.png) ![](c9c5e43a5e315bd28e089e69355a204f.png) 23 24 1. Assign weights to left-right branches — left is 0, right is 1. 25 26 ![](450eb15212318b83ae0db5f6ef700e4e.png) 27 28 1. Traverse tree to generate a code for each character: 29 30 <table> 31 <th>letter</th><th>code</th> 32 <tr><td> a </td><td> 1100 </td></tr> 33 <tr><td> b </td><td> 1101 </td></tr> 34 <tr><td> c </td><td> 100 </td></tr> 35 <tr><td> d </td><td> 101 </td></tr> 36 <tr><td> e </td><td> 111 </td></tr> 37 <tr><td> f </td><td> 0 </td></tr> 38 </table> 39 40 ## Complexity 41 with a min-heap-based min-priority queue, in O(nlogn)