+++
title = 'Numeric representation of data types'
+++
# Numeric representation of data types
Everything’s stored in binary. Obviously. This is computers.

## Signed Integers
Representing both positive and negative numbers.

The leftmost bit (MSB) is the sign bit: 0 for positive, 1 for negative.

Systems:

- Sign-and-magnitude
    - negative values are represented by setting the MSB to 1; the remaining bits hold the magnitude
    - two representations for 0: ±0
- 1’s-complement
    - negative values are the bitwise complement of the positive value
    - for n bits, equivalent to subtracting the number from 2ⁿ − 1
    - two representations for 0: ±0
- 2’s-complement
    - take the 1’s-complement, then add 1
    - in other words: for n bits, subtract the number from 2ⁿ
    - one representation for 0
    - can represent -8 in 4 bits

[Arithmetic operations with signed integers.](../addition-subtraction-with-signed-integers)

[How to design an actual circuit for this shit.](../addition-subtraction-logic-unit)

[Multiplication of signed integers](../multiplication-of-signed-integers).

Division is a pain in the ass, exactly the same as decimal long division. Just with 1s and 0s.

## Floats

Useful website: [https://float.exposed/](https://float.exposed)

A float in binary: a sign bit for the number, the significant bits, and a signed scale-factor exponent for an implied base of 2.

IEEE standard (32-bit floats): sign bit, 8-bit signed exponent in excess-127, 23-bit mantissa (fractional)

![screenshot.png](screenshot-39.png)

The value stored in the exponent field is an unsigned int E’ = E + 127 (excess-127).
E is the actual signed exponent, E’ is its excess-127 representation.

Why excess-127? In 32 bits, you have 8 bits for the exponent. With 8 bits, you can represent values 0 to 255. But we want really small numbers, so we need negative exponents. So the dudes at IEEE decided to go for -127 to +128. The two ends are reserved: -127 (stored as 0) represents 0, and +128 (stored as 255) represents infinity. So the real range is -126 to +127.
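
To make the excess-127 idea concrete, here’s a minimal Python sketch that pulls the three fields out of a 32-bit float. The helper name `float_fields` is made up for these notes; `struct` is from the standard library and is only used to get at the raw bits.

```python
import struct

def float_fields(x):
    """Return (sign, stored exponent E', mantissa) of x as a 32-bit IEEE float.

    Hypothetical helper, not part of any library.
    """
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                      # 1 bit
    stored_exponent = (bits >> 23) & 0xFF  # 8 bits, unsigned, 0..255 (this is E')
    mantissa = bits & 0x7FFFFF             # 23 fractional bits
    return sign, stored_exponent, mantissa

# -6.5 is -110.1 in binary, i.e. -1.101 x 2^2, so E = 2 and E' = 2 + 127 = 129.
sign, e_stored, mantissa = float_fields(-6.5)
print(sign, e_stored, e_stored - 127)  # sign, E', E
print(f"{mantissa:023b}")              # the fractional bits: 101 followed by zeros
```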
But the value in the exponent field is an unsigned int, from 0 to 255, so the whole thing has to be shifted. Just define 0 to be -127 and you’re done. In other words, if you put a 0 in the exponent field, you’re actually representing -127.

Confusing as shit. Basically if you want to write some exponent, you have to put that value + 127 in the exponent field, in binary.

To convert a decimal number to this format:

- convert the part in front of the decimal point to binary (keep dividing by 2 until the quotient is 0; the remainders, read bottom to top, are the bits)
- convert the part after the decimal point to binary (multiply by 2, the digit left of the point is the next fractional bit, repeat with what’s right of the point)
- normalise it so that it’s of the format “1.M”, note the exponent E
- add 127 to E to form E’
- *M* is the mantissa, E’ is the exponent

The number is normalised if it’s in the form “1.something × 2ⁿ”.

Special values:

- exponent all 0, mantissa all 0: 0
- exponent all 1, mantissa all 0: ±Infinity
- exponent all 0, mantissa not 0: denormalised numbers (implied 0 instead of 1)
- exponent all 1, mantissa not 0: Not a Number

All operations use guard bits to keep accuracy. However, to store the result, you need to remove the guard bits (truncate).

Methods:

- chopping: literally just slice off any extra bits
- von Neumann rounding: if the bits you remove are all 0, you chop them. But if any of them are 1, the LSB of the retained bits is set to 1.
- rounding: 1 is added at the LSB of the retained bits if the MSB of the removed bits is 1. In the tie case (removed bits are exactly 100…0), round to the even value, i.e. the one whose retained LSB is 0.

[Adding/subtracting floating point values](../adding-subtracting-floating-point-values).

[Multiplying/dividing floating point values.](../multiplying-dividing-floats)

## Booleans
- false: 00000000
- true: literally anything else. Often, 1 is used.

## Characters

Common encoding is ASCII. Characters are represented by 7-bit codes. Alphabetic and numeric characters are in increasing sequential order.
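
The sequential ordering is what makes character arithmetic work. A small Python sketch, where `digit_value` and `letter_index` are made-up helper names and `ord`/`chr` are the built-ins that convert between a character and its code:

```python
# Letters and digits occupy consecutive ASCII codes.
print(ord("0"), ord("9"))  # 48 57
print(ord("A"), ord("Z"))  # 65 90
print(ord("a"), ord("z"))  # 97 122

def digit_value(ch):
    """Numeric value of an ASCII digit character (hypothetical helper)."""
    return ord(ch) - ord("0")

def letter_index(ch):
    """0-based position of a lowercase ASCII letter in the alphabet (hypothetical helper)."""
    return ord(ch) - ord("a")

print(digit_value("7"))    # 7
print(letter_index("d"))   # 3
print(chr(ord("a") + 25))  # z
```

Every code above fits in 7 bits, i.e. is below 128.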

Unicode has a large set of international alphabets, with variable-width encoding (in UTF-8: 1-4 bytes, going from ASCII to Latin/Greek/Cyrillic/Coptic to Chinese/Hindi/Tagalog to whatever else).
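
A quick way to see the variable width, assuming UTF-8 is the encoding in question. This is a sketch only: `utf8_width` is a made-up helper and the sample characters are my own picks, written as escapes so the snippet doesn’t depend on the terminal’s encoding.

```python
def utf8_width(ch):
    """Number of bytes the character takes in UTF-8 (hypothetical helper)."""
    return len(ch.encode("utf-8"))

samples = [
    "A",           # ASCII letter
    "\u00e9",      # Latin e with acute accent
    "\u03bb",      # Greek lambda
    "\u0434",      # Cyrillic de
    "\u4e2d",      # Chinese character
    "\u0939",      # Devanagari (Hindi) letter ha
    "\U0001f600",  # emoji, outside the 16-bit range
]

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_width(ch)} byte(s)")
```

The widths come out as 1, 2, 2, 2, 3, 3, 4 bytes.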