index.md (7321B)
1 +++ 2 title = 'Introduction' 3 +++ 4 5 # Intro 6 ## Distributed vs uniprocessor 7 - Lack of knowledge on global state: process has no up-to-date knowledge on states of other processes 8 - lack of global time frame 9 - nondeterminism 10 11 ## Communication in distributed system 12 Main paradigms: 13 - message passing 14 - shared memory 15 16 Asynchronous communication: sending and receiving are independent events (synchronous is the opposite) 17 18 Communication protocol detects and corrects errors during message passing 19 20 Assumptions we make: 21 - strongly connected network (every node/process can reach any other node) 22 - each process knows only its neighbors (only local knowledge) 23 - message passing for communication 24 - asynchronous communication 25 - channels can be non-FIFO (messages can overtake each other) 26 - channels don't lose, duplicate, or mess up messages 27 - delay of messages in channels is arbitrary, but not infinite 28 - stable network and processes don't crash 29 - processes have unique IDs 30 31 Directed vs bidirectional channels 32 - directed: one way 33 - bidirectional: messages can go both ways 34 35 complexity measures: 36 - message complexity: total num messages exchanged 37 - bit complexity: total num bits exchanged 38 - time complexity: amount of time consumed (assume even processing takes no time, message received max one time unit after it's sent) 39 - space complexity: amount of memory needed for the processes 40 - we consider worst- and average-case complexity 41 42 Big O notation 43 - complexity measures show how resource consumption grows _in relation to input size_ 44 - an algorithm with worst-case message complexity (n²) for input size n takes max in order of n² messages 45 - "in the order of": give or take some constant 46 - f = O(g) if for some C > 0, f(n) ≤ C·g(n) for all n ∈ ℕ 47 - f = Θ(g) if f = O(g) and g = O(f) (both upper and lower bound) 48 49 Transition systems 50 - global state of distributed system is a "configuration" 51 - configuration evolves in discrete steps called "transitions" 52 - transition system: 53 - set C of configurations 54 - binary transition relation → on C 55 - and set I ⊆ C of initial configurations 56 - A state is terminal if there are no transitions from it 57 58 Execution: sequence γ₀ γ₁ γ₂... of configurations that 59 - either is infinite, 60 - or ends in terminal configuration such that 61 - γ₀ ∈ L and 62 - γⱼ → γⱼ₊₁ for j = 0,1,2,... 63 - configuration is reachable if there is a sequence of transitions from an initial state to it 64 65 States and events 66 - configuration of distributed system: states at its processes & messages in its channels 67 - transition associated to event (or two events if synchronous) at one (or two) of its processes 68 - events: internal, send, receive 69 - initiator process: if its first even is internal or send 70 - algorithm is centralised if only one initiator 71 - decentralized if more than one 72 73 Assertion: 74 - predicate on configurations of an algorithm 75 - safety property is always true in each configuration ("something bad will never happen") 76 - liveness property is true in some configuration of each execution ("something good will eventually happen") 77 78 Invariants: 79 - assertion P on configurations is invariant if: 80 - holds for all initial states 81 - if holds in states on both sides of transitions 82 - each invariant is a safety property 83 84 Causal order 85 - in configuration of async system, applicable events at different processes are independent 86 - causal order (≺ symbol) on occurrences of evens in execution is smallest transitive relation st 87 - in english, if a happens before b, then a ≺ b 88 - full definition: 89 - if events a,b at same process, and a occurs before b, then a ≺ b 90 - if a send and b corresponding receive, then a ≺ b 91 - _irreflexive!_ (you clearly can't have a before b _and_ b before a) 92 93 Computations 94 - if neither a ⪯ b nor b ⪯ a, then a and b are concurrent 95 - permutation of concurrent events in execution doesn't affect result of execution 96 - these permutations form a computation 97 98 ## Clocks 99 Lamport's clock 100 - logical clock C maps occurrences of events in computation to a partially ordered set, such that a ≺ b ⇒ C(a) < C(b) 101 - Lamport's clock LC assigns to each event a the length of k of longest causality chain a₁ ≺ ... ≺ aₓ = a 102 - LC can be computed at runtime: 103 - one list of clock values per process, length of the list is same as number of messages 104 - each message increases the clock value by 1 105 - on a receive, wait until the corresponding send is done, and then the receive has the incremented clock value 106 107 Vector clock 108 - given processes p₀...pₓ₋₁ 109 - each process has a list of vectors (k₀..kₓ₋₁) that are clock values corresponding to processes (e.g. k₀ is first process, k₁ is second process, etc.) 110 - a message increments the process' clock value by 1 111 - a receive message also takes the maximum of the other clock values in the process' vector and the values in the vector of the process with the corresponding send message 112 113 ![Vector clock diagram](vector-clock.png) 114 115 ## Snapshots 116 snapshot of execution of distributed algorithm should return configuration of execution in the same computation 117 118 distinguish: basic messages of underlying distributed algorithm, control messages of snapshot algorithm 119 120 snapshot of basic execution contains 121 - local snapshot of basic state of each process 122 - channel state of basic messages in transit for each channel 123 124 meaningful snapshot: if configuration of execution in same computation as actual execution 125 - for each message m, sender, p, receiver q, must agree whether m is pre- or post-snapshot 126 127 ### Chandy-Lamport algorithm 128 decentralised. 129 assumes directed network with FIFO channels. 130 131 1. Some node takes a snapshot, starts recording on channels, then sends out marker messages across all channels before any other mesages 132 2. When a node receives the marker control message: 133 - if this is the first one it received: 134 - take a snapshot (record its own state) 135 - mark the corresponding channel as empty 136 - start recording on all other channels 137 - send out marker messages on _all_ channels, before any other messages 138 - else: 139 - stop recording the corresponding channel 140 - set that channel's state to all messages received since the snapshot 141 142 [This is a very good explanation](http://composition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/) 143 144 Complexity: 145 - message: Θ(E) (with E number of channels) 146 - worst-case time: (D) (with D the diameter) 147 148 ### Lai-Yang algorithm 149 allows that channels are non-FIFO 150 151 - initiators take local snapshot of their state 152 - when process takes local snapshot, appends 'true' to each outgoing basic message 153 - if process that hasn't yet taken snapshot receives message with 'true', or control message, for the first time, it takes local snapshot before reception of the message 154 - channel state is basic messages without 'true' that receives after its local snapshot 155 - processes count how many basic messages without 'true' they sent/received for each channel 156 - when p takes a snapshot, p sends a control message to q, telling q how many basic messages without 'true' were sent to pq 157 - for multiple snapshots, each snapshot has sequence number and basic message carries sequence number of last snapshot at sender instead of 'true'