lectures.alex.balgavy.eu

Lecture notes from university.

+++
title = 'Data center networking'
+++

## Data center networking
In data centers, servers are organized in interconnected racks.

Performance metrics:
- bisection width: minimum number of links that must be cut to divide the network into two equal halves
- bisection bandwidth: minimum total bandwidth of the links that must be cut to divide the network into two equal halves
- full bisection bandwidth: one half of the nodes can communicate with the other half at the same time, each at full link rate

Oversubscription: the ratio (worst-case required aggregate bandwidth among end hosts)/(total bisection bandwidth of the topology). A ratio of 1:1 means all hosts can use their full link bandwidth at once.
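
As a rough illustration of the ratio above, the sketch below computes the oversubscription of a single rack. The helper name `oversubscription` and the numbers (40 hosts at 1 Gbps sharing 8 Gbps of uplink capacity) are made up for the example and do not come from the lecture.

```python
from fractions import Fraction


def oversubscription(worst_case_demand_gbps: int, available_gbps: int) -> Fraction:
    """Return demand/capacity as an exact ratio (hypothetical helper)."""
    if worst_case_demand_gbps < 0 or available_gbps <= 0:
        raise ValueError("demand must be >= 0 and capacity must be > 0")
    return Fraction(worst_case_demand_gbps, available_gbps)


# Assumed example: 40 hosts with 1 Gbps links behind 8 Gbps of uplinks.
ratio = oversubscription(40 * 1, 8)
print(f"{ratio.numerator}:{ratio.denominator}")  # prints 5:1

# At 5:1, each host gets 1/5 of its link rate in the worst case.
worst_case_share = 1 / ratio
print(float(worst_case_share))  # prints 0.2
```

Using `Fraction` keeps the ratio exact, so it can be printed in the usual `n:1` form.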

### Fat-tree topology
Example of a 4-port fat-tree topology:

![Fat-tree topology diagram](./fat-tree-4-ports.png)

Allows full bisection bandwidth between core and aggregation switches.

For k-port switches:
- need 5k²/4 switches (k² in the pods plus k²/4 in the core)
- can have k³/4 servers
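
The two formulas can be checked by counting each layer separately. This is a minimal sketch, assuming the standard fat-tree layout of k pods with k/2 edge and k/2 aggregation switches each, and (k/2)² core switches. `fat_tree_size` is a name chosen for the example.

```python
def fat_tree_size(k: int) -> dict[str, int]:
    """Count switches and servers in a fat tree built from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    edge = k * half          # k pods, k/2 edge switches each
    aggregation = k * half   # k pods, k/2 aggregation switches each
    core = half * half       # (k/2)^2 core switches
    return {
        "edge": edge,
        "aggregation": aggregation,
        "core": core,
        "switches": edge + aggregation + core,  # equals 5k^2/4
        "servers": edge * half,                 # k/2 servers per edge switch, equals k^3/4
    }


for k in (4, 48):
    size = fat_tree_size(k)
    print(k, size["switches"], size["servers"])
# prints:
# 4 20 16
# 48 2880 27648
```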

Addressing:
- pod switches: 10.pod.switch.1 (pod, switch ∈ [0, k-1])
- core switches: 10.k.j.i (i and j denote the switch's position in the core grid)
- hosts: 10.pod.switch.id (id is the host's position in its edge switch's subnet)
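
The addressing scheme can be sketched by enumerating every address for a given k. The ranges used here are assumptions based on the original fat-tree paper, not on these notes: i, j ∈ [1, k/2] for core switches, id ∈ [2, k/2+1] for hosts, and hosts attached only to edge switches, numbered 0 to k/2-1 within a pod. `fat_tree_addresses` is a hypothetical helper name.

```python
def fat_tree_addresses(k: int) -> dict[str, list[str]]:
    """Enumerate fat-tree addresses for k-port switches (ranges are assumed)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    # Pod switches: 10.pod.switch.1, with pod and switch in [0, k-1].
    pod_switches = [f"10.{pod}.{switch}.1" for pod in range(k) for switch in range(k)]
    # Core switches: 10.k.j.i, assuming i and j run over [1, k/2].
    core = [f"10.{k}.{j}.{i}" for j in range(1, half + 1) for i in range(1, half + 1)]
    # Hosts: 10.pod.switch.id, assuming edge switches are 0..k/2-1 and id is in [2, k/2+1].
    hosts = [
        f"10.{pod}.{switch}.{host_id}"
        for pod in range(k)
        for switch in range(half)
        for host_id in range(2, half + 2)
    ]
    return {"pod_switches": pod_switches, "core": core, "hosts": hosts}


addresses = fat_tree_addresses(4)
print(addresses["core"])       # ['10.4.1.1', '10.4.1.2', '10.4.2.1', '10.4.2.2']
print(addresses["hosts"][:4])  # ['10.0.0.2', '10.0.0.3', '10.0.1.2', '10.0.1.3']
```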

Forwarding: two-level lookup table
- first level: prefixes, for forwarding traffic within the pod
- second level: suffixes, for forwarding traffic between pods
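
A minimal sketch of the two-level lookup on an aggregation switch, assuming /24 prefixes (one per edge switch in the pod) and suffixes that match on the host id in the last octet. Port numbering (downlinks 0 to k/2-1, uplinks k/2 to k-1) and the function names are assumptions for the example. The original scheme also offsets the uplink choice per switch, which is left out here.

```python
def build_aggregation_tables(k: int, pod: int):
    """Build (prefixes, suffixes) for an aggregation switch in `pod` (simplified)."""
    half = k // 2
    # First level: one /24 prefix per edge switch in this pod -> downlink port.
    prefixes = {(10, pod, edge): edge for edge in range(half)}
    # Second level: host id (last octet) -> uplink port, spreading ids over uplinks.
    suffixes = {host_id: half + (host_id - 2) % half for host_id in range(2, half + 2)}
    return prefixes, suffixes


def lookup(prefixes, suffixes, destination: str) -> int:
    """Return the output port: prefix match first, suffix match as the fallback."""
    octets = tuple(int(part) for part in destination.split("."))
    port = prefixes.get(octets[:3])
    if port is not None:
        return port              # destination is inside this pod
    return suffixes[octets[3]]   # destination is in another pod


prefixes, suffixes = build_aggregation_tables(k=4, pod=2)
print(lookup(prefixes, suffixes, "10.2.1.3"))  # prints 1 (down to edge switch 1)
print(lookup(prefixes, suffixes, "10.0.1.2"))  # prints 2 (uplink chosen by host id 2)
print(lookup(prefixes, suffixes, "10.3.0.3"))  # prints 3 (uplink chosen by host id 3)
```

Matching inter-pod traffic on the host id rather than the destination pod spreads traffic for different hosts over different uplinks.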

Each host-to-host communication has a single static path.

A flow collision happens when multiple flows use the same path and compete for its bandwidth.
Solutions:
- equal-cost multi-path (ECMP): each flow is hashed onto one of the equal-cost paths, so the path per flow is static
- flow scheduling: a centralized scheduler assigns flows to paths
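
ECMP path selection can be sketched as hashing a flow's 5-tuple and using the result to index into the list of equal-cost paths. This is an illustration only: SHA-256 stands in for whatever hash a real switch uses, and the `Flow` alias, `ecmp_path` and the path names are made up for the example.

```python
import hashlib

# Assumed flow key: source IP, destination IP, source port, destination port, protocol.
Flow = tuple[str, str, int, int, str]


def ecmp_path(flow: Flow, paths: list[str]) -> str:
    """Pick one equal-cost path for a flow by hashing its 5-tuple."""
    if not paths:
        raise ValueError("need at least one path")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]


paths = ["core-1", "core-2", "core-3", "core-4"]  # hypothetical path names
flow = ("10.0.0.2", "10.2.1.3", 40000, 80, "tcp")

# The same flow always maps to the same path, so its packets stay in order.
print(ecmp_path(flow, paths) == ecmp_path(flow, paths))  # prints True
```

Because the choice ignores load, two large flows can still hash onto the same path. A centralized flow scheduler addresses that case.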

Challenges & issues:
- must be backward compatible with IP/Ethernet
- complex wiring
- no support for seamless VM migration (moving a VM changes its IP, which would break TCP connections)
- plug-and-play not possible, IPs have to be preassigned

### PortLand
Separate node location (Pseudo MAC, PMAC) from node identifier (host IP).

Fabric manager maintains the IP → PMAC mapping.
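
The IP → PMAC mapping can be sketched as a lookup table kept by the fabric manager. The 48-bit PMAC layout used here (16-bit pod, 8-bit position, 8-bit port, 16-bit VM id) follows the PortLand paper and is not stated in these notes. The `FabricManager` class and its methods are hypothetical names for the example.

```python
from typing import Optional


def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack a location into a PMAC string (assumed layout: pod.position.port.vmid)."""
    fields = (("pod", pod, 16), ("position", position, 8), ("port", port, 8), ("vmid", vmid, 16))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    packed = (pod << 32) | (position << 24) | (port << 16) | vmid
    return ":".join(f"{byte:02x}" for byte in packed.to_bytes(6, "big"))


class FabricManager:
    """Toy fabric manager: keeps the IP -> PMAC table (hypothetical interface)."""

    def __init__(self) -> None:
        self._ip_to_pmac: dict[str, str] = {}

    def register(self, ip: str, pmac: str) -> None:
        self._ip_to_pmac[ip] = pmac  # re-registering an IP updates its location

    def resolve(self, ip: str) -> Optional[str]:
        return self._ip_to_pmac.get(ip)  # None if the host is unknown


manager = FabricManager()
manager.register("10.2.1.3", encode_pmac(pod=2, position=1, port=0, vmid=1))
print(manager.resolve("10.2.1.3"))  # prints 00:02:01:00:00:01
```

Since the host IP never changes, a migrated VM only needs a new `register` call with its new PMAC.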

Switches self-discover their location by exchanging Location Discovery Messages (LDMs).

![PortLand flow diagram](portland-flow-diagram.png)