lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

multiprocessing.md (4770B)


      1 +++
      2 title = 'Multiprocessing'
      3 +++
      4 # Multiprocessing
      5 ## Process creation
      6 ### Fork
      7 fork: creates new child process by duplicating calling process
      8 
      9 Simple part:
     10 - duplicate task: copy most info, with exceptions (e.g. PID)
     11 - allocate and initialize new kernel stack: setup `thread_info`, copy trap frame and update
     12 - and copy some other stuff
     13 
     14 `copy_mm`:
     15 - duplicate descriptor of address space of parent (`mm` descriptor): copy basic info, initialize empty address space (new page directory)
     16 - duplicate address space: copy VMA info, page tables, and fix up page table entries
     17     - `MAP_SHARED`: share page frames
     18     - `MAP_PRIVATE (R)`: share page frames
     19     - `MAP_PRIVATE (R/W)`: copy-on-write page frames
     20     - shared memory: permissions of VMA are identical to permissions of PTE
     21     - COW: permissions of VMA are RW, permissions of PTE are read-only
     22         - duplicate page frme and remap new private copy into faulting PTE on demand
     23 
     24 COW: copy on write, share the same resource until one copy is written to, at which point you duplicate the resource.
     25 - `MAP_PRIVATE` file pages - dedupes binary pages
     26 - `MAP_PRIVATE` anonymous forked pages - dedupes pages in process hierarchy
     27 - `MAP_PRIVATE` anonymous forked pages - dedupes zero pages for unrelated processes
     28 
     29 ### Exec
     30 Executes program pointed by filename.
     31 
     32 implementation:
     33 - input and permission checking
     34 - load binary headers in memory
     35 - find binary format
     36 - flush old resources:
     37     - reinit `task_struct` and empty `mm`
     38     - flush VMAs, page tables, page frames
     39 - load binary (binary-format specific)
     40     - parse headers and sections
     41     - create corresponding VMAs (data, text, stack, etc.)
     42     - update `%rsp` and `%rip` in trap frame
     43         - `%rsp` is top of user stack
     44         - `%rip`: program entry point for statically linked, dynamic linker's entry point for dynamically linked
     45     - page tables initially empty
     46         - even binary files are demand paged
     47         - PF handler maps them from cache using COW-based strategy
     48 
     49 ## Scheduling
     50 easy version:
     51 - hardware raises periodic timer interrupts
     52 - timer interrupt handler invokes simple scheduler
     53 - simple scheduler
     54     - FIFO scheduling queue
     55     - enqueue interrupted process at tail
     56     - dequeue process at heat and run on CPU
     57 
     58 Time management:
     59 - hardware offers clock and timer circuits
     60 - clock circuits: expose counters incd at given frequency, can be used for time measurements and tracking time of day
     61 - timer circuits: issue periodic interrupts at given frequency, can be used for scheduling and to implement timers
     62 
     63 Linux clock sources:
     64 - tsc: timestamp counter
     65 - hpet: high-precision event timer
     66 - acpi_pm: ACPI power management timer
     67 - real time clock (RTC): persistent (battery-powered), low-precision (seconds)
     68 
     69 Linux clock event devices:
     70 - abstraction of timer circuit
     71 - clock event device programmed to issue interrupts at `CONFIG_HZ` frequency
     72 - both user and kernel preemption possible
     73 
     74 Scheduling per-task building blocks:
     75 - state: running, runnable, sleeping
     76 - quantum or time slice: max number of jiffies a task can run on CPU
     77 - priority
     78 - scheduling policy: fair, real-time, etc.
     79 
     80 Linux O(1) scheduler:
     81 - preemptive round-robin priority scheduler
     82     - preemptive: uses timer interrupts to preempt execution of user processes and maybe reschedule before a process is done (when it runs out of its time-slice/quantum)
     83     - round-robin: goes around all processes
     84     - priority: some processes are more important
     85 - maintains N run queues (1 per priority level)
     86 - strategy:
     87     - find highest priority queue with runnable task
     88     - find first task on that queue and dequeue it
     89     - calculate its time slice size based on priority
     90     - run it
     91     - when time's up, enqueue
     92     - repeat
     93 - improving fairness:
     94     - priorities adjusted based on sleep time
     95     - bonuses for IO vs CPU bound processes
     96 - all operations are O(1)
     97 
     98 Linux CFS scheduler:
     99 - CFS: tasks get a completely fair CPU share
    100 - with N tasks:
    101     - record how much CPU time each task has
    102     - schedule task with biggest delta to `tot_CPU_time/N`
    103         - always schedule the task that has spent least CPU time
    104     - virtualise time to deal with priorities
    105     - increase virtual runtime faster for lower-priority tasks
    106 - no heuristics to distinguish tasks
    107 - uses red-black tree instead of run queues
    108 
    109 ## IPC
    110 communications, sync, signals
    111 
    112 Shared mem:
    113 - System V: get/create shmem segment by key, attach/detach
    114 
    115 semaphores:
    116 - counter shared across processes
    117 - atomic wait (decrement), release (increment) if val >= -sem\_op
    118 
    119 message queue:
    120 - mailbox with senders and receivers
    121 - get/create queue by key & permission
    122 - block if queue full (on send) or empty (on receive)
    123 
    124 POSIX: uses names instead of keys, thread safety, ref counting, shmem is file oriented