Update PMMS notes - lectures.alex.balgavy.eu - Lecture notes from university.

commit e5bfadbaa5b6117a488df95cdbb0abecac410c84
parent 30921a85b7cb68945d167df666f1443a10141a7a
Author: Alex Balgavy <alex@balgavy.eu>
Date:   Thu, 18 Feb 2021 21:36:51 +0100

Update PMMS notes

Diffstat:
M content/programming-multi-core-and-many-core-systems/_index.md  | 1 +
A content/programming-multi-core-and-many-core-systems/lecture-5.md  | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

2 files changed, 61 insertions(+), 0 deletions(-)
diff --git a/content/programming-multi-core-and-many-core-systems/_index.md b/content/programming-multi-core-and-many-core-systems/_index.md
@@ -6,3 +6,4 @@ title = "Programming Multi-Core and Many-Core Systems"
 2. [Lecture 2](lecture-2)
 3. [Lecture 3](lecture-3)
 4. [Lecture 4](lecture-4)
+5. [Lecture 5](lecture-5)
diff --git a/content/programming-multi-core-and-many-core-systems/lecture-5.md b/content/programming-multi-core-and-many-core-systems/lecture-5.md
@@ -0,0 +1,60 @@
++++
+title = 'Lecture 5'
++++
+## Finishing OpenMP
+Controlling thread affinity
+
+OpenMP thread binding:
+- bind thread to place (core)
+- once bound, thread does not migrate
+
+OpenMP parallel proc_bind clause:
+- `#pragma omp parallel proc_bind(master)`: whole team runs on same place as master
+- `#pragma omp parallell proc_bind(close)`: whole team runs on place close to parent (`spread` is the opposite)
+
+places:
+- basic computing resource, usually hardware thread
+- describe hierarchic system architecture to OpenMP
+- env variable `OMP_PLACES`: "{0,1,2,3},{4,5,6,7},{8:4},{12:4}"
+    - sets 4 places with 4 execution units
+- the actual effect is implementation-dependent
+
+busy-wait vs suspension:
+- problem: threads wait all the time, at synchronisation barriers and critical sections
+- busy wait: threads actively poll for some memory address, advantage is threads can quickly proceed when possible, but they waste resources while waiting
+- suspension: threads de-schedule to waiting queue, so they don't occupy resources, but they have to wake up to be used
+- use variable `OMP_WAIT_POLICY`, `ACTIVE` for mostly busy wait, `PASSIVE` for mostly suspension (but not every compiler might know)
+
+Atomic operations:
+- atomic operations can be mapped to hardware features, but can also be mapped to critical sections
+- no guarantee on operational behaviour
+- can still be expensive due to data transfer
+
+So, OpenMP:
+- simpler to use that system-facing approaches
+- responsible for most organisational aspects
+- facilitates experimentation with alternatives
+- false directives can lead to wrong or nondeterministic behaviour
+    - easy to write, hard to find
+- insight into impact is reduced
+
+
+---
+
+## POSIX Threads
+POSIX threads are low-level, system-facing.
+Library-based (`pthread.h`), don't need a compiler that understands special directives.
+
+Thread creation:
+- only one thread at program startup
+- all threads created dynamically
+- threads may terminate before process
+- process terminates when all threads terminate
+- any thread can wait for termination of any other thread ("joining")
+    - terminated threads are zombies until joined
+- functions: `pthread_create`, `pthread_join`
+- joinable can be joined, detached cannot be joined
+
+Two classes of threads:
+- library threads: OS unaware, whole process blocked when waiting for OS service, implementation can be more efficient
+ OS/kernel threads: scheduled by OS together with other processes, pre-emptive multithreading, allows overlapping computation and waiting

	lectures.alex.balgavy.eu Lecture notes from university.
	git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
	Log \| Files \| Refs \| Submodules

M	content/programming-multi-core-and-many-core-systems/_index.md	\|	1	+
A	content/programming-multi-core-and-many-core-systems/lecture-5.md	\|	60	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++