lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

commit e7aae3e65086a9736ffba47516ebd3d4dcbb2db2
parent 93d1647226c45729fb98ea81ffbb1f1b5d7be614
Author: Alex Balgavy <alex@balgavy.eu>
Date:   Mon,  1 Feb 2021 16:53:03 +0100

Lectures notes for multicore programming

Diffstat:
Mcontent/_index.md | 1+
Acontent/programming-multi-core-and-many-core-systems/_index.md | 5+++++
Acontent/programming-multi-core-and-many-core-systems/lecture-1.md | 62++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/content/_index.md b/content/_index.md @@ -9,6 +9,7 @@ title = "Alex's university course notes" * [Computer & Network Security](computer-network-security/) * [Logical Verification](logical-verification/) * [Software Architecture](software-architecture/) +* [Programming Multi-Core and Many-Core Systems](programming-multi-core-and-many-core-systems) # BSc Computer Science (VU Amsterdam) --- diff --git a/content/programming-multi-core-and-many-core-systems/_index.md b/content/programming-multi-core-and-many-core-systems/_index.md @@ -0,0 +1,5 @@ ++++ +title = "Programming Multi-Core and Many-Core Systems" ++++ +# Table of contents +1. [Lecture 1](lecture-1) diff --git a/content/programming-multi-core-and-many-core-systems/lecture-1.md b/content/programming-multi-core-and-many-core-systems/lecture-1.md @@ -0,0 +1,62 @@ ++++ +title = 'Lecture 1' +template = 'page-math.html' ++++ + +Programming shared address space concurrent systems: +- App-facing high-level approach: almost ordinary sequential program, add compiler directives to parallelise + - structured parallelism, limited choice +- System-facing: explicitly multithreaded +- SIMD parallelism and vectorisation: single instruction, multiple data + - extra long (vector) registers + - store multiple values at once, apply same instruction to all values in vector + - can be automatically vectorised by compiler, or use compiler intrinsics manually + +Concurrency: tasks A and B are concurrent iff they can be performed independently of each other, yielding identical results. + +Parallelisation patterns: +- decomposition: + 1. partition problem into concurrent subproblems + 2. Solve each sub-problem independently + 3. Combine partial solutions into solution of initial problem +- functional decomposition: + 1. Partition problem into mostly independent functional units + 2. Each functional unit has own implementation + 3. functional units communicate with each other +- pipelining + - sequence of independent tasks + - task receives input data and produces output (fifo buffers) + - tasks are connected via streams (fifo buffers) +- domain (data structure) decomposition + - problem dominated by manipulation of large data structure + - decompose the data structure + - let each process manipulate its own part of data + - recombine data structure +- divide and conquer + - task solved sequentially or recursively split into subtasks + - fan-out defined by algorithm + - subtasks computed independently + - task waits for subtasks to finish + - task combines partial result + - only parent-child communication + +Organisation principle: parallel supersteps +- spatial split into number of subtasks +- fan-out is runtime constant +- temporal split into supersteps +- subtasks independent within each step +- barrier sync between steps +- communication only between steps + +How measure parallel performance? +- speedup: $S_{n} = \frac{T_{1}}{T_{n}}$ +- T₁: exec time using 1 core +- $𝑇_{𝑛}$: exec time using n cores + +Parallel program dev process: +1. formulate problem +2. write symbolic math specification +3. write sequential code +4. parallel code + +Parallelisation does not affect asymptotic complexity, only adds a constant factor