+++
title = 'Lecture 4'
+++

<!-- TODO: finish -->
problem: parallel execution incurs overhead (creation of worker threads, scheduling, waiting at the synchronisation barrier), so the overhead must be outweighed by a sufficient workload, i.e. enough work in the loop body and a high enough trip count.

conditional parallelisation uses the `if` clause: `#pragma omp parallel for if (len >= 1000)`. this parallelises the loop only above some threshold, where the workload is large enough to pay for the overhead.
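
a minimal sketch of the `if` clause in C, assuming a hypothetical array-scaling loop (the names `scale`, `a` and `factor` are illustrative; the threshold of 1000 comes from the pragma above):

```c
#include <omp.h>

/* scale an array in parallel, but only when the trip count is large
   enough to outweigh the cost of creating and synchronising threads */
void scale(double *a, int len, double factor) {
    #pragma omp parallel for if (len >= 1000)
    for (int i = 0; i < len; i++)
        a[i] *= factor;
}
```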

loop scheduling determines which iterations execute on which thread; the aim is to distribute the workload equally (see the sketch after this list)
- select one of a set of scheduling techniques with `#pragma omp parallel for schedule(<type> [, <chunk>])`
- static/block scheduling: the loop is subdivided into as many chunks as there are threads, with `#pragma omp parallel for schedule(static)`
- static scheduling with chunk size 1 (cyclic): iterations are assigned to threads in round-robin fashion, with `#pragma omp parallel for schedule(static, 1)`
- static scheduling with chunk size n (block-cyclic), with `#pragma omp parallel for schedule(static, n)`
- dynamic scheduling: the loop is divided into chunks of n iterations, and chunks are dynamically assigned to threads on demand, with `#pragma omp parallel for schedule(dynamic, n)`
    - requires additional synchronisation, so more overhead
    - allows for dynamic load distribution
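
a sketch of the scheduling clauses above applied to one loop in C; `work()` stands in for an arbitrary, hypothetical per-iteration workload:

```c
#include <omp.h>

extern double work(int i);  /* hypothetical per-iteration workload */

void demo(double *out, int n) {
    /* block: one contiguous chunk of n/threads iterations per thread */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) out[i] = work(i);

    /* cyclic: iterations dealt to threads in round-robin fashion */
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < n; i++) out[i] = work(i);

    /* dynamic: chunks of 4 iterations handed to threads on demand */
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < n; i++) out[i] = work(i);
}
```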

chunk size selection:
- small chunks mean good load balancing, but high synchronisation overhead
- large chunks reduce overhead, but give poor load balancing

guided scheduling (see the sketch after this list):
- at the start, large chunks so that overhead stays small (the initial chunk size is implementation-dependent)
- when approaching the final barrier, small chunks to balance the workload (the chunk size decreases exponentially with every assignment)
- chunks are dynamically assigned to threads on demand
- `#pragma omp parallel for schedule(guided, <n>)`
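
as an illustration, guided scheduling on a loop whose per-iteration work grows (the triangular loop and the minimum chunk size of 4 are hypothetical):

```c
#include <omp.h>

/* iteration i does i + 1 units of work, so the workload grows with i;
   guided hands out large chunks of the cheap early iterations (low
   overhead) and small chunks near the end (better load balance) */
void triangular(double **m, int n) {
    #pragma omp parallel for schedule(guided, 4)
    for (int i = 0; i < n; i++)
        for (int j = 0; j <= i; j++)
            m[i][j] *= 2.0;
}
```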

runtime scheduling (see the sketch after this list):
- choose the scheduling technique at run time, via the `OMP_SCHEDULE` environment variable or `omp_set_schedule()`
- `#pragma omp parallel for schedule(runtime)`
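
a small sketch of runtime scheduling; the loop body and the example invocations in the comment are hypothetical:

```c
#include <omp.h>

/* the schedule is decided when the program runs, e.g.
       OMP_SCHEDULE="guided,8"  ./a.out
       OMP_SCHEDULE="dynamic,4" ./a.out */
void scale_all(double *a, int n) {
    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < n; i++)
        a[i] *= 2.0;
}
```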

how do you choose a scheduling technique?
- it depends on the code
- is the amount of computational work per iteration roughly the same for each iteration?
    - if yes, static is preferable
        - block-cyclic scheduling may be useful for regular but uneven workload distributions
    - if not (irregular workloads), dynamic is preferable, and guided is usually superior to plain dynamic