+++
title = 'Lecture 4'
+++

<!-- TODO: finish -->
problem: parallel execution incurs overhead (creation of worker threads, scheduling, waiting at sync barrier). so the overhead must be outweighed by sufficient workload, i.e. a large enough loop body and trip count.

conditional parallelisation uses the if clause: `#pragma omp parallel for if (len >= 1000)`. so you parallelise only above some threshold; below it, the loop runs on a single thread.

loop scheduling determines which iterations execute on which thread, aim is to distribute workload equally
- can use `#pragma omp parallel for schedule(<type> [, <chunk>])`, which selects one of a set of scheduling techniques
- static/block scheduling: loop subdivided into as many chunks as threads with `#pragma omp parallel for schedule(static)`
- static scheduling with chunk size 1 (cyclic): iterations assigned to threads in round-robin fashion with `#pragma omp parallel for schedule(static, 1)`
- static scheduling with chunk size n (block cyclic): chunks of n iterations assigned to threads in round-robin fashion with `#pragma omp parallel for schedule(static, n)`
- dynamic scheduling: loop divided into chunks of n iterations, chunks dynamically assigned to threads on demand with `#pragma omp parallel for schedule(dynamic, n)`
  - requires additional synchronisation, more overhead
  - allows for dynamic load distribution

chunk size selection:
- small means good load balancing, but high sync overhead
- large reduces overhead, but gives poor load balancing

guided scheduling:
- at start, large chunks so overhead is small (initial chunk size is implementation dependent)
- when approaching final barrier, small chunks to balance workload (chunk size decreases exponentially with every assignment)
- chunks dynamically assigned to threads on demand
- `#pragma omp parallel for schedule(guided, <n>)`, where n is the minimum chunk size

runtime scheduling:
- choose scheduling at runtime, e.g. via the `OMP_SCHEDULE` environment variable
- `#pragma omp parallel for schedule(runtime)`

how do you choose a scheduling technique?
- depends on code
- is the amount of computational work per iteration roughly the same for each iteration?
  - static is preferable if yes
  - block cyclic scheduling may be useful for regular uneven workload distributions
  - dynamic is preferable for irregular workloads, guided is usually superior
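
a minimal sketch of the clauses above applied to one loop, so they can be compared side by side. the `work` function, the function names, the threshold of 1000 and the chunk sizes are all made up for illustration; `work(i)` gets more expensive as `i` grows, i.e. a regular but uneven workload. each variant uses an integer `reduction`, so every schedule should give the same sum as the sequential version.

```c
#include <assert.h>

/* Hypothetical workload: iteration i performs i + 1 inner steps, so later
   iterations cost more than earlier ones (regular, uneven distribution). */
static long work(long i)
{
    long acc = 0;
    for (long k = 0; k <= i; k++)
        acc += k % 7;
    return acc;
}

/* Sequential reference version. */
long sum_serial(long n)
{
    long total = 0;
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Block-cyclic static scheduling: chunks of `chunk` iterations are dealt out
   round-robin. The if clause keeps short loops on a single thread
   (threshold of 1000 is an assumed value, not a recommendation). */
long sum_static_cyclic(long n, int chunk)
{
    assert(chunk > 0);
    long total = 0;
    #pragma omp parallel for if (n >= 1000) schedule(static, chunk) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Dynamic scheduling: idle threads fetch the next chunk on demand. */
long sum_dynamic(long n, int chunk)
{
    assert(chunk > 0);
    long total = 0;
    #pragma omp parallel for if (n >= 1000) schedule(dynamic, chunk) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Guided scheduling: large chunks first, shrinking towards the barrier. */
long sum_guided(long n)
{
    long total = 0;
    #pragma omp parallel for if (n >= 1000) schedule(guided) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Runtime scheduling: the schedule is taken from OMP_SCHEDULE at run time. */
long sum_runtime(long n)
{
    long total = 0;
    #pragma omp parallel for if (n >= 1000) schedule(runtime) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}
```

to try it, call these from a `main` and compile with OpenMP enabled (e.g. `-fopenmp` for GCC or Clang); without that flag the pragmas are ignored and every variant runs sequentially. timing the variants, and running `sum_runtime` with something like `OMP_SCHEDULE="dynamic,16"`, is one way to see which schedule suits a given loop.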