commit efadfbb8fff4678dc4092369f9a9cd2b1df8f996 parent 904682d678b77515f3ee5f64da317e142cdee41e Author: Alex Balgavy <alex@balgavy.eu> Date: Wed, 9 Jun 2021 18:10:33 +0200 Move OS notes to this site Diffstat:
114 files changed, 1444 insertions(+), 355 deletions(-)
diff --git a/content/_index.md b/content/_index.md @@ -28,7 +28,7 @@ title = "Alex's university course notes" * [Data Structures & Algorithms](dsa-notes/) * [Statistical Methods](stats-notes) -* [Operating Systems](https://thezeroalpha.github.io/os-notes) +* [Operating Systems](os-notes) * [Intelligent Systems](is-notes/) * [Linear Algebra](lin-algebra-notes/) * [Software Design](https://thezeroalpha.github.io/softdesign-notes) diff --git a/content/os-notes/A2 Allocator.html b/content/os-notes/A2 Allocator.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.684619069099426"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-06 18:33:59 +0000"/><meta name="latitude" content="52.33329560521331"/><meta name="longitude" content="4.866571327957702"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-09 12:34:11 +0000"/><title>A2 Allocator</title></head><body><div>= Basic info =</div><div>malloc — allocate memory of given size</div><div>free — deallocate memory</div><div>realloc — extend/shrink previously allocated</div><div>calloc — zero-initialised memory</div><div><br/></div><div>first layer:</div><div>allocator manages virtual memory region (heap)</div><div>if free block is available in heap, block is returned</div><div>sometimes run out of space on heap, need more. need to tell OS that more virtual memory is needed — ask OS for more virtual memory using brk() (‘please extend the heap’). mmap() (‘create new virtual region in the address space’) is a generalised brk(). munmap() is undo for mmap().</div><div>every time you need memory, the mmu will try to translate and get page fault, which is done with page fault handler.</div><div>virtual memory regions are basically contract between OS and application on legal ranges of memory that we can access</div><div><br/></div><div>demand paging:</div><ol><li><div>Program calls brk() to grow its heap</div></li><li><div>brk() enlarges virtual memory address space. new pages are not mapped onto physical memory.</div></li><li><div>Program tries to access new memory, processor page faults.</div></li><li><div>Kernel assigns page frame to process, creates page table entry, and resumes execution. Program lives happily ever after in userland.</div></li></ol><div><br/></div><div><span style="font-weight: bold;">assignment</span></div><div>allocate user space allocator. gets memory from kernel, divides that, etc.</div><div><br/></div><div>void *malloc(size_t size) — allocate size bytes. alignment should be 8 bytes.</div><div>void free(void *ptr) — free object starting at ptr</div><div>void *calloc(size_t nmemb, size_t size) — allocate nmem * size bytes, zeroed out</div><div>void *realloc(void *ptr, size_t size) — grow object starting at ptr to size bytes, optionally moving allocation if needed.</div><div><br/></div><div><br/></div><div><span style="text-decoration: underline;">requesting memory from kernel</span></div><div>nmap: allocate arbitrary virtual memory areas</div><div>brk/sbrk: grow/shrink heap area</div><div><br/></div><div>int brk(void *addr) — set program break (end of heap) to addr (if reasonable)</div><div>void *sbrk(intptr_t increment) — grow/shrink heap by increment bytes. returns old brk address.</div><div><br/></div><div><span style="text-decoration: underline;">getting started</span></div><div>void *mymalloc(size_t size) {</div><div> return sbrk(size);</div><div>}</div><div><br/></div><div>problems:</div><ul><li><div>size unaligned. should be aligned to 8 bytes, so have to add some alignment. minimum of 8 bytes.</div></li><li><div>free function is empty. it will run out of memory. how implement free function?</div></li><ul><li><div>determine size of ptr. how big is the object?</div></li><li><div>mark area as free. then malloc can search free areas first.</div></li></ul></ul><div><br/></div><div><span style="text-decoration: underline;">design options</span></div><div>in-band list metadata</div><div><img src="A2%20Allocator.resources/screenshot.png" height="273" width="545"/><br/></div><div>each node contains metadata (size, next, prev, is_free) and space for the object.</div><div>when freeing, merge with adjacent free areas.</div><div>avoid fragmentation — non-contiguous free areas.</div><div><br/></div><div><span style="text-decoration: underline;">list metadata example</span></div><div>struct obj_metadata {</div><div> size_t size;</div><div> struct obj_metadata *next;</div><div> struct obj_metadata *prev;</div><div> int is_free;</div><div>};</div><div><br/></div><div>don’t use artificial limit on max number of objects. should have pointer to heap_start, and pointer to freelist (both void). dynamically grow. would give penalty.</div><div><br/></div><div><span style="text-decoration: underline;">Grading</span></div><div>1 point — showing up (valid submission)</div><div>1 point — malloc</div><div>0.5 pnt — calloc</div><div>2 point — free + reuse</div><div>1 point — realloc (+only when needed)</div><div><br/></div><div>1 point — batch brk: preallocate e.g. 4K of memory, only brk when 4K is full</div><div>2 point — reduce overhead per allocation (brk size vs malloc’d size)</div><div>0.5 pnt — locality (new allocations use least recently freed slots)</div><div>1 point — return memory to kernel (give negative to sbrk). return reasonable sized (e.g. 4K) chunk at end of heap if became free.</div><div>1 point — out-of-band metadata. e.g. slab allocator. this often excludes other points.</div><div>2 point — system malloc. implement malloc without prefix. use with LD_PRELOAD env var. tests run ls, grep, python.</div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/A2 Allocator.resources/screenshot.png b/content/os-notes/A2 Allocator/e7a83a5bb7b8d2b79b58bb44f1ea936c.png Binary files differ. diff --git a/content/os-notes/A2 Allocator/index.md b/content/os-notes/A2 Allocator/index.md @@ -0,0 +1,93 @@ ++++ +title = 'A2 Allocator' ++++ +# A2 Allocator + +## Basic info +- malloc — allocate memory of given size +- free — deallocate memory +- realloc — extend/shrink previously allocated +- calloc — zero-initialised memory + +first layer: +- allocator manages virtual memory region (heap) +- if free block is available in heap, block is returned + +sometimes run out of space on heap, need more. need to tell OS that more virtual memory is needed — ask OS for more virtual memory using brk() (‘please extend the heap’). mmap() (‘create new virtual region in the address space’) is a generalised brk(). munmap() is undo for mmap(). + +every time you need memory, the mmu will try to translate and get page fault, which is done with page fault handler. + +virtual memory regions are basically contract between OS and application on legal ranges of memory that we can access + +demand paging: +1. Program calls brk() to grow its heap + +2. brk() enlarges virtual memory address space. new pages are not mapped onto physical memory. + +3. Program tries to access new memory, processor page faults. + +4. Kernel assigns page frame to process, creates page table entry, and resumes execution. Program lives happily ever after in userland. + +## Assignment +allocate user space allocator. gets memory from kernel, divides that, etc. + +- `void *malloc(size_t size)` — allocate size bytes. alignment should be 8 bytes. +- `void free(void *ptr)` — free object starting at ptr +- `void *calloc(size_t nmemb, size_t size)` — allocate nmem * size bytes, zeroed out +- `void *realloc(void *ptr, size_t size)` — grow object starting at ptr to size bytes, optionally moving allocation if needed. + +requesting memory from kernel +- nmap: allocate arbitrary virtual memory areas +- brk/sbrk: grow/shrink heap area + - `int brk(void *addr)` — set program break (end of heap) to addr (if reasonable) + - `void *sbrk(intptr_t increment)` — grow/shrink heap by increment bytes. returns old brk address. + +getting started + +```c +void *mymalloc(size_t size) { + return sbrk(size); +} +``` + +problems: + +- size unaligned. should be aligned to 8 bytes, so have to add some alignment. minimum of 8 bytes. +- free function is empty. it will run out of memory. how implement free function? + - determine size of ptr. how big is the object? + - mark area as free. then malloc can search free areas first. + +design options +in-band list metadata +![screenshot.png](e7a83a5bb7b8d2b79b58bb44f1ea936c.png) + +each node contains metadata (size, next, prev, is_free) and space for the object. + +when freeing, merge with adjacent free areas. +avoid fragmentation — non-contiguous free areas. + +list metadata example + +```c +struct obj_metadata { + size_t size; + struct obj_metadata *next; + struct obj_metadata *prev; + int is_free; +}; +``` + +don’t use artificial limit on max number of objects. should have pointer to heap_start, and pointer to freelist (both void). dynamically grow. would give penalty. + +Grading +- 1 point — showing up (valid submission) +- 1 point — malloc +- 0.5 pnt — calloc +- 2 point — free + reuse +- 1 point — realloc (+only when needed) +- 1 point — batch brk: preallocate e.g. 4K of memory, only brk when 4K is full +- 2 point — reduce overhead per allocation (brk size vs malloc’d size) +- 0.5 pnt — locality (new allocations use least recently freed slots) +- 1 point — return memory to kernel (give negative to sbrk). return reasonable sized (e.g. 4K) chunk at end of heap if became free. +- 1 point — out-of-band metadata. e.g. slab allocator. this often excludes other points. +- 2 point — system malloc. implement malloc without prefix. use with LD_PRELOAD env var. tests run ls, grep, python. diff --git a/content/os-notes/Basics of memory: no abstraction.html b/content/os-notes/Basics of memory: no abstraction.html @@ -1,4 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.211421489715576"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-18 23:06:14 +0000"/><meta name="latitude" content="52.30036414031362"/><meta name="longitude" content="4.98811279251146"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-18 23:06:25 +0000"/><title>Basics of memory: no abstraction</title></head><body><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">memory address space: addresses from first byte to last byte. in physical memory, it’s physical memory address space.</span></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">no memory abstraction: monoprogramming</span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">options of arranging memory address space:</span></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><img src="Basics%20of%20memory%3A%20no%20abstraction.resources/screenshot_2.png" height="290" width="587"/></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"> </span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">no memory abstraction: multiprogramming</span></div><ul><li><div>naive approach: move each program to a dedicated region:</div><div><img src="Basics%20of%20memory%3A%20no%20abstraction.resources/screenshot_1.png" height="245" width="268"/></div></li><ul><li><div>problems: -</div></li><ul><li><div>relocation - you have to move to a completely different address.</div></li><ul><li><div>load-time relocation option — as soon as program is loaded, patch all addresses with new ones</div></li></ul><li><div>protection - both programs live together in memory. one program could overwrite the other</div></li><ul><li><div>memory keys option — current memory protection key gets checked against memory protection key of address</div></li></ul></ul></ul></ul><div style="margin-left: 200px;"><img src="Basics%20of%20memory%3A%20no%20abstraction.resources/screenshot.png" height="240" width="282"/></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Basics of memory: no abstraction.resources/screenshot_1.png b/content/os-notes/Basics of memory_ no abstraction/458fb8144ed8fbbd4f8529f5aa8aa32a.png Binary files differ. diff --git a/content/os-notes/Basics of memory: no abstraction.resources/screenshot_2.png b/content/os-notes/Basics of memory_ no abstraction/cc06de8ddf6b063f1bea853621c4d835.png Binary files differ. diff --git a/content/os-notes/Basics of memory: no abstraction.resources/screenshot.png b/content/os-notes/Basics of memory_ no abstraction/d7fc124e438d318abe731701af147149.png Binary files differ. diff --git a/content/os-notes/Basics of memory_ no abstraction/index.md b/content/os-notes/Basics of memory_ no abstraction/index.md @@ -0,0 +1,25 @@ ++++ +title = 'Basics of memory: no abstraction' ++++ +# Basics of memory: no abstraction +memory address space: addresses from first byte to last byte. in physical memory, it’s physical memory address space. + +no memory abstraction: monoprogramming + +options of arranging memory address space: + +![screenshot.png](cc06de8ddf6b063f1bea853621c4d835.png) + +no memory abstraction: multiprogramming + +- naive approach: move each program to a dedicated region: + + ![screenshot.png](458fb8144ed8fbbd4f8529f5aa8aa32a.png) + + - problems: + - relocation - you have to move to a completely different address. + - load-time relocation option — as soon as program is loaded, patch all addresses with new ones + - protection - both programs live together in memory. one program could overwrite the other + - memory keys option — current memory protection key gets checked against memory protection key of address + +![screenshot.png](d7fc124e438d318abe731701af147149.png) diff --git a/content/os-notes/Deadlocks.html b/content/os-notes/Deadlocks.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.222272038459778"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-13 12:59:03 +0000"/><meta name="latitude" content="52.33297299596081"/><meta name="longitude" content="4.864358938126112"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:29:52 +0000"/><title>Deadlocks</title></head><body><div><span style="font-weight: bold;">Deadlocks</span></div><div>A set of processes is deadlocked if each process in the set is waiting for an event that only another process in the set can cause.</div><div><br/></div><div>Starvation: nobody gets resources?</div><div>Livelock: many instructions executed but no progress</div><div><br/></div><div>Conditions:</div><ul><li><div>Mutual exclusion: each resource is assigned to at most one process</div></li><li><div>Hold & wait: processes can request resource when holding another one</div></li><li><div>No preemption: resources are not preemptable, i.e. can’t be taken away from a process</div></li><li><div>Circular wait: chain of at least two processes must be waiting for a resource held by next process in the chain</div></li></ul><div><br/></div><div><img src="Deadlocks.resources/screenshot_1.png" height="371" width="651"/><img src="Deadlocks.resources/screenshot_2.png" height="437" width="531"/><br/></div><div><br/></div><div>Handling:</div><ul><li><div>ignore the problem — no action taken</div></li><li><div>deadlock detection — detect and perform recovery</div></li><li><div>deadlock prevention — prevent any of deadlock conditions</div></li><li><div>deadlock avoidance — allocate resources to avoid deadlocks</div></li></ul><div><br/></div><div><span style="font-weight: bold;">Ignore the problem</span></div><div>become an ostrich and imagine the problem doesn’t exist.</div><div>assumes deadlocks are rare, the cost of handling them is high, and their effects are ok.</div><div>in practice, only last resort.</div><div><br/></div><div><span style="font-weight: bold;">Deadlock detection</span></div><div>can be used when simple & efficient detection mechanisms are available.</div><div><br/></div><div>detection</div><ul><li><div>check for cycles in the resource allocation graph</div></li><li><div>track progress, time out if necessary</div></li><li><div>detect explicitly if e.g. OOM</div></li></ul><div><br/></div><div>recovery</div><ul><li><div>force preemption</div></li><li><div>checkpoint-rollback</div></li><li><div>kill the offending processes</div></li></ul><div><br/></div><div>in practice, solution of choice when adequate detection/recovery mechanisms are available</div><div><br/></div><div><span style="font-weight: bold;">Deadlock prevention</span></div><div>mutual exclusion:</div><ul><li><div>spool everything</div></li><li><div>but this usually shifts problem somewhere else</div></li></ul><div>hold & wait</div><ul><li><div>request all resources initially</div></li><li><div>but this has poor parallelism and resource utilisation</div></li></ul><div>no preemption:</div><ul><li><div>take resources away</div></li><li><div>but not applicable in a lot of cases</div></li></ul><div>circular wait:</div><ul><li><div>order resources numerically</div></li><li><div>but it’s hard to consistently enforce in practice</div></li></ul><div><br/></div><div>in practice, adopted in particular domains (e.g. two-phase locking in transaction processing systems)</div><div><br/></div><div><span style="font-weight: bold;">Deadlock avoidance</span></div><div>single resource: Banker’s algorithm (Dijkstra)</div><ul><li><div>customers (processes) request credits (resources)</div></li><li><div>banker (OS) only satisfies requests resulting in safe states</div></li><li><div>a state is safe if there exists a sequence of other states that allows all customers to complete</div></li><li><div>max credit demands are known in advance</div></li></ul><div><br/></div><div>multiple resources:</div><ul><li><div>generalised safe state detection:</div></li></ul><div style="margin-left: 40px;"><img src="Deadlocks.resources/screenshot.png" height="251" width="556"/><br/></div><ul><ul><li><div>select row R whose unmet resource needs N are all <= A</div></li><li><div>mark R as terminated and add its resources to A vector</div></li><li><div>repeat until completion (safe) or deadlock (unsafe)</div></li></ul></ul><div><br/></div><div>in practice, rarely an option (hard to determine resource needs in advance)</div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Deadlocks.resources/screenshot.png b/content/os-notes/Deadlocks/6d1576bb6afe5fa2751e8663ef0b1e79.png Binary files differ. diff --git a/content/os-notes/Deadlocks.resources/screenshot_1.png b/content/os-notes/Deadlocks/a7241241701e69e8d77915481bf9cc40.png Binary files differ. diff --git a/content/os-notes/Deadlocks.resources/screenshot_2.png b/content/os-notes/Deadlocks/c4a7c1a80885366e5cdc38a9aca3fb38.png Binary files differ. diff --git a/content/os-notes/Deadlocks/index.md b/content/os-notes/Deadlocks/index.md @@ -0,0 +1,92 @@ ++++ +title = 'Deadlocks' ++++ +# Deadlocks + +A set of processes is deadlocked if each process in the set is waiting for an event that only another process in the set can cause. + +Starvation: nobody gets resources? +Livelock: many instructions executed but no progress + +Conditions: + +- Mutual exclusion: each resource is assigned to at most one process +- Hold & wait: processes can request resource when holding another one +- No preemption: resources are not preemptable, i.e. can’t be taken away from a process +- Circular wait: chain of at least two processes must be waiting for a resource held by next process in the chain + +![screenshot.png](a7241241701e69e8d77915481bf9cc40.png)![screenshot.png](c4a7c1a80885366e5cdc38a9aca3fb38.png) + +Handling: + +- ignore the problem — no action taken +- deadlock detection — detect and perform recovery +- deadlock prevention — prevent any of deadlock conditions +- deadlock avoidance — allocate resources to avoid deadlocks + +## Ignore the problem +become an ostrich and imagine the problem doesn’t exist. + +assumes deadlocks are rare, the cost of handling them is high, and their effects are ok. + +in practice, only last resort. + +## Deadlock detection +can be used when simple & efficient detection mechanisms are available. + +detection + +- check for cycles in the resource allocation graph +- track progress, time out if necessary +- detect explicitly if e.g. OOM + +recovery + +- force preemption +- checkpoint-rollback +- kill the offending processes + +in practice, solution of choice when adequate detection/recovery mechanisms are available + +## Deadlock prevention +mutual exclusion: + +- spool everything +- but this usually shifts problem somewhere else + +hold & wait + +- request all resources initially +- but this has poor parallelism and resource utilisation + +no preemption: + +- take resources away +- but not applicable in a lot of cases + +circular wait: + +- order resources numerically +- but it’s hard to consistently enforce in practice + +in practice, adopted in particular domains (e.g. two-phase locking in transaction processing systems) + +## Deadlock avoidance +single resource: Banker’s algorithm (Dijkstra) + +- customers (processes) request credits (resources) +- banker (OS) only satisfies requests resulting in safe states +- a state is safe if there exists a sequence of other states that allows all customers to complete +- max credit demands are known in advance + +multiple resources: + +- generalised safe state detection: + +![screenshot.png](6d1576bb6afe5fa2751e8663ef0b1e79.png) + +- select row R whose unmet resource needs N are all <= A +- mark R as terminated and add its resources to A vector +- repeat until completion (safe) or deadlock (unsafe) + +in practice, rarely an option (hard to determine resource needs in advance) diff --git a/content/os-notes/Design issues.html b/content/os-notes/Design issues.html @@ -1,17 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.208046913146973"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-18 23:51:55 +0000"/><meta name="latitude" content="52.3003418234608"/><meta name="longitude" content="4.988162359396394"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-18 23:54:08 +0000"/><title>Design issues</title></head><body><div><span style="font-weight: bold;">Allocation policies: -</span></div><ul><li><div>local allocation: -</div></li><ul><li><div>replaces only pages of current process</div></li><li><div>assumes static process memory allocation size</div></li><li><div>problems: -</div></li><ul><li><div>growing <span style="font-style: italic;">working set</span> leads to <span style="font-style: italic;">trashing</span></div></li><li><div>shrinking working set leads to wasted memory</div></li></ul></ul><li><div>global allocation: -</div></li><ul><li><div>replaces pages owned by any process</div></li><li><div>strategies: -</div></li><ul><li><div>treat every global page equally, or prioritize</div></li><li><div>dynamic memory balancing using working set size estimation</div></li><li><div>selectively swap out processes, i.e. hurt process efficiency since it's not as important (or OOM kill)</div></li></ul></ul></ul><div><br/></div><div><span style="font-weight: bold;">Working set model: -</span></div><ul><li><div>working set estimation: -</div></li><ul><li><div>aging-based</div></li><li><div>scanning-based: like aging, but without temporal dimension. divide time in discrete slots, scan all reference bits in page table entries, decide how many slots</div></li><li><div>active lists (e.g. WSCLOCK): amount of pages in working set is known, want to decide which set of pages is in working set. active list of n elements (size of working set) which contains active pages</div></li><li><div>applications: memory prepaging, page replacement, checkpoint-restore, live migration</div></li></ul><li><div>working set size estimation: -</div></li><ul><li><div>sampling-based</div></li><li><div>monitoring-based (e.g. page fault freq)</div></li><li><div>Miss Rate Curves (MRC)</div></li><li><div>applications: dynamic memory balancing, garbage collection, WS estimation</div></li></ul></ul><div><br/></div><div><span style="font-weight: bold;">Cleaning policy: -</span></div><ul><li><div>paging works well if there are a lot of free pages</div></li><li><div>direct reclaim: when you have the need to allocate new page, you directly reclaim an existing page</div></li><li><div>indirect reclaim: paging daemon sleeps and then evict pages if free pages are in short supply (don't wait until the last minute)</div></li><li><div>another way is to leave them in memory, easy reuse</div></li></ul><div><br/></div><div><span style="font-weight: bold;">Virtual memory interface: -</span></div><ul><li><div>allocator interface: malloc family, mmap family</div></li><li><div>memory-mapped files -</div></li><ul><li><div>treat files as memory objects for faster IO</div></li><li><div>facilitate code sharing for programs/shared libs</div></li></ul><li><div>copy-on-write semantics -</div></li><ul><li><div>lazy copying for fork(), deduplication, checkpointing, etc.</div></li></ul><li><div>shared memory -</div></li><ul><li><div>MAP_SHARED mappings, key-based, file-based</div></li></ul><li><div>distributed shared memory -</div></li><ul><li><div>shared memory semantics over network</div></li></ul></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Design issues.md b/content/os-notes/Design issues.md @@ -0,0 +1,51 @@ ++++ +title = 'Design issues' ++++ +# Design issues +## Allocation policies: + +- local allocation: + - replaces only pages of current process + - assumes static process memory allocation size + - problems: + - growing *working set* leads to *trashing* + - shrinking working set leads to wasted memory +- global allocation: + - replaces pages owned by any process + - strategies: + - treat every global page equally, or prioritize + - dynamic memory balancing using working set size estimation + - selectively swap out processes, i.e. hurt process efficiency since it's not as important (or OOM kill) + +## Working set model: + +- working set estimation: + - aging-based + - scanning-based: like aging, but without temporal dimension. divide time in discrete slots, scan all reference bits in page table entries, decide how many slots + - active lists (e.g. WSCLOCK): amount of pages in working set is known, want to decide which set of pages is in working set. active list of n elements (size of working set) which contains active pages + - applications: memory prepaging, page replacement, checkpoint-restore, live migration +- working set size estimation: + - sampling-based + - monitoring-based (e.g. page fault freq) + - Miss Rate Curves (MRC) + - applications: dynamic memory balancing, garbage collection, WS estimation + +## Cleaning policy: + +- paging works well if there are a lot of free pages +- direct reclaim: when you have the need to allocate new page, you directly reclaim an existing page +- indirect reclaim: paging daemon sleeps and then evict pages if free pages are in short supply (don't wait until the last minute) +- another way is to leave them in memory, easy reuse + +## Virtual memory interface: + +- allocator interface: malloc family, mmap family +- memory-mapped files + - treat files as memory objects for faster IO + - facilitate code sharing for programs/shared libs +- copy-on-write semantics + - lazy copying for fork(), deduplication, checkpointing, etc. +- shared memory + - MAP_SHARED mappings, key-based, file-based +- distributed shared memory + - shared memory semantics over network diff --git a/content/os-notes/Devices.html b/content/os-notes/Devices.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.200446128845215"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-19 00:27:51 +0000"/><meta name="latitude" content="52.30035400390625"/><meta name="longitude" content="4.988122679323205"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:28:07 +0000"/><title>Devices</title></head><body><div>Solid-State Drive (SSD)<br/></div><ul><li><div>based on DRAM or NAND/NOR flash memory. no mechanical parts.</div></li><li><div>NAND flash memory</div></li><ul><li><div>organisation: cells, pages, blocks</div></li><li><div>controller can read/program by page</div></li><li><div>controller can erase by block</div></li><li><div>random/sequential access latency is comparable, but more efficient at reading than writing.</div></li></ul></ul><div><br/></div><div>Hard Disk (HDD)</div><ul><li><div>platters and cylinders in a disk, as many tracks as heads. each track has N sectors</div></li><li><div>addressing modes:</div></li><ul><li><div>CHS (cylinder-head-sector), virtual/physical</div></li><li><div>LBA (logical block addressing), only virtual</div></li></ul><li><div>in order to transfer data, you have to seek, rotate, and actually transfer. this adds up.</div></li><li><div>so how do you optimise it?</div></li><ul><li><div>aaand we’re back to scheduling. this time we’re scheduling the disk arm.</div></li><ul><li><div>first come first serve: do requests in order of arrival. but that’s not actually an optimisation.</div></li><li><div>shortest seek first: do requests with a shortest seek time. but that has to be calculated.</div></li><li><div>elevator algorithm: go up the disk, then down the disk, then back up the disk…do requests along the way.</div></li></ul></ul><li><div>error handling</div></li><ul><li><div>programming errors (if you’re requesting a nonexistent sector)</div></li><li><div>transient errors (e.g. shit getting on the head of the arm)</div></li><li><div>permanent errors (if a disk block is physically damaged)</div></li><li><div>seek errors (the arm went to the wrong place)</div></li><li><div>controller errors (controller is a lil bitch and stops accepting commands)</div></li><li><div>disk track with bad sector</div></li><ul><li><div>in this case you can remap</div></li><li><div>switch spare for bad sector, or shift all sectors to bypass the bad one</div></li></ul></ul></ul><div><br/></div><div>Clock</div><ul><li><div>Hardware function</div></li><ul><li><div>simple clocks send hardware interrupts every voltage cycle of power supply</div></li><li><div>advanced clocks have own crystal oscillator which decrements a counter in a register. send hardware interrupt every time the counter is 0</div></li></ul><li><div>Software function</div></li><ul><li><div>Maintains time of day (real time) — system boot time (backup cock) + uptime (ticks)</div></li><li><div>Stops processes from running too long — on every tick, decrement current process’ quantum</div></li><li><div>Accounts for CPU usage — on every tick, increment current process’ CPU time</div></li><li><div>Handle alarm syscall — on every tick, decrement alarm counter</div></li><li><div>Give the system watchdog/software timers — generate system notifications for synchronous alarms</div></li><li><div>Profile, monitor, gather statistics — on every tick, increment some set of counters</div></li></ul></ul><div/></body></html>- \ No newline at end of file diff --git a/content/os-notes/Devices.md b/content/os-notes/Devices.md @@ -0,0 +1,47 @@ ++++ +title = 'Devices' ++++ +# Devices +Solid-State Drive (SSD) + +- based on DRAM or NAND/NOR flash memory. no mechanical parts. +- NAND flash memory + - organisation: cells, pages, blocks + - controller can read/program by page + - controller can erase by block + - random/sequential access latency is comparable, but more efficient at reading than writing. + +Hard Disk (HDD) + +- platters and cylinders in a disk, as many tracks as heads. each track has N sectors +- addressing modes: + - CHS (cylinder-head-sector), virtual/physical + - LBA (logical block addressing), only virtual +- in order to transfer data, you have to seek, rotate, and actually transfer. this adds up. +- so how do you optimise it? + - aaand we’re back to scheduling. this time we’re scheduling the disk arm. + - first come first serve: do requests in order of arrival. but that’s not actually an optimisation. + - shortest seek first: do requests with a shortest seek time. but that has to be calculated. + - elevator algorithm: go up the disk, then down the disk, then back up the disk…do requests along the way. +- error handling + - programming errors (if you’re requesting a nonexistent sector) + - transient errors (e.g. shit getting on the head of the arm) + - permanent errors (if a disk block is physically damaged) + - seek errors (the arm went to the wrong place) + - controller errors (controller is a lil bitch and stops accepting commands) + - disk track with bad sector + - in this case you can remap + - switch spare for bad sector, or shift all sectors to bypass the bad one + +Clock + +- Hardware function + - simple clocks send hardware interrupts every voltage cycle of power supply + - advanced clocks have own crystal oscillator which decrements a counter in a register. send hardware interrupt every time the counter is 0 +- Software function + - Maintains time of day (real time) — system boot time (backup cock) + uptime (ticks) + - Stops processes from running too long — on every tick, decrement current process’ quantum + - Accounts for CPU usage — on every tick, increment current process’ CPU time + - Handle alarm syscall — on every tick, decrement alarm counter + - Give the system watchdog/software timers — generate system notifications for synchronous alarms + - Profile, monitor, gather statistics — on every tick, increment some set of counters diff --git a/content/os-notes/File system layout.html b/content/os-notes/File system layout.html @@ -1,14 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.349451065063477"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-19 00:03:10 +0000"/><meta name="latitude" content="52.30032348632812"/><meta name="longitude" content="4.98812990790774"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:03:52 +0000"/><title>File system layout</title></head><body><div style="">on real disk, there are number of partitions (primary, extended, subpartitions). each partition can hold different filesystem<br/></div><div style=""><br/></div><div style=""><img src="File%20system%20layout.resources/B75E558A-A498-407B-A14B-95A64A57BDEE.png" height="235" width="538"/></div><div><br/></div><div>one partition is flagged as 'active' partition</div><div>MBR has boot code to locate active primary partition and execute boot block</div><div><br/></div><div>partitions: primary, extended, and subpartitions</div><div><br/></div><div><span style="font-weight: bold;">how to store files on disk: -</span></div><div>contiguous allocation: -</div><ul><li><div>files as contiguous stream of bytes</div></li><li><div>but prone to fragmentation, requires max file size</div></li></ul><div><br/></div><div>block-based (logically contiguous, but actually stored anywhere on disk): -</div><ul><li><div>linked list -</div></li><ul><li><div>by keeping in-band metadata, end up with strange block size (2 or 4 kb data, with some bytes to hold pointer)</div></li><li><div>good for sequential access patterns, but random...not so much</div></li><li><div>pointers are pointing to blocks on disk (NOT memory)</div></li><li><div><img src="File%20system%20layout.resources/40D5993F-B020-40C1-9824-EBBC03D73BBD.png" height="373" width="515"/></div></li></ul><li><div>file allocation table (FAT) -</div></li><ul><li><div>move in-band metadata out of band</div></li><li><div>the table is stored in kernel memory</div></li><li><div>holds starting points of files</div></li><li><div>every row refers to a block of memory</div></li><li><div>worked well for a while, but for modern sizes it is not possible to hold the table in memory</div></li><li><div><img src="File%20system%20layout.resources/AE00280E-4C0C-4F43-9A0C-D23CFBB31755.png" height="377" width="337"/></div></li></ul><li><div>I-nodes -</div></li><ul><li><div>just a FUCKTON of indirections</div></li><li><div><img src="File%20system%20layout.resources/283F1231-AF87-4602-BECA-6737A8A4275E.png" height="477" width="776"/></div></li></ul></ul><div><br/></div><div><span style="font-weight: bold;">How to implement directories -</span></div><div><img src="File%20system%20layout.resources/46733DBC-ADF2-4E9B-8AA6-CA1D57187394.png" height="189" width="839"/></div><div> <img src="File%20system%20layout.resources/EBD4A225-FAB2-45DD-843E-E8C8CD94786F.png" height="178" width="835"/></div><div><br/></div><div>FAT layout: -</div><ul><li><div>PBR stores "boot block"</div></li><li><div>reserved is later used for filesystem info</div></li><li><div>FAT #: preallocated file allocation table</div></li><li><div>preallocated root dir (FAT12/FAT16)</div></li><li><div>clusters are addressable blocks (FAT has chains indexed by starting cluster)</div></li><li><div><img src="File%20system%20layout.resources/A60F5F9E-2CC1-4B68-8CD0-B08E43BB0F5F.png" height="133" width="848"/></div></li></ul><div>UNIX dir entry: -</div><ul><li><div>dir entry stores only name and inode number</div></li><li><div>attributes are stored in inode</div></li><li><div>one inode per file, first inode is root</div></li><li><div>links: -</div></li><ul><li><div>hard link -- if 10 hard links to a file, then a single file with 10 dir entries pointing to that file (inode)</div></li><li><div>soft (symbolic) link -- if 10 soft links to a file, literally 10 files. can point to anything, but if file gets deleted, the link is dead.</div></li></ul><li><div>where are the inode stored? it depends, as always.</div></li></ul><div><br/></div><div style="margin-left: 40px;"> <img src="File%20system%20layout.resources/F73C52E4-0455-469F-A7F0-2CEAD32AAE54.png" height="286" width="831"/></div><div style="margin-left: 40px;"><img src="File%20system%20layout.resources/6CB2B618-CF63-4A55-AFE2-B295F06E7390.png" height="212" width="428"/></div><div style="margin-left: 40px;"><img src="File%20system%20layout.resources/7826DCF8-7F53-49E1-976F-BCAF52D8674A.png" height="509" width="948"/></div><div><br/></div><div>how to manage disk space -</div><ul><li><div>linked list vs bitmap (linked list actually works really well here)</div></li></ul><div style="margin-left: 40px;"><img src="File%20system%20layout.resources/31DF59DF-579C-4326-ABD5-D5A340CB77E1.png" height="532" width="806"/></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/File system layout.resources/EBD4A225-FAB2-45DD-843E-E8C8CD94786F.png b/content/os-notes/File system layout/0d30522d012c79b891b3120ed9c2f8b3.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/6CB2B618-CF63-4A55-AFE2-B295F06E7390.png b/content/os-notes/File system layout/0db1d8b55f787cdbab5a378ac5038ad0.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/F73C52E4-0455-469F-A7F0-2CEAD32AAE54.png b/content/os-notes/File system layout/1ce3484b2773ce6c1c6283327e880159.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/40D5993F-B020-40C1-9824-EBBC03D73BBD.png b/content/os-notes/File system layout/4cec0a62f74e7cce20c9970d024cf93b.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/31DF59DF-579C-4326-ABD5-D5A340CB77E1.png b/content/os-notes/File system layout/4ea0e1f1384708f15599dc2e8ac8c858.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/A60F5F9E-2CC1-4B68-8CD0-B08E43BB0F5F.png b/content/os-notes/File system layout/555ca3a94b042070cc8360acf834fd98.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/B75E558A-A498-407B-A14B-95A64A57BDEE.png b/content/os-notes/File system layout/680ce86101ebb0f5559e3c485512297d.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/AE00280E-4C0C-4F43-9A0C-D23CFBB31755.png b/content/os-notes/File system layout/9574c3243ae1aec42c16a87811967973.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/283F1231-AF87-4602-BECA-6737A8A4275E.png b/content/os-notes/File system layout/989ec9c9f66e3d39587fe99eefeeb1f6.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/46733DBC-ADF2-4E9B-8AA6-CA1D57187394.png b/content/os-notes/File system layout/d6e774bac420b1585709a558b8aba44f.png Binary files differ. diff --git a/content/os-notes/File system layout.resources/7826DCF8-7F53-49E1-976F-BCAF52D8674A.png b/content/os-notes/File system layout/fc97a31fa9f8ded5eaa7caf1ad4720dc.png Binary files differ. diff --git a/content/os-notes/File system layout/index.md b/content/os-notes/File system layout/index.md @@ -0,0 +1,70 @@ ++++ +title = 'File system layout' ++++ +# File system layout +on real disk, there are number of partitions (primary, extended, subpartitions). each partition can hold different filesystem + +![](680ce86101ebb0f5559e3c485512297d.png) + +one partition is flagged as 'active' partition. +MBR has boot code to locate active primary partition and execute boot block + +partitions: primary, extended, and subpartitions + +## How to store files on disk + +contiguous allocation: + +- files as contiguous stream of bytes +- but prone to fragmentation, requires max file size + +block-based (logically contiguous, but actually stored anywhere on disk): + +- linked list + - by keeping in-band metadata, end up with strange block size (2 or 4 kb data, with some bytes to hold pointer) + - good for sequential access patterns, but random...not so much + - pointers are pointing to blocks on disk (NOT memory) + - ![](4cec0a62f74e7cce20c9970d024cf93b.png) +- file allocation table (FAT) + - move in-band metadata out of band + - the table is stored in kernel memory + - holds starting points of files + - every row refers to a block of memory + - worked well for a while, but for modern sizes it is not possible to hold the table in memory + - ![](9574c3243ae1aec42c16a87811967973.png) +- I-nodes + - just a FUCKTON of indirections + - ![](989ec9c9f66e3d39587fe99eefeeb1f6.png) + +## How to implement directories +![](d6e774bac420b1585709a558b8aba44f.png) + ![](0d30522d012c79b891b3120ed9c2f8b3.png) + +FAT layout: + +- PBR stores "boot block" +- reserved is later used for filesystem info +- FAT #: preallocated file allocation table +- preallocated root dir (FAT12/FAT16) +- clusters are addressable blocks (FAT has chains indexed by starting cluster) +- ![](555ca3a94b042070cc8360acf834fd98.png) + +UNIX dir entry: + +- dir entry stores only name and inode number +- attributes are stored in inode +- one inode per file, first inode is root +- links: + - hard link -- if 10 hard links to a file, then a single file with 10 dir entries pointing to that file (inode) + - soft (symbolic) link -- if 10 soft links to a file, literally 10 files. can point to anything, but if file gets deleted, the link is dead. +- where are the inode stored? it depends, as always. + + ![](1ce3484b2773ce6c1c6283327e880159.png) +![](0db1d8b55f787cdbab5a378ac5038ad0.png) +![](fc97a31fa9f8ded5eaa7caf1ad4720dc.png) + +how to manage disk space + +- linked list vs bitmap (linked list actually works really well here) + +![](4ea0e1f1384708f15599dc2e8ac8c858.png) diff --git a/content/os-notes/Files.html b/content/os-notes/Files.html @@ -1,12 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.349451065063477"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-19 00:02:33 +0000"/><meta name="latitude" content="52.30032348632812"/><meta name="longitude" content="4.98812990790774"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:02:57 +0000"/><title>Files</title></head><body><div>abstract storage 'devices' organised in (typically hierarchical) file system structure</div><div><br/></div><div>file access: -</div><ul><li><div>sequential (in order) vs. random access (e.g. second, then first, then third block)</div></li></ul><div>file types: -</div><ul><li><div>regular files, dirs, soft links</div></li><li><div>special files (e.g. device files, metadata files)</div></li></ul><div>file structure: -</div><ul><li><div>OS' perspective: files as streams of bytes</div></li><li><div>program's perspective: archives, executables, etc.</div></li><li><div>is OS ever aware of the file structure?</div></li></ul><div>file naming: -</div><ul><li><div>different file systems have different limitations/conventions for filenames</div></li><li><div>file extensions</div></li><li><div>file name length: -</div></li><ul><li><div>FAT: 8.3 characters (8 name, 3 extension. later extended to 255)</div></li><li><div>EXT4: 255 characters</div></li></ul><li><div>Special chars in filenames: -</div></li><ul><li><div>FAT: no "*/:<>?\| etc.</div></li><li><div>EXT4: no '\0' and '/' or special names '.' and '..'</div></li></ul><li><div>case sensitivity depends</div></li></ul><div>file attributes: -</div><ul><li><div>there's a whole bunch of them</div></li></ul><div><img src="Files.resources/C8443068-DC83-472E-B42F-1AEDC7CDD460.png" height="454" width="471"/></div><div>file operations: -</div><ul><li><div>create/delete</div></li><li><div>open/close</div></li><li><div>read/write</div></li><li><div>append</div></li><li><div>seek</div></li><li><div>get attributes/set attributes</div></li><li><div>rename</div></li></ul><div><br/></div><div>directories: -</div><ul><li><div>persistent data structures organising and maintaining info about files</div></li><li><div>file attrs stored in-band or out-of-band (both commonly used)</div></li></ul><div><img src="Files.resources/FD883BA4-6FD1-4ECC-B909-761D3A37B214.png" height="222" width="555"/></div><div><img src="Files.resources/350FD665-B09B-4CFA-B09B-0B3B02E7D9E0.png" height="324" width="691"/></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Files.resources/FD883BA4-6FD1-4ECC-B909-761D3A37B214.png b/content/os-notes/Files/3830a931cb0e4b33017ef22e25f71c96.png Binary files differ. diff --git a/content/os-notes/Files.resources/350FD665-B09B-4CFA-B09B-0B3B02E7D9E0.png b/content/os-notes/Files/7e47431605b7501bcca744b7f1330c8b.png Binary files differ. diff --git a/content/os-notes/Files.resources/C8443068-DC83-472E-B42F-1AEDC7CDD460.png b/content/os-notes/Files/e803342509f8f738d741afcea618c4cc.png Binary files differ. diff --git a/content/os-notes/Files/index.md b/content/os-notes/Files/index.md @@ -0,0 +1,55 @@ ++++ +title = 'Files' ++++ +# Files +abstract storage 'devices' organised in (typically hierarchical) file system structure + +file access: + +- sequential (in order) vs. random access (e.g. second, then first, then third block) + +file types: + +- regular files, dirs, soft links +- special files (e.g. device files, metadata files) + +file structure: + +- OS' perspective: files as streams of bytes +- program's perspective: archives, executables, etc. +- is OS ever aware of the file structure? + +file naming: + +- different file systems have different limitations/conventions for filenames +- file extensions +- file name length: + - FAT: 8.3 characters (8 name, 3 extension. later extended to 255) + - EXT4: 255 characters +- Special chars in filenames: + - FAT: no "\*/:<>?\| etc. + - EXT4: no '\0' and '/' or special names '.' and '..' +- case sensitivity depends + +file attributes: + +- there's a whole bunch of them + +![](e803342509f8f738d741afcea618c4cc.png) +file operations: + +- create/delete +- open/close +- read/write +- append +- seek +- get attributes/set attributes +- rename + +directories: + +- persistent data structures organising and maintaining info about files +- file attrs stored in-band or out-of-band (both commonly used) + +![](3830a931cb0e4b33017ef22e25f71c96.png) +![](7e47431605b7501bcca744b7f1330c8b.png) diff --git a/content/os-notes/Files_1.html b/content/os-notes/Files_1.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.755801558494568"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 15:33:58 +0000"/><meta name="latitude" content="52.3333740234375"/><meta name="longitude" content="4.866627909046412"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-03 15:34:22 +0000"/><title>Files</title></head><body><div><span style="font-weight: bold;">Files:</span></div><ul><li><div>abstraction of storage device (possibly real)</div></li><li><div>can read/write</div></li><li><div>starts at root dir “/"</div></li><li><div>absolute (/Users/user/Documents) and relative (../Documents) paths</div></li></ul><div><br/></div><div>File permissions:</div><ul><li><div>throuh bit permission tuples (read-write-execute)</div></li><li><div>x for directories is something like ‘cd'</div></li></ul><div><br/></div><div>Special files:</div><ul><li><div>everything is a file (descriptor)</div></li><li><div>hardware: block (disks), character (serial port)</div></li><li><div>symlinks to link to other files</div></li><li><div>named/anonymous FIFOS (sockets/pipes)</div></li></ul><div><br/></div><div>Pipes:</div><ul><li><div>pseudofiles for processes to communicate over FIFO channel</div></li><li><div>has to be set up in advance</div></li><li><div>looks like a “normal” file</div></li></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Interrupt Handling & Scheduling.html b/content/os-notes/Interrupt Handling & Scheduling.html @@ -1,4 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.633376002311707"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 16:03:43 +0000"/><meta name="latitude" content="52.33331298828125"/><meta name="longitude" content="4.866615317820796"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-14 15:37:33 +0000"/><title>Interrupt Handling & Scheduling</title></head><body><div><span style="font-weight: bold;">interrupt handling:</span></div><ul><li><div>deallocate CPU and give it to the scheduler. we rely on hardware-provided interrupt handling support (like a notification).</div></li><li><div>allows scheduler to periodically get control whenever hardware generates an interrupt</div></li><li><div>interrupt vector: -</div></li><ul><li><div>associated with each IO device and interrupt line</div></li><li><div>part of Interrupt descriptor table (IDT)</div></li><li><div>contains start address of OS-provided internal procedure</div></li></ul><li><div>interrupt handler continues execution</div></li><li><div>interrupt types: </div></li><ul><li><div>software: synchronous e.g. interrupt (int $0x80)</div></li><li><div>hardware device: asynchronous, e.g. exceptions</div></li></ul></ul><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">The scheduler gets control every time an interrupt occurs. It acts as a mediator.</span></div><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"/><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">What happens?</span></div><ol><li><div>Hardware puts program counter etc. on the stack</div></li><li><div>Hardware loads new program counter from interrupt vector</div></li><li><div>Assembly language procedure saves registers</div></li><li><div>Assembly language procedure sets up new stack</div></li><li><div>C interrupt service runs (e.g. reads and buffers input)</div></li><li><div>Scheduler decides which process is to run next.</div></li><li><div>C procedure returns to assembly code.</div></li><li><div>Assembly language procedure starts up new current process</div></li></ol><div><br/></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">A process can't give the CPU to another process (i.e. do a context switch) without going through the scheduler.</span></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Interrupt Handling & Scheduling.md b/content/os-notes/Interrupt Handling & Scheduling.md @@ -0,0 +1,30 @@ ++++ +title = 'Interrupt Handling & Scheduling' ++++ +# Interrupt Handling & Scheduling + +interrupt handling: +- deallocate CPU and give it to the scheduler. we rely on hardware-provided interrupt handling support (like a notification). +- allows scheduler to periodically get control whenever hardware generates an interrupt +- interrupt vector: + - associated with each IO device and interrupt line + - part of Interrupt descriptor table (IDT) + - contains start address of OS-provided internal procedure +- interrupt handler continues execution +- interrupt types: + - software: synchronous e.g. interrupt (int \$0x80) + - hardware device: asynchronous, e.g. exceptions + +The scheduler gets control every time an interrupt occurs. It acts as a mediator. + +What happens? +1. Hardware puts program counter etc. on the stack +2. Hardware loads new program counter from interrupt vector +3. Assembly language procedure saves registers +4. Assembly language procedure sets up new stack +5. C interrupt service runs (e.g. reads and buffers input) +6. Scheduler decides which process is to run next. +7. C procedure returns to assembly code. +8. Assembly language procedure starts up new current process + +A process can't give the CPU to another process (i.e. do a context switch) without going through the scheduler. diff --git a/content/os-notes/Kernels.html b/content/os-notes/Kernels.html @@ -1,4 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-0.2267845571041107"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-11-01 13:43:06 +0000"/><meta name="latitude" content="52.33450351828412"/><meta name="longitude" content="4.866719213400291"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-03 15:33:19 +0000"/><title>Kernels</title></head><body><div>the type of kernel that you use, and the OS architecture, depends on the application</div><div><br/></div><div><span style="font-size: 21px; font-weight: bold;">Monolithic kernels</span></div><ul><li><div>main program invokes syscall</div></li><li><div>kernel is underlying monolithic block:</div></li><ul><li><div>service procedures carry out syscalls</div></li><li><div>utility procedures help implement service procedures</div><div><img src="Kernels.resources/screenshot_1.png" height="245" width="545"/></div></li></ul><li><div>separate applications and OS using privilege levels into user and kernel</div></li><ul><li><div>on x86, 4 privilege levels (but in practice mostly 2 are used)</div></li><li><div>this is supported by the hardware directly</div></li><li><div>if only goal is to separate untrustworthy apps from lower level shit, you only need 2 separate levels</div></li><li><div>if you include more levels, there’s a cost associated with switching between levels, so why do it if it’s not needed</div><div><img src="Kernels.resources/screenshot_2.png" height="481" width="573"/></div></li></ul></ul><div><br/></div><div><span style="font-size: 21px; font-weight: bold;">Virtualisation</span></div><ul><li><div>originally to separate multiprogramming from extended machine</div></li><li><div>N independent system call interfaces</div><div><img src="Kernels.resources/screenshot.png" height="162" width="553"/></div></li><li><div>Virtual machine monitor (VMM/Hypervisor) emulates hardware</div></li><li><div>types:</div></li><ul><li><div>1: VMM runs on bare metal (like Xen)</div><div><img src="Kernels.resources/screenshot_5.png" height="198" width="123"/></div></li><li><div>2: VMM hosted on OS (like QEMU)</div><div><img src="Kernels.resources/screenshot_6.png" height="206" width="184"/></div></li><li><div>Hybrid: VMM inside OS (like KVM)</div></li></ul></ul><div><br/></div><h2>Exokernel</h2><ul><li><div>separate resource control from extended machine</div></li><li><div>unlike VMM/Hypervisor, it: -</div></li><ul><li><div>does not emulate hardware. only resource manager</div></li><li><div>only provides <span style="font-style: italic;">safe</span> low-level resource sharing</div></li><li><div>service procedures are offered as library linked directly to application -- "Library OS"</div></li></ul><li><div>different library OSes for different programs, allows application-level specialisation</div></li></ul><div><br/></div><h2>Client/server model (microkernel)</h2><ul><li><div>organise service procedures in programs running in separate processes (system services/drivers)</div></li><li><div>high level of isolation</div></li><li><div>processes communicate via message passing</div></li><li><div>calls rely on the same mechanism (message passing)</div></li><li><div>messaging is implemented in microkernel (minimal kernel)</div></li><li><div>principle of least privilege -- isolate every service in its own domain (address space, process, etc.)</div></li><li><div>this is more secure, but lower performance (always a tradeoff). have to switch between modes and shit</div></li></ul><div><br/></div><div style="margin-left: 40px;"><img src="Kernels.resources/screenshot_4.png" height="158" width="536"/></div><h2><br/></h2><h2>Microvisor<br/></h2><ul><li><div>combination of hypervisor and microkernel</div></li><li><div>different OS architectures have different design points, people look at convergence and tradeoffs</div></li></ul><div style="margin-left: 40px;"><br/></div><div style="margin-left: 40px;"><img src="Kernels.resources/screenshot_3.png" height="348" width="328"/></div><div><br/></div><h2>Unikernel</h2><ul><li><div>"single simple application implementing whatever in the cloud, in most efficient way"</div></li><li><div>squash application and OS kernel into one thing, don't need all of the other stuff like process management</div></li><li><div>gets rid of all of the overhead</div></li></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Kernels.resources/screenshot_1.png b/content/os-notes/Kernels/0e624a52ade3b11abb07f9a24a963169.png Binary files differ. diff --git a/content/os-notes/Kernels.resources/screenshot_6.png b/content/os-notes/Kernels/2b9ac9aed621fe6db99275ca232c4ac2.png Binary files differ. diff --git a/content/os-notes/Kernels.resources/screenshot.png b/content/os-notes/Kernels/4b168ee0b75e0b83e6a2836382ad2698.png Binary files differ. diff --git a/content/os-notes/Kernels.resources/screenshot_2.png b/content/os-notes/Kernels/55b6743c4ad50038449a7386842f601b.png Binary files differ. diff --git a/content/os-notes/Kernels.resources/screenshot_5.png b/content/os-notes/Kernels/c4b6fb9769649f44cf95457e873af74d.png Binary files differ. diff --git a/content/os-notes/Kernels.resources/screenshot_4.png b/content/os-notes/Kernels/c67b727b70fbbe38dcdbbec5ab520c95.png Binary files differ. diff --git a/content/os-notes/Kernels.resources/screenshot_3.png b/content/os-notes/Kernels/d9c66553585d53ea07bf5a4bfc141b28.png Binary files differ. diff --git a/content/os-notes/Kernels/index.md b/content/os-notes/Kernels/index.md @@ -0,0 +1,75 @@ ++++ +title = 'Kernels' ++++ +# Kernels +the type of kernel that you use, and the OS architecture, depends on the application + +## Monolithic kernels + +- main program invokes syscall +- kernel is underlying monolithic block: + - service procedures carry out syscalls + - utility procedures help implement service procedures + +![screenshot.png](0e624a52ade3b11abb07f9a24a963169.png) + +- separate applications and OS using privilege levels into user and kernel + - on x86, 4 privilege levels (but in practice mostly 2 are used) + - this is supported by the hardware directly + - if only goal is to separate untrustworthy apps from lower level shit, you only need 2 separate levels + - if you include more levels, there’s a cost associated with switching between levels, so why do it if it’s not needed + +![screenshot.png](55b6743c4ad50038449a7386842f601b.png) + +## Virtualisation + +- originally to separate multiprogramming from extended machine +- N independent system call interfaces + +![screenshot.png](4b168ee0b75e0b83e6a2836382ad2698.png) + +- Virtual machine monitor (VMM/Hypervisor) emulates hardware +- types: + - 1: VMM runs on bare metal (like Xen) + + ![screenshot.png](c4b6fb9769649f44cf95457e873af74d.png) + + - 2: VMM hosted on OS (like QEMU) + + ![screenshot.png](2b9ac9aed621fe6db99275ca232c4ac2.png) + + - Hybrid: VMM inside OS (like KVM) + +## Exokernel + +- separate resource control from extended machine +- unlike VMM/Hypervisor, it: + - does not emulate hardware. only resource manager + - only provides *safe* low-level resource sharing + - service procedures are offered as library linked directly to application -- "Library OS" +- different library OSes for different programs, allows application-level specialisation + +## Client/server model (microkernel) + +- organise service procedures in programs running in separate processes (system services/drivers) +- high level of isolation +- processes communicate via message passing +- calls rely on the same mechanism (message passing) +- messaging is implemented in microkernel (minimal kernel) +- principle of least privilege -- isolate every service in its own domain (address space, process, etc.) +- this is more secure, but lower performance (always a tradeoff). have to switch between modes and shit + +![screenshot.png](c67b727b70fbbe38dcdbbec5ab520c95.png) + +## Microvisor + +- combination of hypervisor and microkernel +- different OS architectures have different design points, people look at convergence and tradeoffs + +![screenshot.png](d9c66553585d53ea07bf5a4bfc141b28.png) + +## Unikernel + +- "single simple application implementing whatever in the cloud, in most efficient way" +- squash application and OS kernel into one thing, don't need all of the other stuff like process management +- gets rid of all of the overhead diff --git a/content/os-notes/Principles of IO hardware.html b/content/os-notes/Principles of IO hardware.html @@ -1,12 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-0.7028496265411377"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-13 13:50:34 +0000"/><meta name="latitude" content="52.33294677734375"/><meta name="longitude" content="4.864368295366129"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:27:28 +0000"/><title>Principles of IO hardware</title></head><body><div style="-en-paragraph:true;">IO devices available to software via an interface</div><ul><li><div style="-en-paragraph:true;">character devices: all IO occurs as <span style="text-decoration: underline;">stream</span> of bytes</div></li><li><div style="-en-paragraph:true;">block devices: all IO occurs in units of <span style="text-decoration: underline;">randomly accessible</span> blocks</div></li></ul><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;">device controller</div><ul><li><div style="-en-paragraph:true;">located between actual device and computer</div></li><li><div style="-en-paragraph:true;">offers electronic interface via IO registers</div></li><li><div style="-en-paragraph:true;">R/W those registers to ask controller to perform actions</div></li></ul><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;">example: parallel port</div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><img src="Principles%20of%20IO%20hardware.resources/screenshot.png" height="383" width="411"/></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;">accessing IO registers</div><ul><li><div>port-mapped IO (PMIO): -</div></li><ul><li><div>IO registers are accessed via dedicated port numbers and special instructions</div></li><li><div>e.g. inb instruction</div></li></ul><li><div>memory-mapped IO (MMIO): -</div></li><ul><li><div>IO registers are mapped into address of main memory</div></li><li><div>can be accessed with a simple mov</div></li></ul><li><div>intel x86 is hybrid, does both PMIO and MMIO</div></li></ul><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;">IO ports have a specification from the manufacturer (e.g. IBM PC)</div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;">waiting for requests:</div><ul><li><div>we can send commands to devices, but what if requested operation takes time?</div></li><li><div>polling: -</div></li><ul><li><div>most devices offer status bits in registers to signal that request has been finished (incl. error code)</div></li><li><div>OS can poll this status bit ("<span style="font-style: italic;">polling</span>")</div></li><li><div>is this a good solution? not a general purpose solution, but good if we know it'll complete in a short amount of time.</div></li></ul><li><div>interrupts: -</div></li><ul><li><div>device controller can trigger interrupts to signal that IO requesst is complete -</div></li><ul><li><div>for device, means changing voltage on an electrical line</div></li></ul><li><div>each controller has interrupt vector assigned -</div></li><ul><li><div>CPU runs vector-specific handler when interrupt occurs</div></li></ul><li><div>process one interrupt at a time</div></li></ul></ul><div style="-en-paragraph:true;">data exchange between device and CPU</div><ul><li><div>how do we transfer data from hard disk to memory? -</div></li><ul><li><div>program disk controller to read sector</div></li><li><div>wait for interrupt</div></li><li><div>read a device register sizeof(sector) consecutive times</div></li><li><div>repeat for next sector</div></li></ul><li><div>problem? yes sir. CPU cycles can be spent in a better way</div></li><li><div>so just let hardware do the copying -- Direct Memory Access! -</div></li><ul><li><div>on ISA systems, there was a dedicated DMA controller (third-party)</div></li><li><div>on PCI (and PCIe) systems each PCI device can become "Bus Master" and perform DMA (first-party DMA) -</div></li><ul><li><div>device and DMA controller are combined</div></li><li><div>you have to trust your hardware (or use an IO MMU)</div></li></ul><li><div>embedded systems still have dedicated DMA controller</div></li><li><div>disk controller still uses own buffers</div></li></ul></ul><div><h1><span style="font-size: 28px;"><b><br/></b></span></h1></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Principles of IO hardware.resources/screenshot.png b/content/os-notes/Principles of IO hardware/be015fe1d43f1123b2e111bd1bc8f36c.png Binary files differ. diff --git a/content/os-notes/Principles of IO hardware/index.md b/content/os-notes/Principles of IO hardware/index.md @@ -0,0 +1,60 @@ ++++ +title = 'Principles of IO hardware' ++++ +# Principles of IO hardware +IO devices available to software via an interface + +- character devices: all IO occurs as stream of bytes +- block devices: all IO occurs in units of randomly accessible blocks + +device controller + +- located between actual device and computer +- offers electronic interface via IO registers +- R/W those registers to ask controller to perform actions + +example: parallel port + +![screenshot.png](be015fe1d43f1123b2e111bd1bc8f36c.png) + +accessing IO registers + +- port-mapped IO (PMIO): + - IO registers are accessed via dedicated port numbers and special instructions + - e.g. inb instruction +- memory-mapped IO (MMIO): + - IO registers are mapped into address of main memory + - can be accessed with a simple mov +- intel x86 is hybrid, does both PMIO and MMIO + +IO ports have a specification from the manufacturer (e.g. IBM PC) + +waiting for requests: + +- we can send commands to devices, but what if requested operation takes time? +- polling: + - most devices offer status bits in registers to signal that request has been finished (incl. error code) + - OS can poll this status bit ("*polling*") + - is this a good solution? not a general purpose solution, but good if we know it'll complete in a short amount of time. +- interrupts: + - device controller can trigger interrupts to signal that IO requesst is complete + - for device, means changing voltage on an electrical line + - each controller has interrupt vector assigned + - CPU runs vector-specific handler when interrupt occurs + - process one interrupt at a time + +data exchange between device and CPU + +- how do we transfer data from hard disk to memory? + - program disk controller to read sector + - wait for interrupt + - read a device register sizeof(sector) consecutive times + - repeat for next sector +- problem? yes sir. CPU cycles can be spent in a better way +- so just let hardware do the copying -- Direct Memory Access! + - on ISA systems, there was a dedicated DMA controller (third-party) + - on PCI (and PCIe) systems each PCI device can become "Bus Master" and perform DMA (first-party DMA) + - device and DMA controller are combined + - you have to trust your hardware (or use an IO MMU) + - embedded systems still have dedicated DMA controller + - disk controller still uses own buffers diff --git a/content/os-notes/Principles of IO software.html b/content/os-notes/Principles of IO software.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.208271026611328"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-19 00:27:28 +0000"/><meta name="latitude" content="52.30035507272927"/><meta name="longitude" content="4.988170104808015"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:27:51 +0000"/><title>Principles of IO software</title></head><body><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">the concepts:</span></div><ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">device independence:</span></div></li><ul><li><div>IO software provides abstraction over actual hardware</div></li><li><div>programs shouldn't have to care about device particularities</div></li></ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">uniform naming:</span></div></li><ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">because if you have to type a unique 20 letter id to access a device you’ll give up</span></div></li></ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">error handling:</span></div></li><ul><li><div>errors should be handled closest to their source so we don’t have to give a crap</div></li></ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">synchronous vs asynchronous IO:</span></div></li><ul><li><div>programs don't want to deal with interrupts.</div></li><li><div>so OS turns async operations into blocking operations</div></li><li><div>then lower levels have to deal with interrupts</div></li></ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">buffering:</span></div></li><ul><li><div>networking: incoming packets have to be buffered</div></li><li><div>audio: buffering to avoid clicks</div></li></ul></ul><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">the layers in practice:</span></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><img src="Principles%20of%20IO%20software.resources/screenshot.png" height="294" width="653"/></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><br/></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">interrupt handler</span></div><ul><li><div>device driver doesn't handle low-level interrupt directly</div></li><li><div>lowe-level handler is relatively generic</div></li><li><div>on linux, calls device driver-specific interrupt handler if registered</div></li><li><div>on minix, sends message to device driver that registered for interrupt</div></li></ul><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">device driver:</span></div><ul><li><div>accepts abstract reqs from device-independent layer and transforms them into commands for the device</div></li><li><div>can queue new requests as necessary</div></li><li><div>usually needs to wait for completion of the request (IRQ)</div></li><li><div>checks for errors and answers requests from device-independent layer</div></li></ul><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">device independent IO software:</span></div><ul><li><div>uniform interfacing for device drivers (driver nterface, naming (major/minor numbers), and protection</div></li><li><div>buffering: necessary for both character and block devices</div></li><li><div>error reporting: hardware-level, driver-level, etc.</div></li><li><div>allocing/deallocing dedicated devices (e.g. printers)</div></li><li><div>device-independent block size: unify blocks/characters across devices</div></li></ul><div>user-level IO software</div><ul><li><div>single interface allowing user to access devices</div></li><li><div>in C: fopen, close, fflush, frpintf, and so on</div></li><li><div>safely multiplexes access to exclusive devices</div></li></ul><div><br/></div><div><u><b><br/></b></u></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Principles of IO software.resources/screenshot.png b/content/os-notes/Principles of IO software/0c612b134cd53709365f563b566a86ea.png Binary files differ. diff --git a/content/os-notes/Principles of IO software/index.md b/content/os-notes/Principles of IO software/index.md @@ -0,0 +1,52 @@ ++++ +title = 'Principles of IO software' ++++ +# Principles of IO software +the concepts: + +- device independence: + - IO software provides abstraction over actual hardware + - programs shouldn't have to care about device particularities +- uniform naming: + - because if you have to type a unique 20 letter id to access a device you’ll give up +- error handling: + - errors should be handled closest to their source so we don’t have to give a crap +- synchronous vs asynchronous IO: + - programs don't want to deal with interrupts. + - so OS turns async operations into blocking operations + - then lower levels have to deal with interrupts +- buffering: + - networking: incoming packets have to be buffered + - audio: buffering to avoid clicks + +the layers in practice: + +![screenshot.png](0c612b134cd53709365f563b566a86ea.png) + +interrupt handler + +- device driver doesn't handle low-level interrupt directly +- lowe-level handler is relatively generic +- on linux, calls device driver-specific interrupt handler if registered +- on minix, sends message to device driver that registered for interrupt + +device driver: + +- accepts abstract reqs from device-independent layer and transforms them into commands for the device +- can queue new requests as necessary +- usually needs to wait for completion of the request (IRQ) +- checks for errors and answers requests from device-independent layer + +device independent IO software: + +- uniform interfacing for device drivers (driver nterface, naming (major/minor numbers), and protection +- buffering: necessary for both character and block devices +- error reporting: hardware-level, driver-level, etc. +- allocing/deallocing dedicated devices (e.g. printers) +- device-independent block size: unify blocks/characters across devices + +user-level IO software + +- single interface allowing user to access devices +- in C: fopen, close, fflush, frpintf, and so on +- safely multiplexes access to exclusive devices diff --git a/content/os-notes/Process management.html b/content/os-notes/Process management.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="276.5"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-11-09 12:55:30 +0000"/><meta name="latitude" content="50.11347908434513"/><meta name="longitude" content="14.33735006690641"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-11-09 16:33:16 +0000"/><title>Process management</title></head><body><div>Minimal shell:</div><ol><li><div>Wait for command</div></li><li><div>Start process to execute command</div></li><li><div>Wait until process finished</div></li></ol><div><br/></div><div><span style="font-family: "Courier New";">pid_t fork()</span></div><ul><li><div>duplicates current process</div></li><li><div>negative value — unsuccessful</div></li><li><div>zero — returned to newly created child process</div></li><li><div>positive — returned to parent or caller, value is PID of new process</div></li></ul><div><br/></div><div><span style="font-family: "Courier New";">pid_t wait(int *wstatus)</span></div><ul><li><div>waits for child processes to change state</div></li><li><div>writes status to wstatus</div></li></ul><div><br/></div><div><span style="font-family: "Courier New";">int exec(const char *path, char *constargv[])</span></div><ul><li><div>loads new binary from path in current process</div></li><li><div>constargv — program arguments</div></li><li><div>last argument is NULL (e.g. constargv = {“/bin/ls”, “-a”, NULL})</div></li></ul><div><br/></div><div><span style="font-family: "Helvetica Neue";">signals</span></div><ul><li><div>processes may need to be interrupted</div></li><li><div>a signal is sent to process that needs to be interrupted</div></li><li><div>interrupted process catches signal using signal handler</div></li><li><div><font face="Courier New">sighandler_t signal(int signum, sighandler_t handler)</font></div></li><ul><li><div><font face="Helvetica Neue">registers signal handler</font></div></li></ul><li><div><font face="Courier New">unsignedint alarm(unsigned int secs)</font></div></li><ul><li><div><font face="Helvetica Neue">deliver SIGALARM in seconds</font></div></li></ul><li><div><font face="Courier New">int kill(pd_t pid, int sig)</font></div></li><ul><li><div><font face="Helvetica Neue">deliver signal to pid</font></div></li></ul></ul><div><br/></div><div>pipes</div><ul><li><div>pipe(fd) sets up a pipe</div></li><li><div>fd is array of 2 ints</div></li></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Process management.md b/content/os-notes/Process management.md @@ -0,0 +1,43 @@ ++++ +title = 'Process management' ++++ +# Process management +Minimal shell: +1. Wait for command +2. Start process to execute command +3. Wait until process finished + +`pid_t fork()` + +- duplicates current process +- negative value — unsuccessful +- zero — returned to newly created child process +- positive — returned to parent or caller, value is PID of new process + +`pid_t wait(int *wstatus)` + +- waits for child processes to change state +- writes status to wstatus + +`int exec(const char *path, char *constargv[])` + +- loads new binary from path in current process +- constargv — program arguments +- last argument is NULL (e.g. constargv = {“/bin/ls”, “-a”, NULL}) + +signals + +- processes may need to be interrupted +- a signal is sent to process that needs to be interrupted +- interrupted process catches signal using signal handler +- sighandler_t signal(int signum, sighandler_t handler) + - registers signal handler +- unsignedint alarm(unsigned int secs) + - deliver SIGALARM in seconds +- `int kill(pd_t pid, int sig)` + - deliver signal to pid + +pipes + +- `pipe(fd)` sets up a pipe +- fd is array of 2 ints diff --git a/content/os-notes/Process model.html b/content/os-notes/Process model.html @@ -1,10 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.281962394714355"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-11-15 16:17:05 +0000"/><meta name="latitude" content="52.30028731088095"/><meta name="longitude" content="4.987940925431394"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-14 15:34:26 +0000"/><title>Process model</title></head><body><div><div><span style="font-weight: bold;">process:</span> program in execution (amount of processes depends on the program)</div><div>it’s an abstraction that allows OS to simplify resource allocation, accounting, and limiting.</div><div><br/></div><div><span style="font-weight: bold;">process table:</span></div><div>OS maintains info on resources and internal state of every process</div><div>information in Process Table: ID (PID), User (UID), Group (GID), memory address space, hw registers, open files, signals, etc.</div><div>process control blocks:</div><div><br/></div><div style="margin-left: 40px;"><img src="Process%20model.resources/screenshot.png" height="344" width="600"/><br/></div><div style="margin-left: 40px;"><br/></div><div><span style="font-weight: bold;">concurrent processes</span></div><div>in principle, multiple processes are mutually independent (they have nothing at all in common). need explicit means to interact with each other.</div><div>the CPU gets allocated to each process in turn</div><ul><li><div>on OS level: save context of process A (program counter, registers, etc.), switch to B. to go back to process A, simply restore context.</div></li></ul><div>OS (normally) offers <span style="font-style: italic;">no timing or ordering guarantees</span></div><div><br/></div><div style="margin-left: 40px;"><img src="Process%20model.resources/49DFF9CF-E995-4151-A222-48E3331BFB4A.png" height="214" width="655"/><br/></div><div style="margin-left: 40px;"><br/></div><div><span style="font-weight: bold;">process hierarchies -</span></div><div>OS creates only 1 init process (usually)</div><div>parent process can create a child process</div><div>results in a tree-like structure and process groups</div><div><br/></div><div><span style="font-weight: bold;">process management</span></div><div>fork: create new process -</div><ul><li><div>child is 'private' clone of parent</div></li><li><div>shares <span style="font-style: italic;">some</span> resources with parent</div></li></ul><div>exec: execute new process image -</div><ul><li><div>used in combination with fork</div></li><li><div>replaces the current command</div></li></ul><div>exit: cause voluntary process termination -</div><ul><li><div>exist status returned to parent process</div></li></ul><div>kill: send signal to process (or group) -</div><ul><li><div>can cause involuntary process termination</div></li></ul><div><br/></div><div><span style="font-weight: bold;">process states -</span></div><div>OS allocates resources to processes</div><div>three process states: -</div><ul><li><div>running: process is currently executed by CPU</div></li><li><div>blocked: process is waiting for available resources</div></li><li><div>ready: process is ready to be selected</div></li></ul><div style="margin-left: 40px;"><br/></div><div style="margin-left: 40px;"><img src="Process%20model.resources/CCF73D0A-6CC9-4590-8DFE-0D72639EC440.png" height="171" width="676"/><br/></div><div style="margin-left: 80px;"><br/></div><div>scheduler allocates/deallocates the CPU. there is no immediate transition between states because process has to wait for scheduler</div><div><br/></div><div><b>scheduler vs processes</b></div><ul><li><div>scheduler periodically switches processes</div></li><li><div>sequential processes lie on the layer above</div></li><li><div>leads to simple process organisation</div></li></ul><div/><ul><div style=""><br/></div><div style=""><img src="Process%20model.resources/38570BCC-D218-4654-AADF-B2FDA78A797C.png" height="216" width="403"/><br/></div></ul></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Process model.resources/screenshot.png b/content/os-notes/Process model/23974f8c561b0cddc2bc33aca9f237de.png Binary files differ. diff --git a/content/os-notes/Process model.resources/CCF73D0A-6CC9-4590-8DFE-0D72639EC440.png b/content/os-notes/Process model/6e8f99942767b51c636065cbe3df0d53.png Binary files differ. diff --git a/content/os-notes/Process model.resources/38570BCC-D218-4654-AADF-B2FDA78A797C.png b/content/os-notes/Process model/9ce90ac9631400010ab118e5c8100a8c.png Binary files differ. diff --git a/content/os-notes/Process model.resources/49DFF9CF-E995-4151-A222-48E3331BFB4A.png b/content/os-notes/Process model/f6c6156909b358184921daccb51cdf1c.png Binary files differ. diff --git a/content/os-notes/Process model/index.md b/content/os-notes/Process model/index.md @@ -0,0 +1,72 @@ ++++ +title = 'Process model' ++++ +# Process model +**process:** program in execution (amount of processes depends on the program) + +it’s an abstraction that allows OS to simplify resource allocation, accounting, and limiting. + +## Process table +OS maintains info on resources and internal state of every process + +information in Process Table: ID (PID), User (UID), Group (GID), memory address space, hw registers, open files, signals, etc. + +process control blocks: + +![screenshot.png](23974f8c561b0cddc2bc33aca9f237de.png) + +## Concurrent processes + +in principle, multiple processes are mutually independent (they have nothing at all in common). need explicit means to interact with each other. + +the CPU gets allocated to each process in turn + +- on OS level: save context of process A (program counter, registers, etc.), switch to B. to go back to process A, simply restore context. + +OS (normally) offers *no timing or ordering guarantees* + +![](f6c6156909b358184921daccb51cdf1c.png) + +## Process hierarchies +OS creates only 1 init process (usually) +parent process can create a child process +results in a tree-like structure and process groups + +## Process management +fork: create new process + +- child is 'private' clone of parent +- shares *some* resources with parent + +exec: execute new process image + +- used in combination with fork +- replaces the current command + +exit: cause voluntary process termination + +- exist status returned to parent process + +kill: send signal to process (or group) + +- can cause involuntary process termination + +## Process states +OS allocates resources to processes +three process states: + +- running: process is currently executed by CPU +- blocked: process is waiting for available resources +- ready: process is ready to be selected + +![](6e8f99942767b51c636065cbe3df0d53.png) + +scheduler allocates/deallocates the CPU. there is no immediate transition between states because process has to wait for scheduler + +## Scheduler vs processes + +- scheduler periodically switches processes +- sequential processes lie on the layer above +- leads to simple process organisation + +![](9ce90ac9631400010ab118e5c8100a8c.png) diff --git a/content/os-notes/Race conditions & mutual exclusion.html b/content/os-notes/Race conditions & mutual exclusion.html @@ -1,7 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.633376002311707"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 16:07:23 +0000"/><meta name="latitude" content="52.33331298828125"/><meta name="longitude" content="4.866615317820796"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-14 16:34:17 +0000"/><title>Race conditions & mutual exclusion</title></head><body><div>programs can race each other and come to wrong conclusions.</div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><br/></span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><b>Mutual exclusion</b></span></div><div>critical region: a code region with access to shared resources -</div><div><br/></div><div>the conditions for mutual exclusion:</div><ol><li><div>no two processes can be simultaneously in their critical regions</div></li><li><div>no assumptions can be made about speed/number of CPUs</div></li><li><div>no process running outside critical region may block others</div></li><li><div>no process should have to wait forever to enter critical region</div></li></ol><div><br/></div><div>non-solutions</div><ul><li><div>disable interrupts (disable reallocation of CPU). if you have multiple processors, shit gets fucked.</div></li><li><div>lock variables (guard critical regions with booleans). so now the race is on the lock variables, great job dude.</div></li><li><div>strict alternation (be nice and take turns). doesn’t allow processes to enter critical regions twice in a row, and a process outside the critical region can block another one.</div></li></ul><div><br/></div><div>Peterson's algorithm<br/></div><ul><li><div>software solution</div></li><li><div>spin lock in the while loop</div></li></ul><div><br/></div><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>#define N 2</div><div>int turn;</div><div>int interested[N];</div><div><br/></div><div>void enter_region(int process){</div><div> int other = 1 - process;</div><div> interested[process] = TRUE;</div><div> turn = process;</div><div> while(turn==process && interested[other]==TRUE);</div><div>}</div><div><br/></div><div>void leave_region(int process) { </div><div> interested[process] = FALSE;</div><div>}</div></div><div><br/></div><div>TSL instruction -</div><ul><li><div>hardware-assisted solution to mutual exclusion problem</div></li><li><div>atomic test and set of a memory value</div></li><li><div>spin until LOCK is acquired</div></li></ul><div><br/></div><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>enter_region:</div><div><span style="font-size: 12px; font-family: Monaco;">TSL REGISTER, LOCK | copy LOCK to register, set LOCK to 1</span></div><div><span style="font-size: 12px; font-family: Monaco;">CMP REGISTER, #0 | was LOCK zero?</span></div><div><span style="font-size: 12px; font-family: Monaco;">JNE ENTER_REGION | if it was non-zero LOCK was set, so loop</span></div><div><span style="font-size: 12px; font-family: Monaco;">RET | return to caller; critical region entered</span></div><div><br/></div><div><span style="font-size: 12px; font-family: Monaco;">leave_region:</span></div><div><span style="font-size: 12px; font-family: Monaco;">MOVE LOCK, #0 | store a 0 in LOCK</span></div><div><span style="font-size: 12px; font-family: Monaco;">RET | return to caller</span></div></div><div -/><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><span style="font-weight: bold;"><br/></span></span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><span style="font-weight: bold;">Avoiding busy waiting</span></span></div><div>so far, CPU busy waits until it can enter the critical region (spin lock).</div><div>so, let process waiting return CPU to scheduler voluntarily.</div><div><br/></div><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>void sleep() {</div><div><span> set own state to BLOCKED;</span><br/></div><div><span><span> give CPU to scheduler;</span><br/></span></div><div><span><span>}</span></span></div><div><span><span><br/></span></span></div><div><span><span>void wakeup(process) {</span></span></div><div><span><span><span> set state of process to READY;</span><br/></span></span></div><div><span><span><span><span> give CPU to scheduler;</span><br/></span></span></span></div><div><span><span><span><span>}</span></span></span></span></div></div><div><br/></div><div>Producer-consumer</div><ul><li><div>producer sleeps when buff is full</div></li><li><div>consumer sleeps when buff is empty</div></li></ul><div style="margin-left: 40px;"><br/></div><div style="margin-left: 40px;"><img src="Race%20conditions%20&%20mutual%20exclusion.resources/screenshot_1.png" height="329" width="681"/></div><div style="margin-left: 40px;"><br/></div><ul><li><div>problem: wake up process may get lost. only wake up producer when count is N-1, but producer sleeps when count is N.</div></li></ul><div><br/></div><div>Semaphores</div><ul><li><div>introduce sema integer type with operations:</div></li><ul><li><div>down: if sema ≤ 0, block calling process. <font face="Courier New">sema--</font> otherwise.</div></li><li><div>up: if there is process blocking on sema, wake it up. sema++ otherwise.</div></li></ul><li><div>OS guarantees that all the operations are atomic (happening instantaneously) by design — disable interrupts on single processors, spin locking on multiprocessors.</div></li><li><div>mutex (mutual exclusion) variable serialises access to shared buffer</div></li></ul><div style="margin-left: 40px;"><br/></div><div style="margin-left: 40px;"><img src="Race%20conditions%20&%20mutual%20exclusion.resources/screenshot_2.png" height="563" width="998"/></div><div><br/></div><div>Monitors -</div><ul><li><div>semaphores introduce chaos in programs (gotta set all them bits)</div></li><li><div>monitors: serialise procedure calls on a module, use condition vars to wait/signal processes</div></li><li><div>but this needs dedicated language support</div></li></ul><div><br/></div><div style="margin-left: 40px;"><img src="Race%20conditions%20&%20mutual%20exclusion.resources/screenshot.png" height="521" width="925"/><br/></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><br/></span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><span style="font-weight: bold;">Message passing: solution to process sync and communication</span></span></div><ul><li><div>processes interact by sending & receiving messages.</div></li><li><div>most common in multiserver OS designs</div></li><li><div>issues — mem copying vs register passing (efficiency), mailboxes vs rendezvous<br/></div></li></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Race conditions & mutual exclusion.resources/screenshot_1.png b/content/os-notes/Race conditions & mutual exclusion/70290486387a9647f1aacc196879d382.png Binary files differ. diff --git a/content/os-notes/Race conditions & mutual exclusion.resources/screenshot.png b/content/os-notes/Race conditions & mutual exclusion/8e61971770a4e43c57310ecaa0755f84.png Binary files differ. diff --git a/content/os-notes/Race conditions & mutual exclusion.resources/screenshot_2.png b/content/os-notes/Race conditions & mutual exclusion/d5e55212031521c6b0b19606037fe5d8.png Binary files differ. diff --git a/content/os-notes/Race conditions & mutual exclusion/index.md b/content/os-notes/Race conditions & mutual exclusion/index.md @@ -0,0 +1,109 @@ ++++ +title = 'Race conditions & mutual exclusion' ++++ +# Race conditions & mutual exclusion +programs can race each other and come to wrong conclusions. + +## Mutual exclusion +critical region: a code region with access to shared resources + +the conditions for mutual exclusion: +1. no two processes can be simultaneously in their critical regions +2. no assumptions can be made about speed/number of CPUs +3. no process running outside critical region may block others +4. no process should have to wait forever to enter critical region + +non-solutions + +- disable interrupts (disable reallocation of CPU). if you have multiple processors, shit gets fucked. +- lock variables (guard critical regions with booleans). so now the race is on the lock variables, great job dude. +- strict alternation (be nice and take turns). doesn’t allow processes to enter critical regions twice in a row, and a process outside the critical region can block another one. + +Peterson's algorithm + +- software solution +- spin lock in the while loop + +```c +#define N 2 +int turn; +int interested[N]; + +void enter_region(int process){ + int other = 1 - process; + interested[process] = TRUE; + turn = process; + while(turn==process && interested[other]==TRUE); +} + +void leave_region(int process) { + interested[process] = FALSE; +} +``` + +TSL instruction + +- hardware-assisted solution to mutual exclusion problem +- atomic test and set of a memory value +- spin until LOCK is acquired + +``` +enter_region: +TSL REGISTER, LOCK | copy LOCK to register, set LOCK to 1 +CMP REGISTER, #0 | was LOCK zero? +JNE ENTER_REGION | if it was non-zero LOCK was set, so loop +RET | return to caller; critical region entered + +leave_region: +MOVE LOCK, #0 | store a 0 in LOCK +RET | return to caller +``` + +## Avoiding busy waiting +so far, CPU busy waits until it can enter the critical region (spin lock). +so, let process waiting return CPU to scheduler voluntarily. + +```c +void sleep() { + set own state to BLOCKED; + give CPU to scheduler; +} + +void wakeup(process) { + set state of process to READY; + give CPU to scheduler; +} +``` + +Producer-consumer + +- producer sleeps when buff is full +- consumer sleeps when buff is empty + +![screenshot.png](70290486387a9647f1aacc196879d382.png) + +- problem: wake up process may get lost. only wake up producer when count is N-1, but producer sleeps when count is N. + +Semaphores + +- introduce sema integer type with operations: + - down: if sema ≤ 0, block calling process. sema-- otherwise. + - up: if there is process blocking on sema, wake it up. sema++ otherwise. +- OS guarantees that all the operations are atomic (happening instantaneously) by design — disable interrupts on single processors, spin locking on multiprocessors. +- mutex (mutual exclusion) variable serialises access to shared buffer + +![screenshot.png](d5e55212031521c6b0b19606037fe5d8.png) + +Monitors + +- semaphores introduce chaos in programs (gotta set all them bits) +- monitors: serialise procedure calls on a module, use condition vars to wait/signal processes +- but this needs dedicated language support + +![screenshot.png](8e61971770a4e43c57310ecaa0755f84.png) + +## Message passing: solution to process sync and communication + +- processes interact by sending & receiving messages. +- most common in multiserver OS designs +- issues — mem copying vs register passing (efficiency), mailboxes vs rendezvous diff --git a/content/os-notes/Reliability & Performance.html b/content/os-notes/Reliability & Performance.html @@ -1,17 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-0.7385642528533936"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-06 13:42:39 +0000"/><meta name="latitude" content="52.33451453990055"/><meta name="longitude" content="4.866780680819775"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-19 00:04:32 +0000"/><title>Reliability & Performance</title></head><body><div style="margin-left: 40px;"/><div style=""><span style="font-weight: bold;">How to ensure reliability</span><br/></div><div>what are the threats? -</div><ul><li><div>disk failures: bad blocks, whole-disk errors</div></li><li><div>power failures: (meta)data inconsistently written to disk</div></li><li><div>software bugs: bad (meta)data written to disk</div></li><li><div>user errors: rm *.o vs rm * .o; dd if=/dev/zero of=zeros bs=1M # fill disk quota</div></li></ul><div>backups: incremental vs full, online vs offline, physical vs logical (on filesystem level), compressed vs uncompressed, local vs remote</div><div><br/></div><div>RAID: redundant array of independent (originally inexpensive) disks -</div><ul><li><div>virtualise addressing on top of multiple disks (as single address space)</div></li><li><div>RAID control operates just like MMU in memory</div></li><li><div>options: -</div></li><ul><li><div>mirroring (RAID 1) -- no real slowdown or advantage for writing. but reading can be done in parallel from two different disks.</div></li><li><div>striping (RAID 0) -- scatter accross disks. no reliability benefits, but very good performance.</div></li><li><div>hybrid -- first few you stripe. the last disk, you store parity bits.</div></li></ul><li><div><img src="Reliability%20&%20Performance.resources/5D80EC7E-4975-43D9-9A86-4B7659D0DB1F.png" height="536" width="1038"/></div></li><li><div><a href="https://en.wikipedia.org/wiki/Nested_RAID_levels">Wikipedia page</a></div></li></ul><div>fsck (File System Consistency Check) -</div><ul><li><div>you need invariants. so exploit redundancy in existing filesystems.</div></li><li><div><img src="Reliability%20&%20Performance.resources/6D574623-C739-4747-BA19-0F9A010FB0A0.png" height="566" width="1014"/></div></li></ul><div><br/></div><div><span style="font-weight: bold;">Improve filesystem performance: -</span></div><div>minimize disk access: -</div><ul><li><div>caching: buffer cache, inode cache (literally cache of inodes stored in memory), direntry cache (for e.g. path name lookups) -</div></li><ul><li><div>buffer cache: -</div></li><ul><li><div>build list recently used queue. end is most recently used, front is least recently used.</div></li><li><div>periodically evict from front. hash table pointing to indicies (don't have to go through whole list to search)</div></li><li><div>write-through caching (if doing write on block, will do on cache, and immediately persist on disk) vs. periodic syncing (periodically write back blocks in buffer cache, typically with daemon)</div></li><li><div><img src="Reliability%20&%20Performance.resources/DD35C5DC-89B7-4AE4-9868-1514F6EC5DD3.png" height="244" width="673"/></div></li></ul></ul><li><div>block read ahead (anticipate access patterns</div></li></ul><div>minimize seek time (stay in the same section of memory more or less): -</div><ul><li><div>try to alloc files contiguously</div></li><li><div>spread i-nodes over disk</div></li><li><div>store small file data 'inline' in i-node (as metadata kind of)</div></li><li><div>defragment disk</div></li></ul><div><br/></div><div><span style="font-weight: bold;">Different file system options: -</span></div><div>log-structured filesystems: -</div><ul><li><div>optimise for frequent small writes</div></li><li><div>collect pending writes in log segment, flush to disk sequentially</div></li><li><div>segment can contain anything (inodes, dir entries, blocks, whatever) and can be e.g. 1 MB in size</div></li><li><div>relies on inode index to find inodes in log efficiently</div></li><li><div>garbage collection to reclaim stale log entries</div></li></ul><div>journaling filesystems: -</div><ul><li><div>use 'logs' for crash recovery</div></li><li><div>first write transactional operations in log: -</div></li><ul><li><div>remove file from its dir</div></li><li><div>release inode to pool of free inodes</div></li><li><div>return all disk blocks to pool of free disk blocks</div></li></ul><li><div>after crash, replay operations from log</div></li><li><div>requires single operations to be <span style="font-style: italic;">idempotent</span></div></li><li><div>should support multiple, arbitrary crashes</div></li><li><div>journaling is widely used in modern filesystems</div></li></ul><div>virtual filesystems (VFS): -</div><ul><li><div><img src="Reliability%20&%20Performance.resources/608B7B6B-85B9-47E0-83EF-D0D90872D0B9.png" height="321" width="591"/></div></li><li><div><img src="Reliability%20&%20Performance.resources/92AACEFD-6BC6-474F-B02D-D4B991CF50B6.png" height="382" width="510"/></div></li></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Reliability & Performance.resources/6D574623-C739-4747-BA19-0F9A010FB0A0.png b/content/os-notes/Reliability & Performance/9f1775d17b641473033931c7009a2fa0.png Binary files differ. diff --git a/content/os-notes/Reliability & Performance.resources/5D80EC7E-4975-43D9-9A86-4B7659D0DB1F.png b/content/os-notes/Reliability & Performance/a7298bb639635540af0873ab67b18f2c.png Binary files differ. diff --git a/content/os-notes/Reliability & Performance.resources/DD35C5DC-89B7-4AE4-9868-1514F6EC5DD3.png b/content/os-notes/Reliability & Performance/b46b5bb2c4b52cf4be0ce975af65fb60.png Binary files differ. diff --git a/content/os-notes/Reliability & Performance.resources/608B7B6B-85B9-47E0-83EF-D0D90872D0B9.png b/content/os-notes/Reliability & Performance/bf1de095db2a876b4fc51249dbeff88f.png Binary files differ. diff --git a/content/os-notes/Reliability & Performance.resources/92AACEFD-6BC6-474F-B02D-D4B991CF50B6.png b/content/os-notes/Reliability & Performance/c306c2cc8e85ffa6074cac359f53d93a.png Binary files differ. diff --git a/content/os-notes/Reliability & Performance/index.md b/content/os-notes/Reliability & Performance/index.md @@ -0,0 +1,73 @@ ++++ +title = 'Reliability & Performance' ++++ +# Reliability & Performance +## How to ensure reliability +what are the threats? + +- disk failures: bad blocks, whole-disk errors +- power failures: (meta)data inconsistently written to disk +- software bugs: bad (meta)data written to disk +- user errors: `rm *.o` vs `rm * .o`; `dd if=/dev/zero of=zeros bs=1M # fill disk quota` + +backups: incremental vs full, online vs offline, physical vs logical (on filesystem level), compressed vs uncompressed, local vs remote + +RAID: redundant array of independent (originally inexpensive) disks + +- virtualise addressing on top of multiple disks (as single address space) +- RAID control operates just like MMU in memory +- options: + - mirroring (RAID 1) -- no real slowdown or advantage for writing. but reading can be done in parallel from two different disks. + - striping (RAID 0) -- scatter accross disks. no reliability benefits, but very good performance. + - hybrid -- first few you stripe. the last disk, you store parity bits. +- ![](a7298bb639635540af0873ab67b18f2c.png) +- [Wikipedia page](https://en.wikipedia.org/wiki/Nested_RAID_levels) + +fsck (File System Consistency Check) + +- you need invariants. so exploit redundancy in existing filesystems. +- ![](9f1775d17b641473033931c7009a2fa0.png) + +## Improve filesystem performance: +minimize disk access: + +- caching: buffer cache, inode cache (literally cache of inodes stored in memory), direntry cache (for e.g. path name lookups) + - buffer cache: + - build list recently used queue. end is most recently used, front is least recently used. + - periodically evict from front. hash table pointing to indicies (don't have to go through whole list to search) + - write-through caching (if doing write on block, will do on cache, and immediately persist on disk) vs. periodic syncing (periodically write back blocks in buffer cache, typically with daemon) + - ![](b46b5bb2c4b52cf4be0ce975af65fb60.png) +- block read ahead (anticipate access patterns + +minimize seek time (stay in the same section of memory more or less): + +- try to alloc files contiguously +- spread i-nodes over disk +- store small file data 'inline' in i-node (as metadata kind of) +- defragment disk + +## Different file system options: +log-structured filesystems: + +- optimise for frequent small writes +- collect pending writes in log segment, flush to disk sequentially +- segment can contain anything (inodes, dir entries, blocks, whatever) and can be e.g. 1 MB in size +- relies on inode index to find inodes in log efficiently +- garbage collection to reclaim stale log entries + +journaling filesystems: + +- use 'logs' for crash recovery +- first write transactional operations in log: + - remove file from its dir + - release inode to pool of free inodes + - return all disk blocks to pool of free disk blocks +- after crash, replay operations from log +- requires single operations to be *idempotent* +- should support multiple, arbitrary crashes +- journaling is widely used in modern filesystems + +virtual filesystems (VFS): + +- ![](bf1de095db2a876b4fc51249dbeff88f.png) +- ![](c306c2cc8e85ffa6074cac359f53d93a.png) diff --git a/content/os-notes/Scheduling.html b/content/os-notes/Scheduling.html @@ -1,16 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.633376002311707"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 16:08:06 +0000"/><meta name="latitude" content="52.33331298828125"/><meta name="longitude" content="4.866615317820796"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-03 16:08:37 +0000"/><title>Scheduling</title></head><body><div><img src="Scheduling.resources/09C0D52D-0C2A-4E90-BF83-3EA6ACFC8CAC.png" height="309" width="460"/><br/></div><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><span style="-en-paragraph:true;">if more processes ready than CPUs available:</span></div><ul><li><div>scheduler decides which process to run next</div></li><li><div>algorithm used by scheduler is called scheduling algorithm</div></li></ul><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><span style="-en-paragraph:true;">when to schedule?</span></div><ul><li><div>process exits</div></li><li><div>process blocks on IO or semaphore</div></li><li><div>when new process is created</div></li><li><div>when IO interrupt occurs</div></li><li><div>when clock interrupt occurs</div></li></ul><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><span style="-en-paragraph:true;">scheduling #goals</span></div><ul><li><div>different goals for batch, interactive, or real time systems</div></li><li><div>for all systems: -</div></li><ul><li><div>fairness: giving each process a fair share of CPU</div></li><li><div>policy reinforcement: carrying out stated policy</div></li><li><div>balance: keeping all parts of system busy</div></li><li><div>throughput: maximise jobs per hour</div></li><li><div>turaround time: minimise time between submission and termination</div></li><li><div>CPU utilisation: keepin CPU busy <span style="font-style: italic;">all the time</span></div></li></ul></ul><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><span style="-en-paragraph:true;">Batch scheduling algorithms:</span></div><ul><li><div>First-Come First-Served (FIFO) -</div></li><ul><li><div>process jobs in order of arrival</div></li><li><div>non-preemptive</div></li><li><div>single process Q: -</div></li><ul><li><div>new jobs or blocking processes are added to end of Q</div></li></ul><li><div>can lead to "convoy effect" if only few CPU bound and many IO bound processes</div></li></ul><li><div>Shortest job first: -</div></li><ul><li><div>pick job with the shortest run time</div></li><li><div>provably optimal: lowest turnaround time</div></li><li><div>of course, runtimes have to be known in advance</div></li><li><div>may lead to starvation, if a lot of short-lived processes arrive then a long run-time process will never run</div></li><li><div>highest-response-ratio-next -- improved version of shortestjob first</div></li></ul></ul><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><span style="-en-paragraph:true;">Interactive scheduling:</span></div><ul><li><div>response time -- respond to requests quickly</div></li><li><div>proportionality -- meet users' expectations</div></li><li><div>algorithms: -</div></li><ul><li><div>round robin scheduling: -</div></li><ul><li><div>preemptive algorithm</div></li><li><div>each process gets a time slice (quantum)</div></li><li><div>if process is still running at end of quantum, it gets preemted & goes to end of ready Q</div></li><li><div>how big should that quantum be? depends on the implementation (lol when does it not depend) -</div></li><ul><li><div>of course, you don't want most of the time to be context switching</div></li></ul></ul><li><div>priority scheduling -</div></li><ul><li><div>similar to round robin, but several ready Qs</div></li><li><div><img src="Scheduling.resources/643D34DE-D587-463D-86B8-F9A35F934198.png" height="168" width="407"/></div></li><li><div>next process is picked from Q with highest priority</div></li><li><div>static vs dynamic -</div></li><ul><li><div>static priority: there is a single unchanging priority. often used as base priority level.</div></li><li><div>dynamic priority: OS cleverly decides which processes should have higher priorities (e.g. if IO bound, should have higher priority than CPU bound)</div></li></ul><li><div>but can lead to priority inversion (e.g. Pathfinder). solutions: -</div></li><ul><li><div>priority ceiling protocol: mutex gets priority assigned of the highest priority process that might lock the mutext</div></li><li><div>priority inheritance protocol: high priority process blocks because low priority process hold mutex -> low priority process 'inherits' priority of blocked process</div></li><li><div>random boosting: ready processes holding mutexes are temporarily (and randomly) boosted in priorities</div></li></ul><li><div>how to minimise response time for each priority Q? shortest process next: -</div></li><ul><li><div>use shortest job first and try to best predict next running time</div></li><li><div>form weighted average of previous running times of processes</div></li></ul></ul><li><div>guaranteed scheduling -</div></li><ul><li><div>N processes running -> each process gets 1/Nth of CPU time (aka fair-share)</div></li><li><div>calc how much CPU time process might have gotten (time since process creation divided by N)</div></li><li><div>measure actual consumed CPU time</div></li><li><div>form ratio (e.g. 0.5 means process running for half the time it was entitled to)</div></li><li><div>pick process with smallest ratio to run next</div></li></ul><li><div>lottery scheduling</div></li><ul><li><div>processes get lottery tickets</div></li><li><div>whenever scheduling decision has to be made, OS chooses winning ticket randomly</div></li><li><div>processes can have multiple tickets & priorities</div></li><li><div>tickets can be traded between processes</div></li></ul></ul></ul><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><span style="-en-paragraph:true;">real time systems</span></div><ul><li><div>main concerns: -</div></li><ul><li><div>meeting deadlines - avoid using data</div></li><li><div>predictability - avoid quality degradation in multimedia systems</div></li></ul><li><div>soft real time vs hard real time (cannot miss any deadlines)</div></li><li><div>can consist of periodic and aperiodic tasks</div></li><li><div>schedules can be static (schedules are known in advance) or dynamic (make scheduling decisions during execution)</div></li><li><div>system with periodic tasks is schedulable when we can meet the deadlines</div></li></ul><div style="margin-left: 40px;"><img src="Scheduling.resources/AC15C36D-9C69-4389-B190-480FA7BACF45.png" height="455" width="984"/><br/></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Scheduling.resources/AC15C36D-9C69-4389-B190-480FA7BACF45.png b/content/os-notes/Scheduling/398c403baa8dc5f07716b89ffffeb916.png Binary files differ. diff --git a/content/os-notes/Scheduling.resources/643D34DE-D587-463D-86B8-F9A35F934198.png b/content/os-notes/Scheduling/40813826aa1c77152c749b7af5e45803.png Binary files differ. diff --git a/content/os-notes/Scheduling.resources/09C0D52D-0C2A-4E90-BF83-3EA6ACFC8CAC.png b/content/os-notes/Scheduling/f3c4b201c6a3c89bd40bba1da9eedef0.png Binary files differ. diff --git a/content/os-notes/Scheduling/index.md b/content/os-notes/Scheduling/index.md @@ -0,0 +1,92 @@ ++++ +title = 'Scheduling' ++++ +# Scheduling +![](f3c4b201c6a3c89bd40bba1da9eedef0.png) +if more processes ready than CPUs available: + +- scheduler decides which process to run next +- algorithm used by scheduler is called scheduling algorithm + +when to schedule? + +- process exits +- process blocks on IO or semaphore +- when new process is created +- when IO interrupt occurs +- when clock interrupt occurs + +scheduling #goals + +- different goals for batch, interactive, or real time systems +- for all systems: + - fairness: giving each process a fair share of CPU + - policy reinforcement: carrying out stated policy + - balance: keeping all parts of system busy + - throughput: maximise jobs per hour + - turaround time: minimise time between submission and termination + - CPU utilisation: keepin CPU busy *all the time* + +Batch scheduling algorithms: + +- First-Come First-Served (FIFO) + - process jobs in order of arrival + - non-preemptive + - single process Q: + - new jobs or blocking processes are added to end of Q + - can lead to "convoy effect" if only few CPU bound and many IO bound processes +- Shortest job first: + - pick job with the shortest run time + - provably optimal: lowest turnaround time + - of course, runtimes have to be known in advance + - may lead to starvation, if a lot of short-lived processes arrive then a long run-time process will never run + - highest-response-ratio-next -- improved version of shortestjob first + +Interactive scheduling: + +- response time -- respond to requests quickly +- proportionality -- meet users' expectations +- algorithms: + - round robin scheduling: + - preemptive algorithm + - each process gets a time slice (quantum) + - if process is still running at end of quantum, it gets preemted & goes to end of ready Q + - how big should that quantum be? depends on the implementation (lol when does it not depend) + - of course, you don't want most of the time to be context switching + - priority scheduling + - similar to round robin, but several ready Qs + - ![](40813826aa1c77152c749b7af5e45803.png) + - next process is picked from Q with highest priority + - static vs dynamic + - static priority: there is a single unchanging priority. often used as base priority level. + - dynamic priority: OS cleverly decides which processes should have higher priorities (e.g. if IO bound, should have higher priority than CPU bound) + - but can lead to priority inversion (e.g. Pathfinder). solutions: + - priority ceiling protocol: mutex gets priority assigned of the highest priority process that might lock the mutext + - priority inheritance protocol: high priority process blocks because low priority process hold mutex -> low priority process 'inherits' priority of blocked process + - random boosting: ready processes holding mutexes are temporarily (and randomly) boosted in priorities + - how to minimise response time for each priority Q? shortest process next: + - use shortest job first and try to best predict next running time + - form weighted average of previous running times of processes + - guaranteed scheduling + - N processes running -> each process gets 1/Nth of CPU time (aka fair-share) + - calc how much CPU time process might have gotten (time since process creation divided by N) + - measure actual consumed CPU time + - form ratio (e.g. 0.5 means process running for half the time it was entitled to) + - pick process with smallest ratio to run next + - lottery scheduling + - processes get lottery tickets + - whenever scheduling decision has to be made, OS chooses winning ticket randomly + - processes can have multiple tickets & priorities + - tickets can be traded between processes + +real time systems + +- main concerns: + - meeting deadlines - avoid using data + - predictability - avoid quality degradation in multimedia systems +- soft real time vs hard real time (cannot miss any deadlines) +- can consist of periodic and aperiodic tasks +- schedules can be static (schedules are known in advance) or dynamic (make scheduling decisions during execution) +- system with periodic tasks is schedulable when we can meet the deadlines + +![](398c403baa8dc5f07716b89ffffeb916.png) diff --git a/content/os-notes/Signal handling.html b/content/os-notes/Signal handling.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.633376002311707"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 16:04:05 +0000"/><meta name="latitude" content="52.33331298828125"/><meta name="longitude" content="4.866615317820796"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-14 15:38:32 +0000"/><title>Signal handling</title></head><body><div>types:</div><ul><li><div>hardware-induced (e.g SIGILL)<br/></div></li><li><div>software-induced (e.g SIGQUIT, SIGPIPE)</div></li></ul><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><br/></span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">actions (responses):</span></div><ul><li><div>Term (terminate), Ign (ignore), Core (terminate execution & OS core dump), Stop (stop execution, freeze process), Cont (continue, unfreeze processs)</div></li><li><div>Default action on per-signal basis, which is typically overridable (but some aren't, like SIGKILL)</div></li><li><div>signals can typically be blocked and actions delayed (but again, some exceptions)</div></li></ul><div style="-en-paragraph:true;"><span style="-en-paragraph:true;"><br/></span></div><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">catching signals:</span></div><ul><li><div>process registers signal handler</div></li><li><div>OS delivers signal, allows process to run handler</div></li><li><div>current execution context has to be saved/restored</div></li></ul><div><br/></div><div>diagram example:</div><div><br/></div><div style="-en-paragraph:true;"><img src="Signal%20handling.resources/1DC10193-BD5C-495D-BA4E-2D5E9A9BF185.png" height="755" width="816"/></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Signal handling.resources/1DC10193-BD5C-495D-BA4E-2D5E9A9BF185.png b/content/os-notes/Signal handling/039ab127aa97667446c4b511855273ca.png Binary files differ. diff --git a/content/os-notes/Signal handling/index.md b/content/os-notes/Signal handling/index.md @@ -0,0 +1,24 @@ ++++ +title = 'Signal handling' ++++ +# Signal handling +types: + +- hardware-induced (e.g SIGILL) +- software-induced (e.g SIGQUIT, SIGPIPE) + +actions (responses): + +- Term (terminate), Ign (ignore), Core (terminate execution & OS core dump), Stop (stop execution, freeze process), Cont (continue, unfreeze processs) +- Default action on per-signal basis, which is typically overridable (but some aren't, like SIGKILL) +- signals can typically be blocked and actions delayed (but again, some exceptions) + +catching signals: + +- process registers signal handler +- OS delivers signal, allows process to run handler +- current execution context has to be saved/restored + +diagram example: + +![](039ab127aa97667446c4b511855273ca.png) diff --git a/content/os-notes/Summary Operating Systems.pdf b/content/os-notes/Summary Operating Systems.pdf Binary files differ. diff --git a/content/os-notes/System calls.html b/content/os-notes/System calls.html @@ -1,8 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.414480566978455"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-11-08 08:11:28 +0000"/><meta name="latitude" content="52.33346557617188"/><meta name="longitude" content="4.866784463121516"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-11-09 12:35:16 +0000"/><title>System calls</title></head><body><ul><li><div>every process starts with 3 files open: stdin, stdout, stderr</div></li><li><div>steps:</div></li></ul><div><img src="System%20calls.resources/96F6E180-A92F-461F-B610-A089729C2D01.png" height="745" width="959"/></div><ul><li><div>what has to happen to print hello world to stdout?</div></li><ul><li><div>build process:</div></li></ul><div style="margin-left: 40px;"><img src="System%20calls.resources/67E58FA0-18D7-41DC-95D6-341B68F15454.png" height="615" width="546"/></div><ul><li><div>iteration 1 -</div></li></ul></ul><div><br/></div><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>#include <stdio.h></div><div> int main(int argc, char **argv) {</div><div> printf("Hello World!\n");</div><div> return 0; </div><div> }</div></div><div><br/></div><ul><ul><li><div>iteration 2 -</div><div><br/></div></li></ul></ul><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>#include <unistd.h></div><div>#define STDOUT 1</div><div>int main(int argc, char **argv) {</div><div><span> </span>char msg[] = "Hello World!\n";</div><div><span> </span>write(STDOUT, msg, sizeof(msg));</div><div> return 0;</div><div>}</div></div><div><br/></div><ul><ul><li><div>iteration 3 -</div><div><br/></div></li></ul></ul><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>#define _GNU_SOURCE</div><div>#include <sys/syscall.h></div><div>#define STDOUT 1</div><div>int main(int argc, char **argv) {</div><div><span> </span>char msg[] = "Hello World!\n”;</div><div><span> </span>int nr = SYS_write;</div><div><span> </span>syscall(nr, STDOUT, msg, sizeof(msg));</div><div><span> </span>return 0;</div><div>}</div></div><div><br/></div><ul><li><div>syscall diagram</div></li></ul><div style="margin-left: 40px;"><img src="System%20calls.resources/F197EB04-48B5-4FB8-9784-9576A5E5A442.png" height="526" width="625"/></div><ul><li><div>syscall (x86 Linux) is triggered by instruction (like 0x80):</div></li><ul><li><div>privilege level changed to kernel mode</div></li><li><div>program counter set to specific location</div></li><li><div>arguments passed in registers: -</div></li><ul><li><div>rax <- syscall number</div></li><li><div>ebx, ecdx, edx, esi, edi, ebp <- arguments</div></li><li><div>stack <- more arguments</div></li></ul><li><div>x86-64 supports legacy int 0x80, new instruction syscall -</div></li><ul><li><div>rax <- syscall number (different from 32bit)</div></li><li><div>rdi, rsi, rdx, r10, r8, r9 <- arguments</div></li></ul></ul><li><div>hello world without glibc -- manual system calls, in-line assembly:</div></li></ul><div><br/></div><div style="box-sizing: border-box; padding: 8px; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 12px; color: rgb(51, 51, 51); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background-color: rgb(251, 250, 248); border: 1px solid rgba(0, 0, 0, 0.14902);-en-codeblock:true;"><div>ssize_t write(int fd, const void *buf, size_t nbytes) {</div><div> ssize_t ret;</div><div> asm volatile</div><div> (</div><div><span> <span> <span> <span> </span></span></span></span>/* request syscall to OS (can also be ‘int $0x80’) */</div><div> “syscall”</div><div> </div><div><span> <span> <span> <span> /* return result in %eax */</span></span></span></span><br/></div><div><span><span><span/></span></span><span> <span> <span> <span> </span></span></span></span>: "=a" (ret)</div><div> </div><div><span> <span> <span> <span> </span></span></span></span>/* __NR_write (1) into same place as operand 0, fd into %rdi, buffer into %rsi, length into %rdx */</div><div><span> <span> <span> <span> </span></span></span></span>: "0" (__NR_write), "D"(fd), "S"(buf), "d"(nbytes)</div><div><br/></div><div><span> <span> <span> <span> /* modified cc, registers %rcx and %r11, and memory */</span></span></span></span><br/></div><div> : "cc", "rcx", "r11", "memory"</div><div> );</div><div> return ret;</div><div> }</div></div><div><br/></div><ul><ul><li><div>actual objdump of this program</div></li></ul></ul><div><img src="System%20calls.resources/5724BA52-D9D9-4370-BDCC-004143484C9C.png" height="288" width="582"/></div><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/System calls.resources/96F6E180-A92F-461F-B610-A089729C2D01.png b/content/os-notes/System calls/3008f702882cb6bfcab5abd16d53719a.png Binary files differ. diff --git a/content/os-notes/System calls.resources/67E58FA0-18D7-41DC-95D6-341B68F15454.png b/content/os-notes/System calls/5b26d2fb6045ae318661888ee0febe89.png Binary files differ. diff --git a/content/os-notes/System calls.resources/F197EB04-48B5-4FB8-9784-9576A5E5A442.png b/content/os-notes/System calls/7398350cc1064e43b0aa996f2fa2b199.png Binary files differ. diff --git a/content/os-notes/System calls.resources/5724BA52-D9D9-4370-BDCC-004143484C9C.png b/content/os-notes/System calls/d9c7a810130c9f8fd2152d4c2c1e48b5.png Binary files differ. diff --git a/content/os-notes/System calls/index.md b/content/os-notes/System calls/index.md @@ -0,0 +1,89 @@ ++++ +title = 'System calls' ++++ +# System calls + +- every process starts with 3 files open: stdin, stdout, stderr +- steps: + +![](3008f702882cb6bfcab5abd16d53719a.png) + +- what has to happen to print hello world to stdout? + - build process: + + ![](5b26d2fb6045ae318661888ee0febe89.png) + + - iteration 1 + + ```c + #include <stdio.h> + int main(int argc, char **argv) { + printf("Hello World!\n"); + return 0; + } + ``` + + - iteration 2 + + ```c + #include <unistd.h> + #define STDOUT 1 + int main(int argc, char **argv) { + char msg[] = "Hello World!\n"; + write(STDOUT, msg, sizeof(msg)); + return 0; + } + ``` + + - iteration 3 + ```c + #define _GNU_SOURCE + #include <sys/syscall.h> + #define STDOUT 1 + int main(int argc, char **argv) { + char msg[] = "Hello World!\n”; + int nr = SYS_write; + syscall(nr, STDOUT, msg, sizeof(msg)); + return 0; + } + ``` +- syscall diagram + + ![](7398350cc1064e43b0aa996f2fa2b199.png) + +- syscall (x86 Linux) is triggered by instruction (like 0x80): + - privilege level changed to kernel mode + - program counter set to specific location + - arguments passed in registers: + - rax <- syscall number + - ebx, ecdx, edx, esi, edi, ebp <- arguments + - stack <- more arguments + - x86-64 supports legacy int 0x80, new instruction syscall + - rax <- syscall number (different from 32bit) + - rdi, rsi, rdx, r10, r8, r9 <- arguments +- hello world without glibc -- manual system calls, in-line assembly: + + ```c + ssize_t write(int fd, const void *buf, size_t nbytes) { + ssize_t ret; + asm volatile + ( + /* request syscall to OS (can also be ‘int $0x80’) */ + “syscall” + + /* return result in %eax */ + : "=a" (ret) + + /* __NR_write (1) into same place as operand 0, fd into %rdi, buffer into %rsi, length into %rdx */ + + : "0" (__NR_write), "D"(fd), "S"(buf), "d"(nbytes) + + /* modified cc, registers %rcx and %r11, and memory */ + : "cc", "rcx", "r11", "memory" + ); + return ret; + } + ``` + - actual objdump of this program + + ![](d9c7a810130c9f8fd2152d4c2c1e48b5.png) diff --git a/content/os-notes/The Memory Address Space.html b/content/os-notes/The Memory Address Space.html @@ -1,6 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.208047389984131"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-18 23:31:44 +0000"/><meta name="latitude" content="52.30032348632812"/><meta name="longitude" content="4.988162358630428"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-18 23:32:00 +0000"/><title>The Memory Address Space</title></head><body><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">introduce abstraction of address space - every program runs in its own address space</span></div><ul><li><div>base register operates dynamic relocation, controls where address space starts</div></li><li><div>limit register decides maximum address in physical memory that you can access</div></li></ul><div><br/></div><div>how memory works with many programs:</div><ul><li><div>processes compete for memory partitions</div></li><li><div>but how do you know the size of partitions? use dynamic partitions and swapping</div><div><br/></div><div><img src="The%20Memory%20Address%20Space.resources/screenshot.png" height="264" width="590"/></div></li></ul><div><br/></div><ul><ul><li><div>problem: swapping may lead to memory fragmentation (when you have a many separate small chunks of free memory)</div></li><ul><li><div>solution: memory compaction</div></li></ul><li><div>problem - need to allow extra room for process growth</div></li><ul><li><div>how much extra room? memory usage vs out-of-memory (OOM) risk</div></li><li><div>what do when OOM (out of mana/memory)? -</div></li><ul><li><div>kill process</div></li><li><div>relocate process</div></li><li><div>swap out</div></li></ul></ul></ul></ul><div><br/></div><div>memory management -</div><ul><li><div>which part of memory is allocated?</div></li><ul><li><div>divide memory in blocks (e.g. each block is 4 bytes)</div></li><li><div>keep track of them using:</div></li><ul><li><div>bitmap</div></li><ul><li><div>divide memory in blocks. every bit in bitmap corresponds to a byte in memory. bit is 1 when the memory is allocated, 0 when it's free.</div></li><li><div>in bitmap, allocation is super fast. but to find free memory, you need to slowly scan for a hole.</div></li></ul><li><div>linked list of unallocated memory</div></li><ul><li><div>allocation is slow af, so is deallocation</div></li><li><div>holes sorted by address for fast coalescing</div></li></ul></ul></ul><li><div>how do you allocate?</div></li><ul><li><div>first fit: take first fitting hole (MINIX 3). simplest option.</div></li><li><div>next fit: take next fitting hole. slower than first fit in practice.</div></li><li><div>best fit: take best fitting hole. prone to fragmentation.</div></li><li><div>worst fit: take worst fitting hole. poor performance in practice.</div></li><li><div>quick fit: keep holes of different sizes. poor coalescing performance.</div></li><li><div>buddy allocation scheme: improve quick fit's coalescing performance (Linux uses this). -</div></li><ul><li><div>originally - area of 64 "chunks" (a unit chunk is whatever you want, on UNIX it's 4 kb)</div></li><li><div>maintain <span style="font-style: italic;">n</span> lists, one per size class</div></li><li><div>you split on powers of 2 until you get the right size</div></li><li><div>how do you find your buddies? shout eyyyyy buddy</div></li></ul></ul></ul><div/></body></html>- \ No newline at end of file diff --git a/content/os-notes/The Memory Address Space.resources/screenshot.png b/content/os-notes/The Memory Address Space/3c804f4ae4f360f7be9a2b2d47f73b67.png Binary files differ. diff --git a/content/os-notes/The Memory Address Space/index.md b/content/os-notes/The Memory Address Space/index.md @@ -0,0 +1,47 @@ ++++ +title = 'The Memory Address Space' ++++ +# The Memory Address Space +introduce abstraction of address space - every program runs in its own address space + +- base register operates dynamic relocation, controls where address space starts +- limit register decides maximum address in physical memory that you can access + +how memory works with many programs: + +- processes compete for memory partitions +- but how do you know the size of partitions? use dynamic partitions and swapping + + ![screenshot.png](3c804f4ae4f360f7be9a2b2d47f73b67.png) + + - problem: swapping may lead to memory fragmentation (when you have a many separate small chunks of free memory) + - solution: memory compaction + - problem - need to allow extra room for process growth + - how much extra room? memory usage vs out-of-memory (OOM) risk + - what do when OOM (out of mana/memory)? + - kill process + - relocate process + - swap out + +memory management + +- which part of memory is allocated? + - divide memory in blocks (e.g. each block is 4 bytes) + - keep track of them using: + - bitmap + - divide memory in blocks. every bit in bitmap corresponds to a byte in memory. bit is 1 when the memory is allocated, 0 when it's free. + - in bitmap, allocation is super fast. but to find free memory, you need to slowly scan for a hole. + - linked list of unallocated memory + - allocation is slow af, so is deallocation + - holes sorted by address for fast coalescing +- how do you allocate? + - first fit: take first fitting hole (MINIX 3). simplest option. + - next fit: take next fitting hole. slower than first fit in practice. + - best fit: take best fitting hole. prone to fragmentation. + - worst fit: take worst fitting hole. poor performance in practice. + - quick fit: keep holes of different sizes. poor coalescing performance. + - buddy allocation scheme: improve quick fit's coalescing performance (Linux uses this). + - originally - area of 64 "chunks" (a unit chunk is whatever you want, on UNIX it's 4 kb) + - maintain *n* lists, one per size class + - you split on powers of 2 until you get the right size + - how do you find your buddies? shout eyyyyy buddy diff --git a/content/os-notes/Threads.html b/content/os-notes/Threads.html @@ -1,4 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.633376002311707"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 16:03:12 +0000"/><meta name="latitude" content="52.33331298828125"/><meta name="longitude" content="4.866615317820796"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-14 15:44:59 +0000"/><title>Threads</title></head><body><div><b>multithreaded execution: 1 process, N threads of execution</b><br/></div><ul><li><div>lightweight processes, allow space and time efficient parallelism</div></li><li><div>organised in thread groups, allow simple communication and synchronisation</div></li></ul><div style="margin-top: 1em; margin-bottom: 1em; margin-left: 40px;-en-paragraph:true;"><img src="Threads.resources/4B0A8CCE-DB68-4C2A-AB82-391EC4A0720F.png" height="564" width="1172"/><br/></div><ul><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">threads are in the same address space of a single process </span></div></li><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">all info exchange done via data shared bet</span><span style="-en-paragraph:true;">ween threads </span></div></li><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">threads synchronise via simple primitives </span></div></li><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">each thread has its own stack, hardware registers, and state </span></div></li><li><div style="-en-paragraph:true;"><span style="-en-paragraph:true;">each thread may call any OS-supported syscall on behalf of the process to which it belongs</span></div></li></ul><div style="-en-paragraph:true;"><b><br/></b></div><div style="-en-paragraph:true;"><b>User threads - pros and cons</b><br/></div><ul><li><div>(+) thread switching time is quicker, as no mode switch</div></li><li><div>(+) scalability customizability (since no in-kernel management)</div></li><li><div>(-) parallelism (blocking syscalls are problematic)</div></li><li><div>(-) transparency (typically needs app cooperation)</div></li></ul><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><img src="Threads.resources/59DC4D4C-51F2-4152-881F-46709D0B0FBE.png" height="264" width="600"/><br/></div><div><font face="Helvetica Neue"><b><br/></b></font></div><div><font face="Helvetica Neue"><b>Hybrid implementations of threads:</b></font></div><ul><li><div>thread multiplexing: one kernel thread runs everything</div></li><ul><li><div>virtualises N user threads over 1 kernel thread</div></li><li><div>mitigates transparency and parallelization issues</div></li></ul><li><div>scheduler activation: scheduler takes care of blocking/unblocking</div></li><ul><li><div>upcalls notify userland of blocking/unblocking events</div></li><li><div>userland schedules thread accordingly</div></li></ul><li><div>pop-up threads: kernel takes care of creating threads</div></li><ul><li><div>kernel spawns new thread in response to event</div></li><li><div>thread may run at kernel or user level</div></li></ul></ul><div><font face="Courier"><br/></font></div><div><font face="Helvetica Neue"><b>POSIX threads (Pthreads)</b></font></div><div><font face="Helvetica Neue">pthread is a library interface. names may not always be the same.</font></div><div style="margin-top: 1em;margin-bottom: 1em;-en-paragraph:true;"><img src="Threads.resources/0535A578-FC28-4B35-9C61-93A76C65980D.png" height="216" width="654"/><br/></div><div><font face="Helvetica Neue"><b>Inter-process communication (IPC)</b></font></div><ul><li><div>Processes need some way to communicate - to share data during execution</div></li><li><div>No explicit cross-process sharing, data must be normally exchanged between processes</div></li><li><div>Processes need a way to synchronise -</div></li><ul><li><div>to account for dependencies</div></li><li><div>to avoid them interfering with each other</div></li></ul></ul><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/Threads.resources/4B0A8CCE-DB68-4C2A-AB82-391EC4A0720F.png b/content/os-notes/Threads/68f6da63d8006625153c23148f477ac3.png Binary files differ. diff --git a/content/os-notes/Threads.resources/0535A578-FC28-4B35-9C61-93A76C65980D.png b/content/os-notes/Threads/79ff59028342a53fae33cbd60a02dcf7.png Binary files differ. diff --git a/content/os-notes/Threads.resources/59DC4D4C-51F2-4152-881F-46709D0B0FBE.png b/content/os-notes/Threads/fe2bd8611012458563c85e4d08b7b87f.png Binary files differ. diff --git a/content/os-notes/Threads/index.md b/content/os-notes/Threads/index.md @@ -0,0 +1,51 @@ ++++ +title = 'Threads' ++++ +# Threads +## Multithreaded execution: 1 process, N threads of execution + +- lightweight processes, allow space and time efficient parallelism +- organised in thread groups, allow simple communication and synchronisation + +![](68f6da63d8006625153c23148f477ac3.png) + +- threads are in the same address space of a single process +- all info exchange done via data shared between threads +- threads synchronise via simple primitives +- each thread has its own stack, hardware registers, and state +- each thread may call any OS-supported syscall on behalf of the process to which it belongs + +## User threads - pros and cons + +- (+) thread switching time is quicker, as no mode switch +- (+) scalability customizability (since no in-kernel management) +- (-) parallelism (blocking syscalls are problematic) +- (-) transparency (typically needs app cooperation) + +![](fe2bd8611012458563c85e4d08b7b87f.png) +** +** +## Hybrid implementations of threads: + +- thread multiplexing: one kernel thread runs everything + - virtualises N user threads over 1 kernel thread + - mitigates transparency and parallelization issues +- scheduler activation: scheduler takes care of blocking/unblocking + - upcalls notify userland of blocking/unblocking events + - userland schedules thread accordingly +- pop-up threads: kernel takes care of creating threads + - kernel spawns new thread in response to event + - thread may run at kernel or user level + +## POSIX threads (Pthreads) +pthread is a library interface. names may not always be the same. + +![](79ff59028342a53fae33cbd60a02dcf7.png) + +## Inter-process communication (IPC) + +- Processes need some way to communicate - to share data during execution +- No explicit cross-process sharing, data must be normally exchanged between processes +- Processes need a way to synchronise + - to account for dependencies + - to avoid them interfering with each other diff --git a/content/os-notes/Virtual memory.html b/content/os-notes/Virtual memory.html @@ -1,16 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.637205004692078"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 15:23:12 +0000"/><meta name="latitude" content="52.33337587840298"/><meta name="longitude" content="4.866682208677656"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-18 23:53:16 +0000"/><title>Virtual memory</title></head><body><div style="-en-paragraph:true;">problem: so far memory can only be given to processes in contiguous pieces</div><div>solution: another level of abstraction!</div><ul><li><div>divide physical memory and program (virtual) memory into pages of fixed size (typically 4 KB)</div></li><li><div>give program a number of virtual pages</div></li><li><div>translate virtual pages into physical pages (frames)</div></li></ul><div><br/></div><div>MMU (memory management unit)translation between virtual memory address and the physical memory address</div><ul><li><div>size of memory address space depends on architecture</div></li><li><div>OS decides how to map page tables</div></li><li><div>no need to translate offset</div></li><li><div>create illusion that every process has unlimited memory</div></li><li><div>page tables contain pages</div></li><ul><li><div>x86 page table entry -</div></li><ul><li><div>4kb page size</div></li><li><div>32-bit physical and virtual address space</div></li><li><div>32-bit page table entries (PTEs)</div></li><li><div><img src="Virtual%20memory.resources/7ED1AE58-4A6D-4429-8B9F-7AB9D701C29D.png" height="118" width="589"/></div></li><li><div><img src="Virtual%20memory.resources/77402436-57E3-4E00-A911-CF3535F85C86.png" height="200" width="293"/></div></li></ul></ul></ul><div><br/></div><div>but this leads to a lot of wasted memory, we are keeping around page table entries that aren’t used or don’t have many meaningful values</div><div><br/></div><div>so we use sparse data structures, break up the page tables (and you can legit go crazy with this)</div><div><br/></div><div>two-level page tables (x86) -</div><ul><li><div>CR3 register points to top of page table hierarchy</div></li><li><div>page tables are 'walked' by hardware MMU</div></li><li><div>first 10 bits of page frame address -- field in page directory (top-level page table). this gives the resultant page</div></li><li><div>second 10 bits of page frame address -- field in secondary table</div></li></ul><div style="margin-left: 40px;"><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/C5B5181F-70BD-4C14-A0EE-0C5E48F3965B.png" height="753" width="951"/><br/></div><div><br/></div><div>four-level page tables (x86-64) -</div><div><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/6572460B-737A-40D1-BE1F-83EBF433B894.png" height="623" width="1046"/><br/></div><div style="margin-left: 40px;"><br/></div><div>inverted page tables (IA-64) -</div><div><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/6357993F-0E12-4F10-A8B4-BB5FEB6A8D4C.png" height="346" width="604"/><br/></div><div><br/></div><div><br/></div><div>with page tables, MMU has to translate every single memory access, leads to lower performance.</div><div>try caching previous translations in a TLB and praying to God for locality.</div><div><br/></div><div><span style="font-weight: bold;">Translation Lookaside Buffer (TLB): -</span></div><ul><li><div>contains translation info for one single memory address space (process)</div></li><li><div>1 TLB per CPU, sometimes with multiple levels</div></li></ul><div><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/AD64BCE7-52D7-4307-9064-6C7621266A3A.png" height="279" width="594"/><br/></div><div style="margin-left: 40px;"><br/></div><ul><li><div>flush TLB when entries change, context switch (unless you tag with PID)</div></li><li><div>when to expect many TLB misses? after flushes, and when processes have a lack of locality</div></li><li><div>software vs hardware-managed TLB</div></li><ul><li><div>hardware - efficiency: hardware walks page tables and fills TLB</div></li><li><div>OS - flexibility: e.g. OS may preload TLB entries that it expects to need later</div></li></ul><li><div>TLB miss handling: -</div></li><ul><li><div>walk page tables to find mapping</div></li><li><div>if mapping found, fill new TLB entry (soft miss)</div></li><li><div>if mapping not found, page fault (hard miss)</div></li><li><div>OS handles page faults (similar to interrupts): -</div></li><ul><li><div>if access violation: segfault</div></li><li><div>if legal access, page fault handler needs to fix up page tables: -</div></li><ul><li><div>page is already in memory (minor page fault)</div></li><li><div>page must be fetched from disk (major page fault)</div></li></ul></ul></ul><li><div>page replacement -</div></li><ul><li><div>computer might use more virtual memory than it has physical memory</div></li><li><div>paging creates illusion of unlimited memory available to user processes</div></li><li><div>when logical page is not in memory (swapped out to file/partition), the OS has to page it in on a page fault</div></li><li><div>but because memory is not actually unlimited, the new page has to replace an old page</div></li><li><div>how should we swap out?</div></li><ul><li><div>optimal: -</div></li><ul><li><div>replace page that will be referenced as far in the future as possible</div></li><li><div>can this algorithm be implemented in practice? maybe? with a deterministic workload, you can profile it.</div></li></ul><li><div>random: -</div></li><ul><li><div>just replace a page at random</div></li></ul><li><div>not recently used (NRU): -</div></li><ul><li><div>periodically clear R/M bit for all pages</div></li><li><div>replace page at random, but prioritize those with R=0, M=0</div></li></ul><li><div>FIFO: -</div></li><ul><li><div>build queue of faulting pages and replace the head</div></li><li><div>unfortunately, oldest page might still be useful</div></li></ul></ul></ul></ul><br/></body></html>- \ No newline at end of file diff --git a/content/os-notes/Virtual memory.resources/C5B5181F-70BD-4C14-A0EE-0C5E48F3965B.png b/content/os-notes/Virtual memory/198c95a2a7f69c555676baa912028e4b.png Binary files differ. diff --git a/content/os-notes/Virtual memory.resources/6572460B-737A-40D1-BE1F-83EBF433B894.png b/content/os-notes/Virtual memory/3988f921a66ef19047b780a13ae1e66a.png Binary files differ. diff --git a/content/os-notes/Virtual memory.resources/6357993F-0E12-4F10-A8B4-BB5FEB6A8D4C.png b/content/os-notes/Virtual memory/4608c2300f7ddffc335b3ae18a963ef5.png Binary files differ. diff --git a/content/os-notes/Virtual memory.resources/7ED1AE58-4A6D-4429-8B9F-7AB9D701C29D.png b/content/os-notes/Virtual memory/4dd5cb94db2dbf5b80b56a635895825e.png Binary files differ. diff --git a/content/os-notes/Virtual memory.resources/AD64BCE7-52D7-4307-9064-6C7621266A3A.png b/content/os-notes/Virtual memory/557379a5f0c8c9909a650253e0f18be4.png Binary files differ. diff --git a/content/os-notes/Virtual memory.resources/77402436-57E3-4E00-A911-CF3535F85C86.png b/content/os-notes/Virtual memory/fd03668ada38d2f73122f7619b9b343f.png Binary files differ. diff --git a/content/os-notes/Virtual memory/index.md b/content/os-notes/Virtual memory/index.md @@ -0,0 +1,88 @@ ++++ +title = 'Virtual memory' ++++ +# Virtual memory +problem: so far memory can only be given to processes in contiguous pieces +solution: another level of abstraction! + +- divide physical memory and program (virtual) memory into pages of fixed size (typically 4 KB) +- give program a number of virtual pages +- translate virtual pages into physical pages (frames) + +MMU (memory management unit) translation between virtual memory address and the physical memory address + +- size of memory address space depends on architecture +- OS decides how to map page tables +- no need to translate offset +- create illusion that every process has unlimited memory +- page tables contain pages + - x86 page table entry + - 4kb page size + - 32-bit physical and virtual address space + - 32-bit page table entries (PTEs) + - ![](4dd5cb94db2dbf5b80b56a635895825e.png) + - ![](fd03668ada38d2f73122f7619b9b343f.png) + +but this leads to a lot of wasted memory, we are keeping around page table entries that aren’t used or don’t have many meaningful values + +so we use sparse data structures, break up the page tables (and you can legit go crazy with this) + +two-level page tables (x86) + +- CR3 register points to top of page table hierarchy +- page tables are 'walked' by hardware MMU +- first 10 bits of page frame address -- field in page directory (top-level page table). this gives the resultant page +- second 10 bits of page frame address -- field in secondary table + +![](198c95a2a7f69c555676baa912028e4b.png) + +four-level page tables (x86-64) + +![](3988f921a66ef19047b780a13ae1e66a.png) + +inverted page tables (IA-64) + +![](4608c2300f7ddffc335b3ae18a963ef5.png) + +with page tables, MMU has to translate every single memory access, leads to lower performance. + +try caching previous translations in a TLB and praying to God for locality. + +## Translation Lookaside Buffer (TLB): + +- contains translation info for one single memory address space (process) +- 1 TLB per CPU, sometimes with multiple levels + +![](557379a5f0c8c9909a650253e0f18be4.png) + +- flush TLB when entries change, context switch (unless you tag with PID) +- when to expect many TLB misses? after flushes, and when processes have a lack of locality +- software vs hardware-managed TLB + - hardware - efficiency: hardware walks page tables and fills TLB + - OS - flexibility: e.g. OS may preload TLB entries that it expects to need later +- TLB miss handling: + - walk page tables to find mapping + - if mapping found, fill new TLB entry (soft miss) + - if mapping not found, page fault (hard miss) + - OS handles page faults (similar to interrupts): + - if access violation: segfault + - if legal access, page fault handler needs to fix up page tables: + - page is already in memory (minor page fault) + - page must be fetched from disk (major page fault) +- page replacement + - computer might use more virtual memory than it has physical memory + - paging creates illusion of unlimited memory available to user processes + - when logical page is not in memory (swapped out to file/partition), the OS has to page it in on a page fault + - but because memory is not actually unlimited, the new page has to replace an old page + - how should we swap out? + - optimal: + - replace page that will be referenced as far in the future as possible + - can this algorithm be implemented in practice? maybe? with a deterministic workload, you can profile it. + - random: + - just replace a page at random + - not recently used (NRU): + - periodically clear R/M bit for all pages + - replace page at random, but prioritize those with R=0, M=0 + - FIFO: + - build queue of faulting pages and replace the head + - unfortunately, oldest page might still be useful diff --git a/content/os-notes/What is an OS_.html b/content/os-notes/What is an OS_.html @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="276.5"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-11-09 12:39:44 +0000"/><meta name="latitude" content="50.11346435546875"/><meta name="longitude" content="14.33735006690641"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-03 15:34:28 +0000"/><title>What is an OS?</title></head><body><div><span style="font-weight: bold;">Some history</span></div><div>Batch systems: one job at a time</div><div>Multiprogrammed systems: store multiple jobs in memory, with an operating system that schedules, allocates, multiplexes. but one job after another, with a lot of waiting.</div><div>Time sharing: single CPU can be passed between jobs, multitasking, illusion of parallelism.</div><div><br/></div><div>You are not expected to understand this.</div><div><br/></div><div><span style="font-weight: bold;">What is an OS?</span></div><div>Kernel vs user mode:</div><div><img src="What%20is%20an%20OS_.resources/Screenshot%202018-11-09%20at%2013.47.46.png" height="460" width="794"/><br/></div><div><br/></div><div><br/></div><div>OS is an extended machine — it extends & abstracts over hardware functionality</div><div>OS is resource manager — protects unsafe use of resources, accounting/limiting</div><ul><li><div>offers functionality through syscalls</div></li><li><div>groups of syscalls offer services</div></li><li><div>processes are abstractions to create user’s program</div></li><li><div>each program/process has its own address space</div></li><li><div>data is in files, these persist over processes</div></li></ul><div><br/></div><div><span style="font-weight: bold;">Processes</span></div><ul><li><div>process represents instance of a program in execution, with a name</div></li><li><div>memory address spaces limit programs to a specific part of memory. layout depends. </div></li></ul><div><br/></div><div><span style="font-weight: bold;">Memory address space</span></div><div>very basic layout is stack (frames for function calls), data (variables), text (program code)</div><div style="margin-left: 40px;"><img src="What%20is%20an%20OS_.resources/Screenshot%202018-11-09%20at%2013.46.40.png" height="391" width="424"/><br/></div><div><br/></div><div><span style="font-weight: bold;">Structure of the OS</span></div><div><br/></div><div><img src="What%20is%20an%20OS_.resources/screenshot.png" height="368" width="763"/><br/></div><div><br/></div><div><span style="font-weight: bold;">System calls</span></div><ul><li><div>interface offered by OS to apps for service requests</div></li><li><div>interface depends on OS and hardware, so encapsulate syscall logic in libc (POSIX standard)</div></li></ul><div/><div><br/></div></body></html>- \ No newline at end of file diff --git a/content/os-notes/What is an OS_.resources/Screenshot 2018-11-09 at 13.47.46.png b/content/os-notes/What is an OS_/640094eca22df17a21654aee082f8780.png Binary files differ. diff --git a/content/os-notes/What is an OS_.resources/screenshot.png b/content/os-notes/What is an OS_/c4fe4a6540f1d2f4165a65943a276a74.png Binary files differ. diff --git a/content/os-notes/What is an OS_.resources/Screenshot 2018-11-09 at 13.46.40.png b/content/os-notes/What is an OS_/cbc52110fdee2f262361bdee3d6f21ba.png Binary files differ. diff --git a/content/os-notes/What is an OS_/index.md b/content/os-notes/What is an OS_/index.md @@ -0,0 +1,45 @@ ++++ +title = 'What is an OS?' ++++ +# What is an OS? +## Some history +Batch systems: one job at a time + +Multiprogrammed systems: store multiple jobs in memory, with an operating system that schedules, allocates, multiplexes. but one job after another, with a lot of waiting. + +Time sharing: single CPU can be passed between jobs, multitasking, illusion of parallelism. + +You are not expected to understand this. + +## What is an OS? +Kernel vs user mode: +![Screenshot 2018-11-09 at 13.47.46.png](640094eca22df17a21654aee082f8780.png) + +OS is an extended machine — it extends & abstracts over hardware functionality +OS is resource manager — protects unsafe use of resources, accounting/limiting + +- offers functionality through syscalls +- groups of syscalls offer services +- processes are abstractions to create user’s program +- each program/process has its own address space +- data is in files, these persist over processes + +## Processes + +- process represents instance of a program in execution, with a name +- memory address spaces limit programs to a specific part of memory. layout depends. + +## Memory address space + +very basic layout is stack (frames for function calls), data (variables), text (program code) + +![Screenshot 2018-11-09 at 13.46.40.png](cbc52110fdee2f262361bdee3d6f21ba.png) + +## Structure of the OS + +![screenshot.png](c4fe4a6540f1d2f4165a65943a276a74.png) + +## System calls + +- interface offered by OS to apps for service requests +- interface depends on OS and hardware, so encapsulate syscall logic in libc (POSIX standard) diff --git a/content/os-notes/_index.md b/content/os-notes/_index.md @@ -0,0 +1,31 @@ ++++ +title = 'Operating Systems' ++++ +# Operating Systems +1. Introduction + 1. [What is an OS?](what-is-an-os) + 2. [Files](files-highlevel) + 3. [Kernels](kernels) + 4. [System calls](system-calls) + 5. [Process management](process-management) +2. Processes & Threads + 1. [Process model](process-model) + 2. [Interrupt Handling & Scheduling](interrupt-handling-scheduling) + 3. [Signal handling](signal-handling) + 4. [Threads](threads) + 5. [Race conditions & mutual exclusion](race-conditions-mutual-exclusion) + 6. [Scheduling](scheduling) +3. Memory Management + 1. [Basics of memory: no abstraction](basics-of-memory-no-abstraction) + 2. [The Memory Address Space](the-memory-address-space) + 3. [Virtual memory](virtual-memory) + 4. [Design issues](design-issues) +4. File systems + 1. [Files](files) + 2. [File system layout](file-system-layout) + 3. [Reliability & Performance](reliability-performance) +5. I/O + 1. [Principles of IO hardware](principles-of-io-hardware) + 2. [Principles of IO software](principles-of-io-software) + 3. [Devices](devices) +6. [Deadlocks](deadlocks) diff --git a/content/os-notes/files-highlevel.md b/content/os-notes/files-highlevel.md @@ -0,0 +1,28 @@ ++++ +title = 'Files' ++++ +# Files + +Files: +- abstraction of storage device (possibly real) +- can read/write +- starts at root dir “/" +- absolute (/Users/user/Documents) and relative (../Documents) paths + +File permissions: + +- throuh bit permission tuples (read-write-execute) +- x for directories is something like ‘cd' + +Special files: + +- everything is a file (descriptor) +- hardware: block (disks), character (serial port) +- symlinks to link to other files +- named/anonymous FIFOS (sockets/pipes) + +Pipes: + +- pseudofiles for processes to communicate over FIFO channel +- has to be set up in advance +- looks like a “normal” file+ \ No newline at end of file diff --git a/content/os-notes/index.html b/content/os-notes/index.html @@ -1,122 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> -<html> - -<head><link rel="stylesheet" href="sitewide.css"> - <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> - <meta name="exporter-version" content="Evernote Mac 7.6 (457297)" /> - <meta name="altitude" content="-1.438530921936035" /> - <meta name="author" content="Alex Balgavy" /> - <meta name="created" content="2018-12-03 13:24:38 +0000" /> - <meta name="latitude" content="52.3333740234375" /> - <meta name="longitude" content="4.866623084957805" /> - <meta name="source" content="desktop.mac" /> - <meta name="updated" content="2018-12-19 00:28:44 +0000" /> - <title>TOC Operating Systems</title> -</head> - -<body> - <nav> -<a href="http://thezeroalpha.github.io">Homepage</a> -</nav> - - <h1>Operating Systems Notes</h1> - <h3>Alex Balgavy, <a href="https://github.com/thezeroalpha/os-notes">Github repo</a></h3> - <div><p>Here's a <a href="https://docs.google.com/document/d/16mpIPqxGaIglM0NQz3-7jIfv8WbklmG5wyRmk6PSprI/edit">summary document</a> (very detailed & quite long, 79 pages). <a href="Summary Operating Systems.pdf">PDF link in case the doc is gone.</a></p></div> - <ol> - <li> - <div>Introduction</div> - </li> - <ol> - <li> -<div><a href="What is an OS_.html">What is an OS?</a></div> - </li> - <li> -<div><a href="Files.html">Files</a></div> - </li> - <li> -<div><a href="Kernels.html">Kernels</a></div> - </li> - <li> -<div><a href="System calls.html">System calls</a></div> - </li> - <li> -<div><a href="Process management.html">Process management</a></div> - </li> - </ol> - <li> - <div>Processes & Threads</div> - </li> - <ol> - <li> -<div><a href="Process model.html">Process model</a></div> - </li> - <li> -<div><a href="Interrupt Handling & Scheduling.html">Interrupt Handling & Scheduling</a></div> - </li> - <li> -<div><a href="Signal handling.html">Signal handling</a></div> - </li> - <li> -<div><a href="Threads.html">Threads</a></div> - </li> - <li> -<div><a href="Race conditions & mutual exclusion.html">Race conditions & mutual exclusion</a></div> - </li> - <li> -<div><a href="Scheduling.html">Scheduling</a></div> - </li> - </ol> - <li> - <div>Memory Management</div> - </li> - <ol> - <li> -<div><a href="Basics of memory: no abstraction.html">Basics of memory: no abstraction</a></div> - </li> - <li> -<div><a href="The Memory Address Space.html">The Memory Address Space</a></div> - </li> - <li> -<div><a href="Virtual memory.html">Virtual memory</a></div> - </li> - <li> -<div><a href="Design issues.html">Design issues</a></div> - </li> - </ol> - <li> - <div>File systems</div> - </li> - <ol> - <li> -<div><a href="Files.html">Files</a></div> - </li> - <li> -<div><a href="File system layout.html">File system layout</a></div> - </li> - <li> -<div><a href="Reliability & Performance.html">Reliability & Performance</a></div> - </li> - </ol> - <li> - <div>I/O</div> - </li> - <ol> - <li> -<div><a href="Principles of IO hardware.html">Principles of IO hardware</a></div> - </li> - <li> -<div><a href="Principles of IO software.html">Principles of IO software</a></div> - </li> - <li> -<div><a href="Devices.html">Devices</a></div> - </li> - </ol> - <li> -<div><a href="Deadlocks.html">Deadlocks</a></div> - </li> - </ol> - <div><br /></div> -</body> - -</html> diff --git a/content/os-notes/sitewide.css b/content/os-notes/sitewide.css @@ -1,34 +0,0 @@ -@charset 'UTF-8'; -@font-face{font-family:'FontAwesome';src:url('font/fontawesome-webfont.eot?v=4.0.1');src:url('font/fontawesome-webfont.eot?#iefix&v=4.0.1') format('embedded-opentype'),url('font/fontawesome-webfont.woff?v=4.0.1') format('woff'),url('font/fontawesome-webfont.ttf?v=4.0.1') format('truetype'),url('font/fontawesome-webfont.svg?v=4.0.1#fontawesomeregular') format('svg');font-weight:normal;font-style:normal} - -body { - margin: 0px; - padding: 1em; - background: #f3f2ed; - font-family: 'Lato', sans-serif; - font-size: 12pt; - font-weight: 300; - color: #8A8A8A; - padding-left: 50px; - line-height: 1.5em; -} -h1, h2, h3 { - margin: 0px; - padding: 0px; - font-weight: 300; - text-align: center; - line-height: 1.5em; -} -h3 { - font-style: italic; -} -a { - color: #D1551F; - } -a:hover { - color: #AF440F; -} - strong { - font-weight: 700; - color: #2A2A2A; - }