lectures.alex.balgavy.eu

Lecture notes from university.

index.md (3859B)


+++
title = 'Virtual memory'
+++
# Virtual memory
problem: so far, memory can only be given to processes in contiguous pieces
solution: another level of abstraction!

- divide physical memory and program (virtual) memory into pages of fixed size (typically 4 KB)
- give each program a number of virtual pages
- translate virtual pages into physical pages (frames)

MMU (memory management unit): translates virtual memory addresses into physical memory addresses

- size of memory address space depends on architecture
- OS decides how to map page tables
- no need to translate the offset within a page
- creates the illusion that every process has unlimited memory
- page tables contain page table entries, each mapping a virtual page to a physical frame
    - x86 page table entry
        - 4 KB page size
        - 32-bit physical and virtual address space
        - 32-bit page table entries (PTEs)
        - ![](4dd5cb94db2dbf5b80b56a635895825e.png)
        - ![](fd03668ada38d2f73122f7619b9b343f.png)
     25 
but this leads to a lot of wasted memory: we keep around page table entries that aren't used or don't hold meaningful values

so we use sparse data structures and break up the page tables (and you can legit go crazy with this)

two-level page tables (x86)

- CR3 register points to the top of the page table hierarchy
- page tables are 'walked' by the hardware MMU
- first 10 bits of the virtual address index a field in the page directory (top-level page table); this yields the second-level page table
- next 10 bits of the virtual address index a field in that secondary table; this yields the physical frame (remaining 12 bits are the offset)
     36 
![](198c95a2a7f69c555676baa912028e4b.png)
     38 
four-level page tables (x86-64)

![](3988f921a66ef19047b780a13ae1e66a.png)

inverted page tables (IA-64)

![](4608c2300f7ddffc335b3ae18a963ef5.png)
     46 
with page tables, the MMU has to translate every single memory access, which lowers performance.

so: try caching previous translations in a TLB and praying to God for locality.
     50 
## Translation Lookaside Buffer (TLB)

- contains translation info for one single memory address space (process)
- 1 TLB per CPU, sometimes with multiple levels

![](557379a5f0c8c9909a650253e0f18be4.png)
     57 
- flush TLB when entries change or on context switch (unless you tag entries with the PID)
- when to expect many TLB misses? after flushes, and when processes lack locality
- software- vs hardware-managed TLB
    - hardware - efficiency: hardware walks page tables and fills TLB
    - OS - flexibility: e.g. OS may preload TLB entries that it expects to need later
- TLB miss handling:
    - walk page tables to find the mapping
    - if mapping found, fill new TLB entry (soft miss)
    - if mapping not found, page fault (hard miss)
    - OS handles page faults (similar to interrupts):
        - if access violation: segfault
        - if legal access, page fault handler needs to fix up the page tables:
            - page is already in memory (minor page fault)
            - page must be fetched from disk (major page fault)
- page replacement
    - computer might use more virtual memory than it has physical memory
    - paging creates the illusion of unlimited memory available to user processes
    - when a logical page is not in memory (swapped out to a file/partition), the OS has to page it in on a page fault
    - but because memory is not actually unlimited, the new page has to replace an old page
    - how should we swap out?
        - optimal:
            - replace the page that will be referenced as far in the future as possible
            - can this algorithm be implemented in practice? maybe: with a deterministic workload, you can profile it
        - random:
            - just replace a page at random
        - not recently used (NRU):
            - periodically clear the R (referenced) and M (modified) bits for all pages
            - replace a page at random, but prioritize those with R=0, M=0
        - FIFO:
            - build a queue of faulting pages and replace the head
            - unfortunately, the oldest page might still be useful