lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

Virtual memory.html (6382B)


      1 <?xml version="1.0" encoding="UTF-8"?>
      2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      3 <html><head><link rel="stylesheet" href="sitewide.css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-1.637205004692078"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-03 15:23:12 +0000"/><meta name="latitude" content="52.33337587840298"/><meta name="longitude" content="4.866682208677656"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-18 23:53:16 +0000"/><title>Virtual memory</title></head><body><div style="-en-paragraph:true;">problem: so far memory can only be given to processes in contiguous pieces</div><div>solution: another level of abstraction!</div><ul><li><div>divide physical memory and program (virtual) memory into pages of fixed size (typically 4 KB)</div></li><li><div>give program a number of virtual pages</div></li><li><div>translate virtual pages into physical pages (frames)</div></li></ul><div><br/></div><div>MMU (memory management unit)translation between virtual memory address and the physical memory address</div><ul><li><div>size of memory address space depends on architecture</div></li><li><div>OS decides how to map page tables</div></li><li><div>no need to translate offset</div></li><li><div>create illusion that every process has unlimited memory</div></li><li><div>page tables contain pages</div></li><ul><li><div>x86 page table entry
      4 </div></li><ul><li><div>4kb page size</div></li><li><div>32-bit physical and virtual address space</div></li><li><div>32-bit page table entries (PTEs)</div></li><li><div><img src="Virtual%20memory.resources/7ED1AE58-4A6D-4429-8B9F-7AB9D701C29D.png" height="118" width="589"/></div></li><li><div><img src="Virtual%20memory.resources/77402436-57E3-4E00-A911-CF3535F85C86.png" height="200" width="293"/></div></li></ul></ul></ul><div><br/></div><div>but this leads to a lot of wasted memory, we are keeping around page table entries that aren’t used or don’t have many meaningful values</div><div><br/></div><div>so we use sparse data structures, break up the page tables (and you can legit go crazy with this)</div><div><br/></div><div>two-level page tables (x86)
      5 </div><ul><li><div>CR3 register points to top of page table hierarchy</div></li><li><div>page tables are 'walked' by hardware MMU</div></li><li><div>first 10 bits of page frame address -- field in page directory (top-level page table). this gives the resultant page</div></li><li><div>second 10 bits of page frame address -- field in secondary table</div></li></ul><div style="margin-left: 40px;"><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/C5B5181F-70BD-4C14-A0EE-0C5E48F3965B.png" height="753" width="951"/><br/></div><div><br/></div><div>four-level page tables (x86-64)
      6 </div><div><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/6572460B-737A-40D1-BE1F-83EBF433B894.png" height="623" width="1046"/><br/></div><div style="margin-left: 40px;"><br/></div><div>inverted page tables (IA-64)
      7 </div><div><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/6357993F-0E12-4F10-A8B4-BB5FEB6A8D4C.png" height="346" width="604"/><br/></div><div><br/></div><div><br/></div><div>with page tables, MMU has to translate every single memory access, leads to lower performance.</div><div>try caching previous translations in a TLB and praying to God for locality.</div><div><br/></div><div><span style="font-weight: bold;">Translation Lookaside Buffer (TLB):
      8 </span></div><ul><li><div>contains translation info for one single memory address space (process)</div></li><li><div>1 TLB per CPU, sometimes with multiple levels</div></li></ul><div><br/></div><div style="margin-left: 40px;"><img src="Virtual%20memory.resources/AD64BCE7-52D7-4307-9064-6C7621266A3A.png" height="279" width="594"/><br/></div><div style="margin-left: 40px;"><br/></div><ul><li><div>flush TLB when entries change, context switch (unless you tag with PID)</div></li><li><div>when to expect many TLB misses? after flushes, and when processes have a lack of locality</div></li><li><div>software vs hardware-managed TLB</div></li><ul><li><div>hardware - efficiency: hardware walks page tables and fills TLB</div></li><li><div>OS - flexibility: e.g. OS may preload TLB entries that it expects to need later</div></li></ul><li><div>TLB miss handling:
      9 </div></li><ul><li><div>walk page tables to find mapping</div></li><li><div>if mapping found, fill new TLB entry (soft miss)</div></li><li><div>if mapping not found, page fault (hard miss)</div></li><li><div>OS handles page faults (similar to interrupts):
     10 </div></li><ul><li><div>if access violation: segfault</div></li><li><div>if legal access, page fault handler needs to fix up page tables:
     11 </div></li><ul><li><div>page is already in memory (minor page fault)</div></li><li><div>page must be fetched from disk (major page fault)</div></li></ul></ul></ul><li><div>page replacement
     12 </div></li><ul><li><div>computer might use more virtual memory than it has physical memory</div></li><li><div>paging creates illusion of unlimited memory available to user processes</div></li><li><div>when logical page is not in memory (swapped out to file/partition), the OS has to page it in on a page fault</div></li><li><div>but because memory is not actually unlimited, the new page has to replace an old page</div></li><li><div>how should we swap out?</div></li><ul><li><div>optimal:
     13 </div></li><ul><li><div>replace page that will be referenced as far in the future as possible</div></li><li><div>can this algorithm be implemented in practice? maybe? with a deterministic workload, you can profile it.</div></li></ul><li><div>random:
     14 </div></li><ul><li><div>just replace a page at random</div></li></ul><li><div>not recently used (NRU):
     15 </div></li><ul><li><div>periodically clear R/M bit for all pages</div></li><li><div>replace page at random, but prioritize those with R=0, M=0</div></li></ul><li><div>FIFO:
     16 </div></li><ul><li><div>build queue of faulting pages and replace the head</div></li><li><div>unfortunately, oldest page might still be useful</div></li></ul></ul></ul></ul><br/></body></html>