Linux Memory Management – Virtual Memory and Demand Paging
Memory management is one of the most complex activities performed by the Linux kernel, and it involves many concepts and issues. This article is part of our ongoing UNIX kernel overview series. In the previous article of the series, we discussed the UNIX process overview and reentrant kernels. In this article we cover virtual memory and demand paging, two of the important concepts in memory management.
Virtual Memory
- To start, we must first understand that virtual memory is a layer of memory addresses that map to physical addresses.
- In the virtual memory model, when a processor executes a program instruction, it reads the instruction from virtual memory and executes it.
- But before it can read the instruction, it first converts the virtual memory address into a physical address.
- This conversion is based on the virtual-to-physical mapping information contained in the page tables (which are maintained by the OS).
Both virtual and physical memory are divided into fixed-length chunks known as pages. In this paged model, a virtual address can be divided into two parts: a virtual page frame number and an offset within that page.
Whenever the processor encounters a virtual address, it extracts the virtual page frame number from it. It then translates this virtual page frame number into a physical page frame number, and the offset part takes it to the exact address within the physical page. This translation of addresses is done through the page tables.
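The split described above can be sketched in a few lines of Python. This is only an illustration: the 4 KiB page size and the name `split_virtual_address` are assumptions made for the example, not values taken from this article or from any particular processor.

```python
# Assumed page size for this sketch (4 KiB); real systems may differ.
PAGE_SIZE = 4096
# Number of low-order bits that hold the offset (12 for 4 KiB pages).
PAGE_SHIFT = PAGE_SIZE.bit_length() - 1


def split_virtual_address(vaddr):
    """Return (virtual page frame number, offset) for a virtual address."""
    vpfn = vaddr >> PAGE_SHIFT          # high-order bits select the page
    offset = vaddr & (PAGE_SIZE - 1)    # low-order bits select the byte in the page
    return vpfn, offset


vpfn, offset = split_virtual_address(0x2194)
print(f"VPFN={vpfn}, offset={offset:#x}")
```

With these assumed values, the address 0x2194 falls in virtual page 2 at offset 0x194.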
Conceptually, we can consider each page table entry to contain the following information:
- A flag that describes whether the entry is valid or not
- The physical page frame number as described by this entry
- Access information regarding the page (like read-only, read-write etc)
A page table is indexed by the virtual page frame number, which is used as an offset into the entries of the table. For example, a virtual page frame number of 2 points to entry 2, which is the third entry in the page table (entry numbers begin with 0).
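As a rough model of this, a page table can be pictured as a list of entries indexed by virtual page frame number. The field names `valid`, `pfn` and `writable`, the helper `lookup` and the sample values below are hypothetical and chosen only to mirror the three items listed above; real page table entries are packed hardware-specific bit fields.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    valid: bool     # whether this entry can be used for translation
    pfn: int        # physical page frame number described by this entry
    writable: bool  # access information: False means read-only


# A tiny, made-up page table for one process.
page_table = [
    PageTableEntry(valid=True, pfn=7, writable=False),
    PageTableEntry(valid=True, pfn=3, writable=True),
    PageTableEntry(valid=False, pfn=0, writable=False),
]


def lookup(table, vpfn):
    """Use the virtual page frame number directly as an index into the table."""
    return table[vpfn]


entry = lookup(page_table, 1)
print(entry.valid, entry.pfn, entry.writable)
```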
In the image below, VPFN stands for Virtual page frame number, and PFN indicates the physical page frame number.
It may happen that the processor goes to a process's page table entry with a virtual page frame number and finds the entry invalid. In this case it is the processor's responsibility to pass control to the kernel and ask it to fix the problem. Different processors pass control in different ways, but this event is known as a 'page fault'. If the entry is valid, the processor takes the physical page frame number, multiplies it by the page size to get the base address of the physical page, and then adds the offset to get the exact physical address.
So we now understand that, through the concept of virtual memory, each process thinks it has the whole range of virtual addresses at its disposal. This makes the system appear as if it has more physical memory than is actually available.
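The translation arithmetic and the fault path can be sketched as follows. This is a simplified model under stated assumptions: the page size is taken to be 4 KiB, the page table is a plain dictionary, and the names `PageFault` and `translate` are invented for the example. On real hardware this work is done by the processor's memory management unit, not by software like this.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the page table has no valid entry for a virtual page."""


def translate(page_table, vaddr):
    """Translate a virtual address to a physical address or raise PageFault."""
    vpfn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpfn)
    if entry is None or not entry["valid"]:
        # In a real system the processor would hand control to the kernel here.
        raise PageFault(f"no valid mapping for virtual page {vpfn}")
    # Base address of the physical page, plus the offset within the page.
    return entry["pfn"] * PAGE_SIZE + offset


# Made-up mappings: virtual page 0 is resident, virtual page 1 is not.
page_table = {
    0: {"valid": True, "pfn": 4},
    1: {"valid": False, "pfn": 0},
}

print(hex(translate(page_table, 0x0194)))
```

Here virtual address 0x0194 lies in virtual page 0, which maps to physical page 4, so the result is 4 * 4096 + 0x194 = 0x4194. Calling `translate(page_table, 0x1000)` would raise `PageFault` because the entry for virtual page 1 is marked invalid.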
Demand Paging
In the previous section we learned that the processor may go to a process's page table with a virtual page frame number for which no valid entry is present. When that happens, two cases arise.
- Either the process has tried to access an invalid memory address
- The physical page corresponding to the virtual address was not loaded into physical memory
Of the two cases above, case 1 is where the process tries to access a memory address that it is not allowed to access. In this case a page fault is generated and the kernel typically terminates the process.
In case 2, as already explained, the physical page corresponding to the virtual address is not yet loaded into physical memory. Here too a page fault is generated, and the kernel then tries to bring the required memory page into physical memory from hard disk.
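The two outcomes can be illustrated with a toy fault handler. Everything here is hypothetical: `allowed_pages` stands in for whatever bookkeeping the kernel uses to know which virtual pages a process may legitimately touch, `free_frames` is a made-up pool of free physical pages, and the disk read is only indicated by a comment.

```python
class SegmentationFault(Exception):
    """Models the kernel terminating a process after an invalid access."""


def handle_page_fault(vpfn, allowed_pages, page_table, free_frames):
    """Resolve a fault on virtual page `vpfn`, or reject the access."""
    if vpfn not in allowed_pages:
        # Case 1: the address does not belong to the process at all.
        raise SegmentationFault(f"invalid access to virtual page {vpfn}")
    # Case 2: the page is legitimate but not yet in physical memory.
    pfn = free_frames.pop()
    # A real kernel would read the page contents from disk into this frame.
    page_table[vpfn] = pfn
    return pfn


allowed_pages = {0, 1}
page_table = {}
free_frames = [7, 3]

print(handle_page_fault(1, allowed_pages, page_table, free_frames))
print(page_table)
```

In this sketch a fault on virtual page 1 is resolved by taking a free frame and recording the mapping, while a fault on a page outside `allowed_pages` raises `SegmentationFault`.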
Because bringing a page from hard disk into physical memory is time consuming, a context switch usually happens in the meantime and some other process is brought into execution. Meanwhile, the page of the earlier process is brought into physical memory and the page tables are updated. That process is then brought back into execution, resuming from the same instruction that caused the 'page fault'.
This is known as demand paging: not all of the memory pages belonging to a process are present in physical memory at any given time. This keeps physical memory from being clogged with pages that are not needed, while the pages that are needed can still be brought into physical memory through a page fault (as explained above).
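A small simulation can make the idea concrete. It is a sketch only: the `Process` class, its method names, the 4 KiB page size and the way free frames are handed out are all assumptions for illustration, and the context switch that would happen while the page is being loaded is left out.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when a virtual page is not resident in physical memory."""


class Process:
    def __init__(self):
        self.page_table = {}      # vpfn -> pfn, resident pages only
        self.fault_count = 0
        self._next_free_pfn = 0   # naive allocator for the sketch

    def _translate(self, vaddr):
        vpfn, offset = divmod(vaddr, PAGE_SIZE)
        if vpfn not in self.page_table:
            raise PageFault(vpfn)
        return self.page_table[vpfn] * PAGE_SIZE + offset

    def _load_page(self, vpfn):
        # Stand-in for reading the page from disk into a free frame.
        self.page_table[vpfn] = self._next_free_pfn
        self._next_free_pfn += 1

    def access(self, vaddr):
        """Access an address, loading its page on demand and retrying."""
        while True:
            try:
                return self._translate(vaddr)
            except PageFault as fault:
                self.fault_count += 1
                self._load_page(fault.args[0])
                # Loop again: the faulting access is restarted.


proc = Process()
physical = [proc.access(a) for a in (0x0000, 0x0010, 0x3000, 0x3004, 0x0020)]
print(proc.fault_count, sorted(proc.page_table))
```

Five accesses touch only virtual pages 0 and 3, so only two page faults occur and only those two pages become resident. Pages the process never touches are never loaded.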