Page Tables

IMPLEMENTATION OF SWAPPING IN VIRTUAL MEMORY

Each address space within the system has associated with it a page table and a disk map. Each of these two tables describe an *entire* address space. The page table identifies which pages are in main memory, and in which page frames those pages are located. The disk map identifies where all the pages are on the disk. The entire address space is on the disk, but only a subset of it is resident in main memory.

(Note: additional flags may also be present, but these are the significant ones.)

If the P flag in the page table is set, the page is currently in main memory. Its location in memory is determined by the F value; F is the number of the page frame in which the page is located. If the P flag is clear, the address mapper will generate a page fault if a process attempts reference to the page; the page fault handler will use the disk map to locate that page on the disk and swap it in. Each address space must have a complete mapping on disk (i.e. the entire addess space must be available on disk). In the event that a process is swapped out entirely, none of its address space will be in main memory; all of it will be in secondary storage. The set of pages shown in the page table as present are a subset of the pages on the disk. Those with the M flags set are not exact copies any more and must be copied back to disk before they can be deleted from main memory. It is possible to combine the page table and disk map by adding the disk address field to the page table:

But this is generally not done because it makes the page table bigger and may slow down address translation. An address space on disk is implemented as a file. A file of the address space's size is created and identified as a virtual memory:

d = CREATE_VM(file)

The Create Virtual Memory routine creates the necessary support structures for the space and returns a pointer, d, that identifies the virtual memory. Thus, a process having virtual memory d would use page table PT[d] and disk map DM[d]. When the process exits, the operating system must remove the address space from disk and the page table using the command

DELETE_VM(d)

Address space id's and process id's are not synonymous. Several threads (different process id's) can be operating in the same address space (same address space id); they all use the same page table and disk map.

When a page is to be swapped in, the system locates a candidate frame from the list of page frames. Normally, the least-recently-used (LRU) frame is considered to best candidate. The frame number is used as an index into yet another table the system maintains: the frame table. The frame table describes how the main memory is being used; thus, there is only one frame table for the entire system. Within the frame table, there is one slot for each page frame in main memory. Each slot identifies the address space (d, from above) using the frame and the page, x, of that address space within the frame.

(d,x) = FRAMES(y)

y = PT(d,x)

Note that the frame number, F, from the page table points to a frame in main memory. The disk address A in the disk map points to a disk sector that contains the page in the on-disk copy of the virtual memory.

When a page fault occurs, the page fault handler procedure is invoked in the address space of the faulting process. It's overall job is: "make the missing page present". When it returns, the faulting process retries the addressing operation that just failed; now it will succeed.

The job "make the missing page present" consists of these steps:

1) Using the replacement algorithm, get the number of a frame that will be used for the incoming page: y = REPLACE.

2) Swap the page currently in frame y to its slot on the disk; that slot is DM[FRAMES[y]].

3) Swap the requested page x from its slot on the disk into frame y; that slot is DM[d,x].

4) Update the page table entry PT[d,x] so that P=1 and F=y, and update FRAMES[y] to (d,x).
5) Return.

Steps 2 and 3 involve an interaction with the disk driver. The protocol for this has been described previously. When the step is started, it will not complete until the disk has sent back its completion interrupt signal. Since these steps involve swapping, it is worthwhile to eliminate one or both of them whenever possible.

Step two may be skipped if the identified page frame has not been modified (written to) since it was loaded or last written to disk. As a result, when identifying candidate frames, the system will give preference to unmodified pages.

Step three can be eliminated in case the requested page is still loaded in main memory. This can happen if the replacement algorithm marks the page as not present, but the owner process refers to it again before it is swapped out. The test for this is that PR[d,x].F = y and FRAMES[y] = (d,x).

In step five, the processor is returned to the state in which it was running when the page fault occured. Often, this is not a trivial task. Refer to Tannenbaum [T1] for a superficial discussion on the issues involved.

In order to facilitate the above actions and make the likelihood of unneeded swaps as low as possible, the main memory page frames can be divided into four groups by the combination of their (P,M) bits. The (0,0) frames are called the "pool" of frames ready to be used to satisfy page faults. The replacement algorithm is instantiated as a background process that watches for unused frames and marks them as P=0; if a newly marked page has M=1 (frame state is (0,1)), a request to copy it out to disk is queued up; when that request is complete the frame is moved to the (0,0) state. With this structure, the first step of "make page present" is simply to take the first frame from the list of (0,0) frames. (That list must be semaphore protected so that the caller waits until there is at least one frame in the (0,0) state, a condition that is guaranteed because the replacement policy is working in the background to guarantee that there are (0,0) pages.) [D7]

--P.J. Denning and Steve Coile

Also see [D1] pp. 157-159 and [T1] pp.89-96.