What is Virtual Memory?
Translation Lookaside Buffer
In addition to address mapping information, the page table also contains status information
including a dirty bit that indicates when a page held in
memory has been modified. If modified, the page must be
saved to the disk before being flushed to make room for a
new virtual page. Otherwise, the page can be flushed without further action. Given a 4 KB page size and a 32-bit
address space, each process has access to 2^20 = 1,048,576
pages. With 256 PIDs, a brute-force page table would contain more than 268 million entries! There are a variety of
schemes to reduce page table size, but there is no escaping the fact that a page table will be large. Page table management schemes are largely an issue of OS architecture
and are outside the scope of this discussion. The fact that
the page table is large and is parsed by software means
that the mapping process will be extremely slow without
hardware assistance. Every access to virtual memory,
in other words almost every access performed on the
computer, requires mapping, which makes hardware
acceleration critical to the viability of virtual memory.
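The arithmetic above can be sketched in a few lines. This is an illustrative model only (the function and constant names are invented for this example, not part of any real MMU interface): with 4 KB pages, the low 12 bits of a 32-bit virtual address are the offset within the page and the upper 20 bits are the virtual page number.

```python
# Decompose a 32-bit virtual address, assuming 4 KB (2**12-byte) pages.
PAGE_SIZE = 4096       # 2**12 bytes
OFFSET_BITS = 12

def split_virtual_address(va):
    """Return (virtual page number, offset within the page)."""
    vpn = va >> OFFSET_BITS          # upper 20 bits select one of 2**20 pages
    offset = va & (PAGE_SIZE - 1)    # lower 12 bits index into the page
    return vpn, offset

# Each process can touch 2**20 = 1,048,576 pages; with 256 PIDs, a
# brute-force table would need 256 * 2**20 = 268,435,456 entries.
assert 2**20 == 1_048_576
assert 256 * 2**20 == 268_435_456

vpn, off = split_virtual_address(0x12345678)
```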
NUTS & VOLTS, Everything For Electronics
Within the MMU is a translation lookaside buffer
(TLB), a small, fully associative cache that allows the
MMU to rapidly locate recently accessed virtual page
mappings. Typical sizes for a TLB are just 16 to 64
entries because of the complexity of implementing a
fast fully associative cache. When a process is first
spawned, it has not yet performed virtual memory accesses, so its first access will result in a TLB miss. When a TLB
miss occurs, an exception is generated that invokes the
kernel's memory management routine to parse the page
table in search of the correct physical address mapping.
The kernel routine loads a TLB entry with the mapping
information and exits. On subsequent memory accesses,
the TLB will sometimes hit and sometimes miss. It is hoped that the miss rate will decline rapidly as the process executes and its most frequently used mappings come to reside in the TLB.
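The miss-handling sequence described above can be modeled in a short sketch. This is a toy software model of what real MMU hardware and the kernel's miss handler do together; the class and variable names are invented for illustration, and eviction is assumed to be least-recently-used (the article does not specify a replacement policy).

```python
# Toy model of a small, fully associative TLB with software miss handling.
class TLB:
    def __init__(self, entries=16):
        self.entries = entries
        self.map = {}        # vpn -> physical frame number
        self.order = []      # LRU order, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, vpn, page_table):
        if vpn in self.map:              # TLB hit: translation is immediate
            self.hits += 1
            self.order.remove(vpn)
            self.order.append(vpn)
            return self.map[vpn]
        self.misses += 1                 # TLB miss: the "exception" path --
        pfn = page_table[vpn]            # walk the page table in software,
        if len(self.map) >= self.entries:
            oldest = self.order.pop(0)   # evict the least recently used entry
            del self.map[oldest]
        self.map[vpn] = pfn              # load the new mapping into the TLB
        self.order.append(vpn)
        return pfn

page_table = {vpn: vpn + 100 for vpn in range(64)}  # stand-in mapping
tlb = TLB(entries=16)
for vpn in [1, 2, 1, 3, 1, 2]:
    tlb.translate(vpn, page_table)
# First touch of each page misses; repeated touches hit.
```

As in the text, the first access to each page misses and triggers the slow page-table walk; once the mapping is cached, subsequent accesses to the same page hit.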
As more processes actively vie for resources in a
multi-tasking system, they may begin to fight each other
for scarce TLB entries. The resources and architecture of
a computer must be properly matched to its intended
application. A typical desktop or embedded computer
may get along fine with a small TLB because it may not
have many demanding processes running concurrently. A
more powerful computer designed to simultaneously run
many memory-intensive processes may require a larger
TLB to take full advantage of its microprocessor and
memory resources. The ever-present trade-off between
performance and cost does not go away!
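The trade-off between TLB size and workload can be made concrete with a rough simulation. This is a synthetic illustration only (a uniform random access pattern over a fixed working set, with an assumed LRU policy), not a measurement of any real system: when the working set fits in the TLB, misses essentially stop after warm-up; when it overflows the TLB, misses persist.

```python
import random

def miss_rate(tlb_entries, working_set, accesses=10_000, seed=0):
    """Fraction of accesses missing an LRU, fully associative TLB."""
    rng = random.Random(seed)
    resident = []                      # LRU list of cached VPNs, oldest first
    misses = 0
    for _ in range(accesses):
        vpn = rng.randrange(working_set)
        if vpn in resident:
            resident.remove(vpn)       # hit: refresh its LRU position
        else:
            misses += 1                # miss: walk table, fill an entry
            if len(resident) >= tlb_entries:
                resident.pop(0)        # evict the least recently used
        resident.append(vpn)
    return misses / accesses

# A 16-page working set fits a 16-entry TLB and settles near 0% misses;
# a 64-page working set keeps thrashing the same 16 entries.
small = miss_rate(16, 16)
large = miss_rate(16, 64)
```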
The TLB is usually located between the microprocessor and its cache subsystem, as shown in Figure 3, such
that physical addresses are cached rather than virtual
addresses. Such an arrangement adds latency to microprocessor transactions because the virtual-to-physical
mapping must take place before the L1 cache can
respond. However, a TLB can be made very fast
because of its small size, thereby limiting its time penalty
on transactions.
FIGURE 3. Location of TLB
Virtual Memory for the Masses
Years ago, virtual memory was found only on large,
expensive computers. As more gates were squeezed into
integrated circuits, the MMU became an economical complement to the microprocessor. Modern operating systems,
and the applications that run on them, gain substantial functional and performance enhancements from this technology that was once reserved for mainframe computing. NV
About the Author
Mark Balch is the author of Complete Digital Design. He is an electrical
engineer in Silicon Valley, CA, who designs high-performance computer-networking hardware. His responsibilities have included PCB, FPGA, and
ASIC design. Mark has designed products in the fields of HDTV, consumer
electronics, and industrial computers. Mark holds a bachelor's degree in
electrical engineering from The Cooper Union in New York City. He can be
reached via email at mark_balch@hotmail.com.
FEBRUARY 2004