cache
Temporary computer storage used for quick retrieval of data in order to increase processing speed. The cached data can be stored in a reserved area of RAM, a special cache chip (separate from the CPU) that provides faster access than RAM, or on the disk drive. By keeping frequently accessed data in a rapidly accessible place, the computer can respond quickly to requests for those data without having to perform time-consuming searches of RAM or hard drives. Since a “stale” cache will contain data that have been superseded by later information, the cached data must be refreshed periodically.
Pronounced "cash." A cache is used to speed up data transfer and may be either temporary or permanent. Every computer has memory and disk caches, which speed up instruction execution as well as data retrieval and updating. These temporary caches serve as staging areas, and their contents change constantly.
Browser caches and Internet caches store copies of Web pages retrieved by the user for some period of time in order to speed up retrieval the next time the same page is requested (see Web cache and browser cache). See also router cache.
The following sections describe the traditional memory and disk caches that are common in all computers.
Memory Caches
A memory cache, or "CPU cache," is a memory bank that bridges main memory and the CPU. It is faster than main memory and allows instructions to be executed and data to be read and written at higher speed. Instructions and data are transferred from main memory to the cache in fixed blocks, known as cache "lines," using some kind of look-ahead algorithm. See cache line.
Temporal and Spatial (Time and Space)
Caches take advantage of "temporal locality," which means the same data item is often reused many times. They also benefit from "spatial locality," wherein the next instruction to be executed or the next data item to be processed is likely to be the next in line. The more often the same data item is processed or the more sequential the instructions or data, the greater the chance for a "cache hit." If the next item is not in the cache, a "cache miss" occurs, and the CPU has to go to main memory to retrieve it.
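The effect of locality on the hit rate can be illustrated with a small simulation. The sketch below is a toy model only, not a description of any particular CPU: the function name `hit_rate`, the line size of 8 addresses, the capacity of 16 lines and the first-in, first-out eviction are all assumptions chosen for illustration.

```python
import random


def hit_rate(addresses, line_size=8, num_lines=16):
    """Return the fraction of accesses served from a tiny cache of whole lines.

    Assumed toy model: any line may occupy any slot, and the oldest
    line is dropped first when the cache is full.
    """
    cached = []  # line numbers currently held, oldest first
    hits = 0
    for addr in addresses:
        line = addr // line_size  # neighbouring addresses share a line
        if line in cached:
            hits += 1             # cache hit
        else:
            cached.append(line)   # cache miss: bring in the whole line
            if len(cached) > num_lines:
                cached.pop(0)
    return hits / len(addresses)


sequential = list(range(1024))               # spatial locality
reused = [i % 64 for i in range(1024)]       # temporal locality
rng = random.Random(0)
scattered = [rng.randrange(1_000_000) for _ in range(1024)]  # little locality

print(hit_rate(sequential))  # 0.875
print(hit_rate(reused))      # 0.9921875
print(hit_rate(scattered))   # close to zero
```

In the sequential case each miss brings in a line of 8 addresses, so the 7 accesses that follow are hits. In the reused case the 64 addresses fit in the cache, so only the first touch of each line misses.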
Level 1 and Level 2
A level 1 (L1) cache is a memory bank built into the CPU chip. A level 2 cache (L2) is a secondary staging area that feeds the L1 cache. Increasing the size of the L2 cache may speed up some applications but have no effect on others. L2 may be built into the CPU chip, reside on a separate chip in a multichip package module (see MCP) or be a separate bank of chips on the motherboard. Caches are typically static RAM (SRAM), while main memory is generally some variety of dynamic RAM (DRAM). See SRAM and DRAM.
| L1 and L2 Caches |
|---|
| The whole idea is to keep staging more instructions and data in a high-speed memory closer to the CPU. |
Disk Caches
A disk cache is a section of main memory or memory on the disk controller board that bridges the disk and the CPU. When the disk is read, a larger block of data is copied into the cache than is immediately required. If subsequent reads find the data already stored in the cache, there is no need to retrieve it from the disk, which is slower to access.
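The read behavior described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the "disk" is an in-memory `bytes` object, the class name `ReadAheadCache` and the 4096-byte block size are hypothetical, and nothing is ever evicted.

```python
class ReadAheadCache:
    """Toy disk cache: every miss copies a whole block into memory."""

    def __init__(self, disk: bytes, block_size: int = 4096):
        self.disk = disk              # stands in for the slow device
        self.block_size = block_size  # assumed block size
        self.blocks = {}              # block number -> cached bytes
        self.disk_reads = 0           # how often the "disk" was touched

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        end = min(offset + length, len(self.disk))
        pos = offset
        while pos < end:
            block_no = pos // self.block_size
            if block_no not in self.blocks:
                # Miss: fetch more than was asked for, namely the whole block.
                start = block_no * self.block_size
                self.blocks[block_no] = self.disk[start:start + self.block_size]
                self.disk_reads += 1
            block = self.blocks[block_no]
            within = pos % self.block_size
            take = min(end - pos, len(block) - within)
            out += block[within:within + take]
            pos += take
        return bytes(out)


disk = bytes(range(256)) * 64   # 16,384 bytes of sample data
cache = ReadAheadCache(disk)
cache.read(0, 16)               # goes to the disk
cache.read(16, 16)              # served from the cached block
print(cache.disk_reads)         # 1
```

The second read costs no disk access because the first one already brought the surrounding block into memory.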
If the cache is used for writing, data are queued up at high speed and then written to disk during idle machine cycles by the caching program. If the cache is built into the hardware, the disk controller decides when to write the data. See cache coherency, write back cache, write through cache, pipeline burst cache, lookaside cache, inline cache, backside cache and NV cache.
| Disk Cache |
|---|
| Disk caches are usually just a part of main memory made up of common dynamic RAM (DRAM) chips, whereas memory caches (CPU caches) use higher-speed static RAM (SRAM) chips. |
cache (memory management): /kash/ A small fast memory holding
recently accessed data, designed to speed up subsequent access
to the same data. Most often applied to processor-memory
access but also used for a local copy of data accessible over
a network etc.
When data is read from, or written to, main memory, a copy is
also saved in the cache, along with the associated main memory
address. The cache monitors addresses of subsequent reads to
see if the required data is already in the cache. If it is (a
cache hit) then it is returned immediately and the main
memory read is aborted (or not started). If the data is not
cached (a cache miss) then it is fetched from main memory
and also saved in the cache.
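The read path just described can be sketched as follows. This is a minimal model under stated assumptions: main memory is a Python dictionary keyed by address, the class name `SimpleCache` is hypothetical, and the cache is unbounded, so it never has to discard an entry.

```python
class SimpleCache:
    """Toy cache that stores copies of data together with their addresses."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # slow backing store: address -> value
        self.entries = {}               # address -> cached copy
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.entries:
            # Cache hit: return at once, main memory is not consulted.
            self.hits += 1
            return self.entries[address]
        # Cache miss: fetch from main memory and keep a copy.
        self.misses += 1
        value = self.main_memory[address]
        self.entries[address] = value
        return value


memory = {0x100: "a", 0x104: "b"}
cache = SimpleCache(memory)
cache.read(0x100)                 # miss
cache.read(0x100)                 # hit
cache.read(0x104)                 # miss
print(cache.hits, cache.misses)   # 1 2
```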
The cache is built from faster memory chips than main memory
so a cache hit takes much less time to complete than a normal
memory access. The cache may be located on the same
integrated circuit as the CPU, in order to further reduce
the access time. In this case it is often known as primary
cache, since there may be a larger, slower secondary cache
outside the CPU chip.
The most important characteristic of a cache is its hit rate
- the fraction of all memory accesses which are satisfied from
the cache. This in turn depends on the cache design but
mostly on its size relative to the main memory. The size is
limited by the cost of fast memory chips.
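To see why the hit rate matters so much, a common simplified model (an assumption here, not part of the definition above) weights the cache and main memory access times by how often each is used. The function name and the timings of 1 ns and 100 ns are made up for illustration.

```python
def average_access_time(hit_rate, cache_time, memory_time):
    """Average time per access when a fraction `hit_rate` is served by the cache.

    Simplified model: a miss is assumed to cost one plain main memory access.
    """
    return hit_rate * cache_time + (1 - hit_rate) * memory_time


# Illustrative timings in nanoseconds (assumed, not measured).
print(f"{average_access_time(0.95, 1, 100):.2f}")  # 5.95
print(f"{average_access_time(0.99, 1, 100):.2f}")  # 1.99
```

Under these assumed numbers, raising the hit rate from 95% to 99% cuts the average access time to about a third.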
The hit rate also depends on the access pattern of the
particular program being run (the sequence of addresses being
read and written). Caches rely on two properties of the
access patterns of most programs: temporal locality - if
something is accessed once, it is likely to be accessed again
soon, and spatial locality - if one memory location is
accessed then nearby memory locations are also likely to be
accessed. To exploit spatial locality, caches often
operate on several words at a time, a "cache line" or "cache
block". Main memory is then read and written in whole cache lines.
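As a rough illustration of how an address maps onto a cache line, the sketch below splits an address into a line number and an offset within that line. The 64-byte line size and the function name `split_address` are assumptions. Real caches differ in line size and in how they use the remaining address bits.

```python
LINE_SIZE = 64  # bytes per cache line; assumed, must be a power of two


def split_address(address, line_size=LINE_SIZE):
    """Split an address into (line number, offset within the line)."""
    offset_bits = line_size.bit_length() - 1  # 6 bits for a 64-byte line
    line_number = address >> offset_bits      # high bits select the line
    offset = address & (line_size - 1)        # low bits select the byte
    return line_number, offset


print(split_address(0x1200))  # (72, 0)
print(split_address(0x1234))  # (72, 52)
print(split_address(0x123F))  # (72, 63)
print(split_address(0x1240))  # (73, 0)
```

Addresses 0x1200 through 0x123F all fall in line 72, so fetching any one of them brings the other 63 bytes into the cache as well.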
When the processor wants to write to main memory, the data is
first written to the cache on the assumption that the
processor will probably read it again soon. Different write
policies are used. In a write-through cache, data is
written to main memory at the same time as it is cached. In a
write-back cache it is only written to main memory when it
is forced out of the cache.
If all accesses were writes then, with a write-through policy,
every write to the cache would necessitate a main memory
write, thus slowing the system down to main memory speed.
However, statistically, most accesses are reads and most of
these will be satisfied from the cache. Write-through is
simpler than write-back because an entry that is to be
replaced can just be overwritten in the cache, as it will
already have been copied to main memory. Write-back, by
contrast, requires the cache to initiate a main memory write
of the flushed entry, followed (for a processor read) by a
main memory read. However, write-back is more efficient
because an entry may be written many times in the cache
without a main memory access.
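The difference between the two policies can be made concrete by counting main memory writes. The sketch below is a toy model under stated assumptions: both class names are hypothetical, only writes are modeled (so every cached entry counts as modified), and the write-back cache forces out its least recently written entry when full, which is one possible choice among many.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Every write goes to the cache and to main memory at the same time."""

    def __init__(self):
        self.cache = {}
        self.memory = {}
        self.memory_writes = 0

    def write(self, address, value):
        self.cache[address] = value
        self.memory[address] = value
        self.memory_writes += 1


class WriteBackCache:
    """Writes stay in the cache until the entry is forced out."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # address -> value, least recently written first
        self.memory = {}
        self.memory_writes = 0

    def write(self, address, value):
        if address in self.cache:
            self.cache.move_to_end(address)
        elif len(self.cache) >= self.capacity:
            # Cache full: flush the oldest entry to main memory.
            old_address, old_value = self.cache.popitem(last=False)
            self.memory[old_address] = old_value
            self.memory_writes += 1
        self.cache[address] = value


through = WriteThroughCache()
back = WriteBackCache(capacity=2)
for i in range(100):
    through.write(0x10, i)
    back.write(0x10, i)
back.write(0x20, 0)
back.write(0x30, 0)            # cache is full, so 0x10 is forced out

print(through.memory_writes)   # 100
print(back.memory_writes)      # 1
print(back.memory[0x10])       # 99
```

The write-back cache reaches main memory once, with the final value, while the write-through cache pays for all 100 writes.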
When the cache is full and another line of data must be
cached, an existing cache entry is selected to be "flushed",
that is, written back to main memory. The new line is then
put in its place. Which entry is flushed is determined by a
"replacement algorithm".
Some processors have separate instruction and data caches.
Both can be active at the same time, allowing an instruction
fetch to overlap with a data read or write. This separation
also avoids the possibility of a bad cache conflict between,
say, the instructions in a loop and some data in an array that
is accessed by that loop.
See also direct mapped cache, fully associative cache,
sector mapping, set associative cache.