Cache stampede
A cache stampede is a type of cascading failure
that can occur when massively parallel computing
systems with caching mechanisms come under very high load. This behaviour is sometimes also called dog-piling.
To see how cache stampedes can occur, consider a web server
which uses memcached
to cache rendered pages for some period of time, to ease the load on the system. Under particularly high load to a single URL, the system remains responsive so long as the resource remains cached, with requests simply being handled by accessing the cached copy, without carrying out the expensive rendering operation.
Under low load, cache misses will simply result in a single recalculation of the expensive rendering operation, and the system will then continue as before, with average load being kept very low because of the high cache hit rate.
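The caching behaviour described above can be sketched as a naive read-through cache. This is a minimal illustration: the dictionary, the render_page function, and the TTL value are hypothetical stand-ins, not memcached's actual API.

```python
import time

cache = {}   # stands in for memcached: url -> (rendered page, expiry time)
TTL = 300    # seconds to keep a rendered page before it expires

def render_page(url):
    # placeholder for the expensive rendering operation
    return f"<html>rendered {url}</html>"

def get_page(url):
    now = time.time()
    entry = cache.get(url)
    if entry is not None:
        value, expiry = entry
        if now < expiry:
            return value              # cache hit: no rendering needed
    # cache miss (entry absent or expired): re-render and re-cache
    value = render_page(url)
    cache[url] = (value, now + TTL)
    return value
```

Under low load this pattern behaves well: a miss costs a single render, and every subsequent request within the TTL is served from the cache.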
However, under very heavy load, the expiry of the cached page can itself trigger a failure. If there is sufficient concurrency in the server farm, multiple threads of execution will all attempt to regenerate the content of that page simultaneously, since none of them knows that the others are doing the same work at the same time. If the load is high enough, this redundant work may by itself exhaust shared resources and bring about congestion collapse: every rendering attempt times out, so the page is never completely re-rendered and re-cached, the cache hit rate drops to zero, and the system remains in congestion collapse, attempting to regenerate the resource for as long as the load persists.
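The simultaneous-regeneration race can be reproduced in a small sketch. The names are hypothetical, and threads stand in for what would be separate processes or servers in a real farm, but the race is the same: every request that observes the miss performs the expensive render, because none can see that the others are already doing so.

```python
import threading
import time

render_count = 0
count_lock = threading.Lock()
cache = {}

def render_page(url):
    # simulate an expensive render, and count how many times it runs
    global render_count
    with count_lock:
        render_count += 1
    time.sleep(0.05)
    return f"rendered {url}"

def get_page(url):
    value = cache.get(url)    # miss: the entry has just expired
    if value is None:
        value = render_page(url)
        cache[url] = value
    return value

barrier = threading.Barrier(10)

def worker():
    barrier.wait()            # all 10 threads hit the miss at once
    get_page("/hot-page")

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(render_count)           # typically 10: one redundant render per thread
```

Each thread pays the full rendering cost for the same page. At real request rates, this redundant work is what exhausts shared resources and drives the congestion collapse described above.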