Circular buffer
Encyclopedia
A circular buffer, cyclic buffer or ring buffer is a data structure
that uses a single, fixed-size buffer
as if it were connected end-to-end.
This structure lends itself easily to buffering data stream
s.
If the buffer is used as the bounded buffer in the producer-consumer problem
then it is probably desired for the producer (e.g., an audio generator) to overwrite old data if the consumer (e.g., the sound card
) is unable to momentarily keep up. Another example is the digital waveguide synthesis
method which uses circular buffers to efficiently simulate the sound of vibrating strings or wind instruments.
The "prized" attribute of a circular buffer is that it does not need to have its elements shuffled around when one is consumed.
(If a non-circular buffer were used then it would be necessary to shift all elements when one is consumed.)
In other words, the circular buffer is well suited as a FIFO buffer while a standard, non-circular buffer is well suited as a LIFO buffer.
Circular buffering makes a good implementation strategy for a Queue that has fixed maximum size. Should a maximum size be adopted for a queue, then a circular buffer is a completely ideal implementation; all queue operations are constant time. However, expanding a circular buffer requires shifting memory, which is comparatively costly. For arbitrarily expanding queues, a Linked list
approach may be preferred instead.
For example, this is a 7-element buffer:
Assume that a 1 is written into the middle of the buffer (exact starting location does not matter in a circular buffer):
Then assume that two more elements are added — 2 & 3 — which get appended after the 1:
If two elements are then removed from the buffer, the oldest values inside the buffer are removed.
The two elements removed, in this case, are 1 & 2 leaving the buffer with just a 3:
If the buffer has 7 elements then it is completely full:
A consequence of the circular buffer is that when it is full and a subsequent write is performed, then it starts overwriting the oldest data.
In this case, two more elements — A & B — are added and they overwrite the 3 & 4:
Alternatively, the routines that manage the buffer could prevent overwriting the data and return an error or raise an exception
.
Whether or not data is overwritten is up to the semantics of the buffer routines or the application using the circular buffer.
Finally, if two elements are now removed then what would be returned is not 3 & 4 but 5 & 6 because A & B overwrote the 3 & the 4 yielding the buffer with:
Alternatively, a fixed-length buffer with two integers to keep track of indices can be used in languages that do not have pointers.
Taking a couple of examples from above.
(While there are numerous ways to label the pointers and exact semantics can vary, this is one way to do it.)
This image shows a partially-full buffer:
This image shows a full buffer with two elements having been overwritten:
What to note about the second one is that after each element is overwritten then the start pointer is incremented as well.
This is exactly the same situation as when the buffer is empty:
To solve this confusion there are a number of solutions:
If both pointers are pointing at the same location, the buffer is empty. If the end (write) pointer, plus one,
equals the start (read) pointer, then the buffer is full.
The advantages are:
The disadvantages are:
An example implementation in C: (Keep One Slot Open)
An example implementation in C: (Use all slots) (but is dangerous - an attempt to insert items on a full queue will yield success, but will, in fact, overwrite the queue)
In the situation if both pointers pointing at the same location, you consider the fill count to distinguish if the buffer is empty or full.
The advantages are:
The disadvantage is:
Alternately, you can replace the second pointer with the fill count and generate the second pointer as required by incrementing the first pointer by the fill count, modulo buffer size.
The advantages are:
The disadvantage is:
The unsigned difference (write_count - read_count) always yields the number of items placed in the buffer and not yet retrieved. This can indicate that the buffer is empty, partially full, completely full (without waste of a storage location) or in a state of overrun.
The advantage is:
The disadvantage is:
The advantages are:
The disadvantage is:
the buffer's length.
The advantage is:
The disadvantages are:
On binary computers, both of these disadvantages disappear if the buffer's length is a power of two—at the cost of a constraint on possible buffers lengths.
Multiple Read Pointers
Chunked Buffer
Much more complex are different chunks of data in the same circular buffer. The writer is not only writing elements to the buffer, it also assigns these elements to chunks .
The reader should not only be able to read from the buffer, it should also get informed about the chunk borders.
Example: The writer is reading data from small files, writing them into the same circular buffer. The reader is reading the data, but needs to know when and which file is starting at a given position.
Optimization
A circular-buffer implementation may be optimized by mapping
the underlying buffer to two contiguous regions of virtual memory
. (Naturally, the underlying buffer‘s length must then equal some multiple of the system’s page size
.) Reading from and writing to the circular buffer may then be carried out with greater efficiency by means of direct memory access; those accesses which fall beyond the end of the first virtual-memory region will automatically wrap around to the beginning of the underlying buffer. When the read offset is advanced into the second virtual-memory region, both offsets—read and write—are decremented by the length of the underlying buffer.
Optimized POSIX Implementation
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...
that uses a single, fixed-size buffer
Buffer (computer science)
In computer science, a buffer is a region of a physical memory storage used to temporarily hold data while it is being moved from one place to another. Typically, the data is stored in a buffer as it is retrieved from an input device or just before it is sent to an output device...
as if it were connected end-to-end.
This structure lends itself easily to buffering data stream
Data stream
In telecommunications and computing, a data stream is a sequence of digitally encoded coherent signals used to transmit or receive information that is in the process of being transmitted....
s.
Uses
An example that could possibly use an overwriting circular buffer is with multimedia.If the buffer is used as the bounded buffer in the producer-consumer problem
Producer-consumer problem
In problem is a classical example of a multi-process synchronization problem. The problem describes two processes, the producer and the consumer, who share a common, fixed-size buffer used as a queue. The producer's job is to generate a piece of data, put it into the buffer and start again...
then it is probably desired for the producer (e.g., an audio generator) to overwrite old data if the consumer (e.g., the sound card
Sound card
A sound card is an internal computer expansion card that facilitates the input and output of audio signals to and from a computer under control of computer programs. The term sound card is also applied to external audio interfaces that use software to generate sound, as opposed to using hardware...
) is unable to momentarily keep up. Another example is the digital waveguide synthesis
Digital waveguide synthesis
Digital waveguide synthesis is the synthesis of audio using a digital waveguide. Digital waveguides are efficient computational models for physical media through which acoustic waves propagate...
method which uses circular buffers to efficiently simulate the sound of vibrating strings or wind instruments.
The "prized" attribute of a circular buffer is that it does not need to have its elements shuffled around when one is consumed.
(If a non-circular buffer were used then it would be necessary to shift all elements when one is consumed.)
In other words, the circular buffer is well suited as a FIFO buffer while a standard, non-circular buffer is well suited as a LIFO buffer.
Circular buffering makes a good implementation strategy for a Queue that has fixed maximum size. Should a maximum size be adopted for a queue, then a circular buffer is a completely ideal implementation; all queue operations are constant time. However, expanding a circular buffer requires shifting memory, which is comparatively costly. For arbitrarily expanding queues, a Linked list
Linked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...
approach may be preferred instead.
How it works
A circular buffer first starts empty and of some predefined length.For example, this is a 7-element buffer:
Assume that a 1 is written into the middle of the buffer (exact starting location does not matter in a circular buffer):
Then assume that two more elements are added — 2 & 3 — which get appended after the 1:
If two elements are then removed from the buffer, the oldest values inside the buffer are removed.
The two elements removed, in this case, are 1 & 2 leaving the buffer with just a 3:
If the buffer has 7 elements then it is completely full:
A consequence of the circular buffer is that when it is full and a subsequent write is performed, then it starts overwriting the oldest data.
In this case, two more elements — A & B — are added and they overwrite the 3 & 4:
Alternatively, the routines that manage the buffer could prevent overwriting the data and return an error or raise an exception
Exception handling
Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of exceptions, special conditions that change the normal flow of program execution....
.
Whether or not data is overwritten is up to the semantics of the buffer routines or the application using the circular buffer.
Finally, if two elements are now removed then what would be returned is not 3 & 4 but 5 & 6 because A & B overwrote the 3 & the 4 yielding the buffer with:
Circular buffer mechanics
What is not shown in the example above is the mechanics of how the circular buffer is managed.Start / End Pointers
Generally, a circular buffer requires three pointers:- one to the actual buffer in memory
- one to point to the start of valid data
- one to point to the end of valid data
Alternatively, a fixed-length buffer with two integers to keep track of indices can be used in languages that do not have pointers.
Taking a couple of examples from above.
(While there are numerous ways to label the pointers and exact semantics can vary, this is one way to do it.)
This image shows a partially-full buffer:
This image shows a full buffer with two elements having been overwritten:
What to note about the second one is that after each element is overwritten then the start pointer is incremented as well.
Full / Empty Buffer Distinction
A small disadvantage of relying on pointers or relative indices of the start and end of data is, that in the case the buffer is entirely full, both pointers point to the same element:This is exactly the same situation as when the buffer is empty:
To solve this confusion there are a number of solutions:
- Always keep one slot open.
- Use a fill count to distinguish the two cases.
- Use read and write counts to get the fill count from.
- Use absolute indices.
Always Keep One Slot Open
This simple solution always keeps one slot unallocated. A full buffer has at most slots.If both pointers are pointing at the same location, the buffer is empty. If the end (write) pointer, plus one,
equals the start (read) pointer, then the buffer is full.
The advantages are:
- Very simple and robust.
- You need only the two pointers.
The disadvantages are:
- You can never use the entire buffer.
- You might only be able to access one element at a time, since you won't easily know how many elements are next to each other in memory..
An example implementation in C: (Keep One Slot Open)
An example implementation in C: (Use all slots) (but is dangerous - an attempt to insert items on a full queue will yield success, but will, in fact, overwrite the queue)
Use a Fill Count
The second simplest solution is to use a fill count. The fill count is implemented as an additional variable which keeps the number of readable items in the buffer. This variable has to be increased if the write (end) pointer is moved, and to be decreased if the read (start) pointer is moved.In the situation if both pointers pointing at the same location, you consider the fill count to distinguish if the buffer is empty or full.
- Note: When using semaphoresSemaphore (programming)In computer science, a semaphore is a variable or abstract data type that provides a simple but useful abstraction for controlling access by multiple processes to a common resource in a parallel programming environment....
in a Producer-consumerProducer-consumer problemIn problem is a classical example of a multi-process synchronization problem. The problem describes two processes, the producer and the consumer, who share a common, fixed-size buffer used as a queue. The producer's job is to generate a piece of data, put it into the buffer and start again...
model, the semaphores act as a fill count.
The advantages are:
- Simple.
- Needs only one additional variable.
The disadvantage is:
- You need to keep track of a third variable. This can require complex logic, especially if you are working with different threadsThread (computer science)In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process...
.
Alternately, you can replace the second pointer with the fill count and generate the second pointer as required by incrementing the first pointer by the fill count, modulo buffer size.
The advantages are:
- Simple.
- No additional variables.
The disadvantage is:
- Additional overhead when generating the write pointer.
Read / Write Counts
Another solution is to keep counts of the number of items written to and read from the circular buffer. Both counts are stored in signed integer variables with numerical limits larger than the number of items that can be stored and are allowed to wrap freely.The unsigned difference (write_count - read_count) always yields the number of items placed in the buffer and not yet retrieved. This can indicate that the buffer is empty, partially full, completely full (without waste of a storage location) or in a state of overrun.
The advantage is:
- The source and sink of data can implement independent policies for dealing with a full buffer and overrun while adhering to the rule that only the source of data modifies the write count and only the sink of data modifies the read count. This can result in elegant and robust circular buffer implementations even in multi-threaded environments.
The disadvantage is:
- You need two additional variables.
Record last operation
Another solution is to keep a flag indicating whether the most recent operation was a read or a write. If the two pointers are equal, then the flag will show whether the buffer is full or empty: if the most recent operation was a write, the buffer must be full, and conversely if it was a read, it must be empty.The advantages are:
- Only a single bit needs to be stored (which may be particularly useful if the algorithm is implemented in hardware)
- The test for full/empty is simple
The disadvantage is:
- You need an extra variable
Absolute indices
If indices are used instead of pointers, indices can store read/write counts instead of the offset from start of the buffer. This is similar to the above solution, except that there are no separate variables, and relative indices are obtained on the fly by division moduloModulo operation
In computing, the modulo operation finds the remainder of division of one number by another.Given two positive numbers, and , a modulo n can be thought of as the remainder, on division of a by n...
the buffer's length.
The advantage is:
- No extra variables are needed.
The disadvantages are:
- Every access needs an additional modulo operation.
- If counter wrap is possible, complex logic can be needed if the buffer's length is not a divisor of the counter's capacity.
On binary computers, both of these disadvantages disappear if the buffer's length is a power of two—at the cost of a constraint on possible buffers lengths.
Multiple Read Pointers
A little bit more complex are multiple read pointers on the same circular buffer. This is useful if you have n threads, which are reading from the same buffer, but one thread writing to the buffer.
Chunked Buffer Much more complex are different chunks of data in the same circular buffer. The writer is not only writing elements to the buffer, it also assigns these elements to chunks .
The reader should not only be able to read from the buffer, it should also get informed about the chunk borders.
Example: The writer is reading data from small files, writing them into the same circular buffer. The reader is reading the data, but needs to know when and which file is starting at a given position.
Optimization
A circular-buffer implementation may be optimized by mapping
Mmap
In computing, mmap is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It naturally implements demand paging, because initially file contents are not entirely read from disk and do not use physical RAM at all...
the underlying buffer to two contiguous regions of virtual memory
Virtual memory
In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage , allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which...
. (Naturally, the underlying buffer‘s length must then equal some multiple of the system’s page size
Page (computing)
A page, memory page, or virtual page is a fixed-length contiguous block of virtual memory that is the smallest unit of data for the following:* memory allocation performed by the operating system for a program; and...
.) Reading from and writing to the circular buffer may then be carried out with greater efficiency by means of direct memory access; those accesses which fall beyond the end of the first virtual-memory region will automatically wrap around to the beginning of the underlying buffer. When the read offset is advanced into the second virtual-memory region, both offsets—read and write—are decremented by the length of the underlying buffer.
Optimized POSIX Implementation
External links
- http://c2.com/cgi/wiki?CircularBuffer
- Boost: Templated Circular Buffer Container
- http://www.dspguide.com/ch28/2.htm