Barrel processor - AbsoluteAstronomy.com

A barrel processor is a CPU that switches between threads of execution on every cycle. This CPU design technique is also known as "interleaved" or "fine-grained" temporal multithreading. As opposed to simultaneous multithreading in modern superscalar architectures, it generally does not allow execution of multiple instructions in one cycle.

For example, the peripheral processing system of the CDC 6000 series computers and its successors executed one instruction (or a portion of an instruction) from each of 10 different virtual processors (called peripheral processors) before returning to the first processor. Also, the IP3023 processor from Ubicom executes one instruction from each of 8 different threads before returning to the first thread. The Cray XMT also uses a barrel processor (Threadstorm) in its architecture.

As in preemptive multitasking, each thread of execution is assigned its own program counter and other hardware registers (each thread's architectural state). A barrel processor can guarantee that each thread will execute one instruction every N cycles, where N is the number of hardware threads. A preemptive multitasking machine, by contrast, typically runs one thread of execution for hundreds or thousands of cycles while all other threads wait their turn.
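The round-robin behavior described above can be sketched as a small simulation. This is only an illustrative model, not a description of any real machine: the names `ThreadContext` and `run_barrel`, the four-register file, and the "increment a register" stand-in for an instruction are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """Per-thread architectural state: a program counter and registers."""
    pc: int = 0
    registers: list = field(default_factory=lambda: [0] * 4)


def run_barrel(num_threads, cycles):
    """Issue one instruction per cycle, rotating through the threads.

    Returns the thread contexts and, for each thread, the cycle numbers
    at which it issued an instruction.
    """
    contexts = [ThreadContext() for _ in range(num_threads)]
    issue_log = {t: [] for t in range(num_threads)}
    for cycle in range(cycles):
        t = cycle % num_threads       # thread selected on this cycle
        ctx = contexts[t]
        ctx.registers[0] += 1         # stand-in for executing one instruction
        ctx.pc += 1                   # only the selected thread's PC advances
        issue_log[t].append(cycle)
    return contexts, issue_log


contexts, issue_log = run_barrel(num_threads=4, cycles=12)
print(issue_log[1])  # [1, 5, 9]
```

With four threads, thread 1 issues on cycles 1, 5, and 9, which is exactly one instruction every N = 4 cycles.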

A technique called C-slowing can take a normal single-tasking processor design and automatically generate a corresponding barrel processor design. An n-way barrel processor generated this way acts much like n separate multiprocessing copies of the original single-tasking processor, each one running at roughly 1/n the original speed.
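The idea can be illustrated on a circuit much smaller than a processor. The sketch below assumes a toy accumulator with a single state register and replaces that register with a chain of C registers, which is the core step of C-slowing. The function names are hypothetical, and the retiming step that normally accompanies C-slowing is left out.

```python
from collections import deque


def accumulator(inputs):
    """Original circuit: one state register, state <- state + input."""
    state = 0
    outputs = []
    for x in inputs:
        state = state + x
        outputs.append(state)
    return outputs


def c_slowed_accumulator(inputs, c):
    """C-slowed circuit: the single state register becomes a chain of c.

    A value written on cycle k is read back on cycle k + c, so the
    circuit carries c independent, interleaved threads of state.
    """
    regs = deque([0] * c)
    outputs = []
    for x in inputs:
        new_state = regs.popleft() + x   # oldest register holds this thread's state
        regs.append(new_state)
        outputs.append(new_state)
    return outputs


a = [1, 2, 3]
b = [10, 20, 30]
interleaved = [v for pair in zip(a, b) for v in pair]   # a0, b0, a1, b1, ...
out = c_slowed_accumulator(interleaved, c=2)
print(out[0::2], out[1::2])  # [1, 3, 6] [10, 30, 60]
```

Each interleaved stream sees the same results as a private copy of the original accumulator, but receives a new result only every second cycle, which matches the "n copies at roughly 1/n speed" description.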

Advantages compared to single-threaded processors

A single-tasking processor spends much of its time idle, doing no useful work whenever a cache miss or pipeline stall occurs. Advantages of barrel processors over single-tasking processors include:

The ability to do useful work on the other threads while the stalled thread is waiting.
Designing an n-way barrel processor with n-deep pipelines is much simpler than designing a single-tasking processor because a barrel processor never has a pipeline stall and doesn't need feed-forward circuits.
For real-time applications, a barrel processor can guarantee that a "real-time" thread can execute with precise timing, no matter what happens to the other threads, even if some other thread locks up in an infinite loop or is continuously interrupted by hardware interrupts.
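The timing guarantee in the last item can be checked with a toy model. The sketch below assumes a made-up two-instruction set (`inc` and `jmp`) and a hypothetical `run` function. It places one thread in an infinite loop and compares the issue cycles of the other thread against a run in which both threads behave normally.

```python
def run(programs, cycles):
    """Round-robin over the threads, one instruction per cycle.

    Each program is a list of (op, arg) tuples; "inc" bumps the thread's
    counter and "jmp" sets the thread's program counter to arg.
    """
    n = len(programs)
    pcs = [0] * n
    counters = [0] * n
    issue_cycles = [[] for _ in range(n)]
    for cycle in range(cycles):
        t = cycle % n                       # slot is fixed by the cycle number
        op, arg = programs[t][pcs[t]]
        issue_cycles[t].append(cycle)
        if op == "jmp":
            pcs[t] = arg
        else:                               # "inc"
            counters[t] += 1
            pcs[t] = (pcs[t] + 1) % len(programs[t])
    return counters, issue_cycles


stuck = [("jmp", 0)]       # jumps to itself forever
worker = [("inc", None)]   # the "real-time" thread

counters, issue_cycles = run([stuck, worker], cycles=10)
_, baseline_cycles = run([worker, worker], cycles=10)
print(issue_cycles[1] == baseline_cycles[1])  # True
```

Because the slot a thread occupies depends only on the cycle number, the worker issues on cycles 1, 3, 5, 7, and 9 in both runs, whatever the other thread does.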

Disadvantages compared to single-threaded processors

There are, however, some disadvantages to barrel processors.

Either all threads must share the same cache, which slows overall system performance, or there must be one unit of cache for each execution thread, which can significantly increase the transistor count (and thus cost) of such a CPU. However, most barrel processors are used to implement hard real-time embedded systems, where memory access costs are typically calculated assuming worst-case behavior of the cache, so this is less of a concern.
The state of each thread must be kept on-chip (typically in registers) to avoid costly off-chip context switches. This requires a large number of registers compared to typical processors.

External links

Soft peripherals: Embedded.com article examining Ubicom's IP3023 processor
An Evaluation of the Design of the Gamma 60

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
