C-slowing
Encyclopedia
C-slowing is a technique used in conjunction with retiming to improve throughput
of a digital circuit
. Each register
in a circuit is replaced by a set of C registers (in series). This creates a circuit with C independent threads, as if the new circuit contained C copies of the original circuit. A single computation of the original circuit takes C times as many clock cycles to compute in the new circuit. C-slowing by itself increases latency
, but throughput
remains the same.
Increasing the number of registers allows optimization of the circuit through retiming
to reduce the clock period of the circuit. In the best case, the clock period can be reduced by a factor of C. Reducing the clock period of the circuit reduces latency and increases throughput. Thus, for computations that can be multi-threaded, combining C-slowing with retiming can increase the throughput of the circuit, with little, or in the best case, no increase in latency.
Since registers are relatively plentiful in FPGA
s, this technique is typically applied to circuits implemented with FPGAs.
Throughput
In communication networks, such as Ethernet or packet radio, throughput or network throughput is the average rate of successful message delivery over a communication channel. This data may be delivered over a physical or logical link, or pass through a certain network node...
of a digital circuit
Digital circuit
Digital electronics represent signals by discrete bands of analog levels, rather than by a continuous range. All levels within a band represent the same signal state...
. Each register
Hardware register
In digital electronics, especially computing, a hardware register stores bits of information, in a way that all the bits can be written to or read out simultaneously.The hardware registers inside a central processing unit are called processor registers....
in a circuit is replaced by a set of C registers (in series). This creates a circuit with C independent threads, as if the new circuit contained C copies of the original circuit. A single computation of the original circuit takes C times as many clock cycles to compute in the new circuit. C-slowing by itself increases latency
Latency (engineering)
Latency is a measure of time delay experienced in a system, the precise definition of which depends on the system and the time being measured. Latencies may have different meaning in different contexts.-Packet-switched networks:...
, but throughput
Throughput
In communication networks, such as Ethernet or packet radio, throughput or network throughput is the average rate of successful message delivery over a communication channel. This data may be delivered over a physical or logical link, or pass through a certain network node...
remains the same.
Increasing the number of registers allows optimization of the circuit through retiming
Retiming
Retiming is the technique of moving the structural location of latches or registers in a digital circuit to improve its performance, area, and/or power characteristics in such a way that preserves its functional behavior at its outputs. Retiming was first described by Charles E. Leiserson and...
to reduce the clock period of the circuit. In the best case, the clock period can be reduced by a factor of C. Reducing the clock period of the circuit reduces latency and increases throughput. Thus, for computations that can be multi-threaded, combining C-slowing with retiming can increase the throughput of the circuit, with little, or in the best case, no increase in latency.
Since registers are relatively plentiful in FPGA
Field-programmable gate array
A field-programmable gate array is an integrated circuit designed to be configured by the customer or designer after manufacturing—hence "field-programmable"...
s, this technique is typically applied to circuits implemented with FPGAs.
Resources
- PipeRoute: A Pipelining-Aware Router for Reconfigurable Architectures
- Simple Symmetric Multithreading in Xilinx FPGAs
- Post Placement C-Slow Retiming for Xilinx Virtex (.ppt)
- Post Placement C-Slow Retiming for Xilinx Virtex (.pdf)
- Exploration of RaPiD-style Pipelined FPGA Interconnects
- Time and Area Efficient Pattern Matching on FPGAs