iWarp
iWarp was an experimental parallel supercomputer architecture developed as a joint project by Intel and Carnegie Mellon University. The project started in 1988, as a follow-up to CMU's previous WARP research project, in order to explore building an entire parallel-computing "node" in a single microprocessor, complete with memory and communications links. In this respect the iWarp is very similar to the INMOS transputer and nCUBE.
Intel announced iWarp in 1989. The first iWarp prototype was delivered to Carnegie Mellon in the summer of 1990, and in the fall the university received the first 64-cell production systems, followed by two more in 1991. With the creation of the Intel Supercomputing Systems Division in the summer of 1992, the iWarp was merged into the iPSC product line. Intel kept iWarp as a product but stopped actively marketing it.
Each iWarp CPU included a 32-bit ALU with a 64-bit FPU running at 20 MHz. It was purely scalar and completed one instruction per cycle, so the performance was 20 MIPS, or 20 MFLOPS for single precision and 10 MFLOPS for double precision. Communications were handled by a separate unit on the CPU that drove four serial channels at 40 MB/s, and included networking support in hardware that allowed for up to 20 virtual channels (similar to the system added to the INMOS T9000).
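The per-CPU figures above follow directly from the clock rate. The short Python sketch below reproduces that arithmetic. The function name and the two-cycles-per-operation figure for double precision are illustrative assumptions inferred from the quoted 10 MFLOPS, not documented iWarp timings.

```python
# Sketch of the peak-rate arithmetic quoted in the text.
# Assumption: a double-precision operation occupies two cycles, which is
# what the quoted 10 MFLOPS at 20 MHz implies; this is not taken from
# iWarp documentation.

CLOCK_HZ = 20_000_000  # 20 MHz clock, from the text


def peak_millions_per_second(clock_hz: int, cycles_per_op: int) -> float:
    """Return the peak rate in millions of operations per second."""
    return clock_hz / cycles_per_op / 1e6


mips = peak_millions_per_second(CLOCK_HZ, 1)       # one instruction per cycle
sp_mflops = peak_millions_per_second(CLOCK_HZ, 1)  # single precision
dp_mflops = peak_millions_per_second(CLOCK_HZ, 2)  # double precision (assumed 2 cycles)

print(f"{mips:.0f} MIPS, {sp_mflops:.0f} MFLOPS single, {dp_mflops:.0f} MFLOPS double")
```

These are peak rates only; sustained performance depends on the instruction mix and on memory and communication behaviour.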
iWarp processors were combined onto boards along with memory. Unlike other systems, the iWarp used the faster but more expensive static RAM (SRAM). Boards typically included four CPUs and anywhere from 512 kB to 4 MB of SRAM.
Another difference in the iWarp was that the systems were connected together as an n-by-m torus, instead of the more common hypercube. A typical system included 64 CPUs connected as an 8×8 torus, which could deliver 1.2 gigaflops peak.
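To make the torus layout concrete, the Python sketch below computes the four wrap-around neighbours of a node in an n-by-m torus and the aggregate single-precision peak of an 8×8 array. The (row, column) node numbering and the function names are illustrative assumptions, not the iWarp addressing scheme.

```python
# Sketch of 2D torus connectivity and aggregate peak performance.
# Assumption: nodes are addressed as (row, col) pairs; this numbering is
# hypothetical and chosen only for illustration.

def torus_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Return the up, down, left and right neighbours with wrap-around."""
    return [
        ((row - 1) % rows, col),  # up (wraps from row 0 to the last row)
        ((row + 1) % rows, col),  # down
        (row, (col - 1) % cols),  # left (wraps from column 0 to the last column)
        (row, (col + 1) % cols),  # right
    ]


def aggregate_peak_gflops(nodes: int, mflops_per_node: float) -> float:
    """Return the summed peak of all nodes in GFLOPS."""
    return nodes * mflops_per_node / 1000.0


corner = torus_neighbors(0, 0, 8, 8)       # corner node still has four neighbours
peak = aggregate_peak_gflops(8 * 8, 20.0)  # 64 CPUs at 20 MFLOPS each

print(corner)
print(f"{peak:.2f} GFLOPS")
```

With the figures from the text, 64 CPUs at 20 MFLOPS each give 1.28 GFLOPS, consistent with the roughly 1.2 gigaflops peak quoted above. Because of the wrap-around links, every node in the torus has exactly four neighbours, including those on the edges of the grid.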
George Cox was the lead architect of the iWarp project. Steven McGeady (later an Intel Vice-President and witness in the Microsoft anti-trust case) wrote an innovative development environment that allowed software to be written for the array before the hardware was completed. Each node of the array was represented by a different Sun workstation on a LAN, with the iWarp's unique inter-node communication protocol simulated over sockets. Unlike the chip-level simulator, which ran very slowly and could not simulate a multi-node array, this environment allowed in-depth development of array software to begin.
The production compiler for iWarp was a C and Fortran compiler based on the AT&T pcc compiler for UNIX, ported under contract for Intel and then extensively modified and extended by Intel.