Cilk
Encyclopedia
Cilk is a general-purpose
programming language
designed for multithreaded
parallel computing
. The commercial instantiation is Intel Cilk Plus
.
, to decide during execution how to actually divide the work between processors. It is because these responsibilities are separated that a Cilk program can run without rewriting on any number of processors, including one.
The Cilk language has been developed since 1994 at the MIT
Laboratory for Computer Science. It is based on ANSI C
, with the addition of just a handful of Cilk-specific keywords. When the Cilk keywords are removed from Cilk source code, the result is a valid C
program, called the serial elision (or C elision) of the full Cilk program. Cilk is a faithful extension of C and the serial elision of any Cilk program is always a valid serial implementation in C of the semantics of the parallel Cilk program. Despite several similarities, Cilk is not directly related to AT&T Bell Labs' Concurrent C.
A commercial version of Cilk, called Cilk++, that supports both C and C++ and is compatible with both GCC
and Microsoft
C++ compilers, was developed by Cilk Arts, Inc.. Academic and Open Source versions also exist, where the Open Source version is under an in-house license that falls somewhere between the updated BSD license and the LGPL. The original Cilk code is still available from MIT. In July 2009, Intel Corporation
acquired Cilk Arts, the Cilk++ technology and the Cilk trademark. In 2010, Intel released a commercial implementation in its compilers combined with some data parallel constructs with the name Intel Cilk Plus. Intel has also released a specification to enable other compatible implementations, and has said the trademark will be usable by compliant implementations.
In the original MIT Cilk implementation, the first Cilk keyword is in fact cilk, which identifies a function which is written in Cilk. Since Cilk procedures can call C procedures directly, but C procedures cannot directly call or spawn
Cilk procedures, this keyword is needed to distinguish Cilk code from C code.
The remaining keywords are:
They are described in further detail below.
spawn -- this keyword indicates that the procedure call it modifies can safely operate in parallel with other executing code. Note that the scheduler is not obligated to run this procedure in parallel; the keyword merely alerts the scheduler that it can do so.
sync -- this keyword indicates that execution of the current procedure cannot proceed until all previously spawned procedures have completed and returned their results to the parent frame. This is an example of a barrier
method.
implementation of the Fibonacci
function in Cilk, with parallel recursive calls, which demonstrates the cilk, spawn, and sync keywords. (Cilk program code is not numbered; the numbers have been added only to make the discussion easier to follow.)
If this code was executed by a single processor to determine the value of fib(2), that processor would create a frame for fib(2), and execute lines 01 through 05. On line 06, it would create spaces in the frame to hold the values of x and y. On line 08, the processor would have to suspend the current frame, create a new frame to execute the procedure fib(1), execute the code of that frame until reaching a return statement, and then resume the fib(2) frame with the value of fib(1) placed into fib(2)'s x variable. On the next line, it would need to suspend again to execute fib(0) and place the result in fib(2)'s y variable.
When the code is executed on a multiprocessor machine, however, execution proceeds differently. One processor starts the execution of fib(2); when it reaches line 08, however, the spawn keyword modifying the call to fib(n-1) tells the processor that it can safely give the job to a second processor: this second processor can create a frame for fib(1), execute its code, and store its result in fib(2)'s frame when it finishes; the first processor continues executing the code of fib(2) at the same time. A processor is not obligated to assign a spawned procedure elsewhere; if the machine only has two processors and the second is still busy on fib(1) when the processor executing fib(2) gets to the procedure call, the first processor will suspend fib(2) and execute fib(0) itself, as it would if it were the only processor. Of course, if another processor is available, then it will be called into service, and all three processors would be executing separate frames simultaneously.
(The preceding description is not entirely accurate. Even though the common terminology for discussing Cilk refers to processors making the decision to spawn off work to other processors, it is actually the scheduler which assigns procedures to processors for execution, using a policy called work-stealing, described later.)
If the processor executing fib(2) were to execute line 13 before both of the other processors had completed their frames, it would generate an incorrect result or an error; fib(2) would be trying to add the values stored in x and y, but one or both of those values would be missing. This is the purpose of the sync keyword, which we see in line 11: it tells the processor executing a frame that it must suspend its own execution, until all the procedure calls it has spawned off have returned. When fib(2) is allowed to proceed past the sync statement in line 11, it can only be because fib(1) and fib(0) have completed and placed their results in x and y, making it safe to perform calculations on those results.
The alternative is to use an inlet. An inlet is a function internal to a Cilk procedure which handles the results of a spawned procedure call as they return. One major reason to use inlets is that all the inlets of a procedure are guaranteed to operate atomically with regards to each other and to the parent procedure, thus avoiding the bugs that could occur if the multiple returning procedures tried to update the same variables in the parent frame at the same time.
inlet -- This keyword identifies a function defined within the procedure as an inlet.
abort -- This keyword can only be used inside an inlet; it tells the scheduler that any other procedures that have been spawned off by the parent procedure can safely be aborted.
The processor maintains a stack
on which it places each frame that it has to suspend in order to handle a procedure call. If it is executing fib(2), and encounters a recursive call to fib(1), it will save fib(2)'s state, including its variables and where the code suspended execution, and put that state on the stack. It will not take a suspended state off the stack and resume execution until the procedure call that caused the suspension, and any procedures called in turn by that procedure, have all been fully executed.
With multiple processors, things of course change. Each processor still has a stack for storing frames whose execution has been suspended; however, these stacks are more like deque
s, in that suspended states can be removed from either end. A processor can still only remove states from its own stack from the same end that it puts them on; however, any processor which is not currently working (having finished its own work, or not yet having been assigned any) will pick another processor at random, through the scheduler, and try to "steal" work from the opposite end of their stack—suspended states, which the stealing processor can then begin to execute. The states which get stolen are the states that the processor stolen from would get around to executing last.
On July 31, 2009, Cilk Arts announced on its web site that its products and engineering team were now part of Intel Corp. Intel and Cilk Arts integrated and advanced the technology further resulting in a September 2010 release of Intel Cilk Plus
. Intel Cilk Plus adopts simplifications, proposed by Cilk Arts in Cilk++, to eliminate the need for several of the original Cilk keywords while adding the ability to spawn functions and to deal with variables involved in reduction operations. Intel Cilk Plus differs from Cilk and Cilk++ by adding array extensions, being incorporated in a commercial compiler (from Intel), and compatibility with existing debuggers.
Intel has stated its desire to refine Cilk Plus and to enable it to be implemented by other compilers to gain industry wide adoption. Work has started on a gcc implementation based in part on Intel's run time code contributed to open source by Intel which includes code they acquired from Cilk Arts plus additions by Intel and former Cilk Arts employees.
4.7. The runtime is available dual-licenced, including BSD-3.
General-purpose programming language
In computer software a general-purpose programming language is a programming language designed to be used for writing software in a wide variety of application domains...
programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
designed for multithreaded
Thread (computer science)
In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process...
parallel computing
Parallel computing
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...
. The commercial instantiation is Intel Cilk Plus
Intel Cilk Plus
Cilk Plus is an extension to the C and C++ programming languages, designed for multithreaded parallel computing.On July 31, 2009, Cilk Arts, producers of the Cilk++ programming language, announced that its products and engineering team were now part of Intel Corp...
.
Design
The biggest principle behind the design of the Cilk language is that the programmer should be responsible for exposing the parallelism, identifying elements that can safely be executed in parallel; it should then be left to the run-time environment, particularly the schedulerScheduling (computing)
In computer science, a scheduling is the method by which threads, processes or data flows are given access to system resources . This is usually done to load balance a system effectively or achieve a target quality of service...
, to decide during execution how to actually divide the work between processors. It is because these responsibilities are separated that a Cilk program can run without rewriting on any number of processors, including one.
The Cilk language has been developed since 1994 at the MIT
Massachusetts Institute of Technology
The Massachusetts Institute of Technology is a private research university located in Cambridge, Massachusetts. MIT has five schools and one college, containing a total of 32 academic departments, with a strong emphasis on scientific and technological education and research.Founded in 1861 in...
Laboratory for Computer Science. It is based on ANSI C
ANSI C
ANSI C refers to the family of successive standards published by the American National Standards Institute for the C programming language. Software developers writing in C are encouraged to conform to the standards, as doing so aids portability between compilers.-History and outlook:The first...
, with the addition of just a handful of Cilk-specific keywords. When the Cilk keywords are removed from Cilk source code, the result is a valid C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
program, called the serial elision (or C elision) of the full Cilk program. Cilk is a faithful extension of C and the serial elision of any Cilk program is always a valid serial implementation in C of the semantics of the parallel Cilk program. Despite several similarities, Cilk is not directly related to AT&T Bell Labs' Concurrent C.
A commercial version of Cilk, called Cilk++, that supports both C and C++ and is compatible with both GCC
GNU Compiler Collection
The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...
and Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...
C++ compilers, was developed by Cilk Arts, Inc.. Academic and Open Source versions also exist, where the Open Source version is under an in-house license that falls somewhere between the updated BSD license and the LGPL. The original Cilk code is still available from MIT. In July 2009, Intel Corporation
Intel Corporation
Intel Corporation is an American multinational semiconductor chip maker corporation headquartered in Santa Clara, California, United States and the world's largest semiconductor chip maker, based on revenue. It is the inventor of the x86 series of microprocessors, the processors found in most...
acquired Cilk Arts, the Cilk++ technology and the Cilk trademark. In 2010, Intel released a commercial implementation in its compilers combined with some data parallel constructs with the name Intel Cilk Plus. Intel has also released a specification to enable other compatible implementations, and has said the trademark will be usable by compliant implementations.
In the original MIT Cilk implementation, the first Cilk keyword is in fact cilk, which identifies a function which is written in Cilk. Since Cilk procedures can call C procedures directly, but C procedures cannot directly call or spawn
Spawn (computing)
Spawn in computing refers to a function that loads and executes a new child process.The current process may or may not continue to execute asynchronously...
Cilk procedures, this keyword is needed to distinguish Cilk code from C code.
The remaining keywords are:
- spawn
- sync
- inlet
- abort
They are described in further detail below.
Basic parallelism with Cilk
Two keywords are all that are needed to start using the parallel features of Cilk:spawn -- this keyword indicates that the procedure call it modifies can safely operate in parallel with other executing code. Note that the scheduler is not obligated to run this procedure in parallel; the keyword merely alerts the scheduler that it can do so.
sync -- this keyword indicates that execution of the current procedure cannot proceed until all previously spawned procedures have completed and returned their results to the parent frame. This is an example of a barrier
Barrier (computer science)
- Threads synchronization primitive :In parallel computing, a barrier is a type of synchronization method. A barrier for a group of threads or processes in the source code means any thread/process must stop at this point and cannot proceed until all other threads/processes reach this barrier.Many...
method.
Sample code
Below is a recursiveRecursion
Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
implementation of the Fibonacci
Fibonacci number
In mathematics, the Fibonacci numbers are the numbers in the following integer sequence:0,\;1,\;1,\;2,\;3,\;5,\;8,\;13,\;21,\;34,\;55,\;89,\;144,\; \ldots\; ....
function in Cilk, with parallel recursive calls, which demonstrates the cilk, spawn, and sync keywords. (Cilk program code is not numbered; the numbers have been added only to make the discussion easier to follow.)
01 cilk int fib (int n)
02 {
03 if (n < 2) return n;
04 else
05 {
06 int x, y;
07
08 x = spawn fib (n-1);
09 y = spawn fib (n-2);
10
11 sync;
12
13 return (x+y);
14 }
15 }
If this code was executed by a single processor to determine the value of fib(2), that processor would create a frame for fib(2), and execute lines 01 through 05. On line 06, it would create spaces in the frame to hold the values of x and y. On line 08, the processor would have to suspend the current frame, create a new frame to execute the procedure fib(1), execute the code of that frame until reaching a return statement, and then resume the fib(2) frame with the value of fib(1) placed into fib(2)
When the code is executed on a multiprocessor machine, however, execution proceeds differently. One processor starts the execution of fib(2); when it reaches line 08, however, the spawn keyword modifying the call to fib(n-1) tells the processor that it can safely give the job to a second processor: this second processor can create a frame for fib(1), execute its code, and store its result in fib(2)
(The preceding description is not entirely accurate. Even though the common terminology for discussing Cilk refers to processors making the decision to spawn off work to other processors, it is actually the scheduler which assigns procedures to processors for execution, using a policy called work-stealing, described later.)
If the processor executing fib(2) were to execute line 13 before both of the other processors had completed their frames, it would generate an incorrect result or an error; fib(2) would be trying to add the values stored in x and y, but one or both of those values would be missing. This is the purpose of the sync keyword, which we see in line 11: it tells the processor executing a frame that it must suspend its own execution, until all the procedure calls it has spawned off have returned. When fib(2) is allowed to proceed past the sync statement in line 11, it can only be because fib(1) and fib(0) have completed and placed their results in x and y, making it safe to perform calculations on those results.
Advanced parallelism with Cilk: Inlets
The two remaining Cilk keywords are slightly more advanced, and concern the use of inlets. Ordinarily, when a Cilk procedure is spawned, it can return its results to the parent procedure only by putting those results in a variable in the parent's frame, as we assigned the results of our spawned procedure calls in the example tox
and y
.The alternative is to use an inlet. An inlet is a function internal to a Cilk procedure which handles the results of a spawned procedure call as they return. One major reason to use inlets is that all the inlets of a procedure are guaranteed to operate atomically with regards to each other and to the parent procedure, thus avoiding the bugs that could occur if the multiple returning procedures tried to update the same variables in the parent frame at the same time.
inlet -- This keyword identifies a function defined within the procedure as an inlet.
abort -- This keyword can only be used inside an inlet; it tells the scheduler that any other procedures that have been spawned off by the parent procedure can safely be aborted.
Work-stealing
The Cilk scheduler uses a policy called "work-stealing" to divide procedure execution efficiently among multiple processors. Again, it is easiest to understand if we look first at how Cilk code is executed on a single-processor machine.The processor maintains a stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
on which it places each frame that it has to suspend in order to handle a procedure call. If it is executing fib(2), and encounters a recursive call to fib(1), it will save fib(2)
With multiple processors, things of course change. Each processor still has a stack for storing frames whose execution has been suspended; however, these stacks are more like deque
Deque
In computer science, a double-ended queue is an abstract data structure that implements a queue for which elements can only be added to or removed from the front or back...
s, in that suspended states can be removed from either end. A processor can still only remove states from its own stack from the same end that it puts them on; however, any processor which is not currently working (having finished its own work, or not yet having been assigned any) will pick another processor at random, through the scheduler, and try to "steal" work from the opposite end of their stack—suspended states, which the stealing processor can then begin to execute. The states which get stolen are the states that the processor stolen from would get around to executing last.
Commercialization
Prior to ~2006, the market for Cilk was restricted to high-performance computing. The emergence of multicore processors in mainstream computing means that hundreds of millions of new parallel computers are now being shipped every year. Cilk Arts was formed to capitalize on that opportunity: In 2006, Professor Leiserson launched Cilk Arts to create and bring to market a modern version of Cilk that supports the commercial needs of an upcoming generation of programmers. The company closed a Series A venture financing round in October 2007, and Cilk++ 1.0 shipped in December, 2008. Cilk++ differs from Cilk in several ways: support for C++, operation with both Microsoft and GCC compilers, support for loops, and "Cilk hyperobjects" - a new construct designed to solve data race problems created by parallel accesses to global variables.On July 31, 2009, Cilk Arts announced on its web site that its products and engineering team were now part of Intel Corp. Intel and Cilk Arts integrated and advanced the technology further resulting in a September 2010 release of Intel Cilk Plus
Intel Cilk Plus
Cilk Plus is an extension to the C and C++ programming languages, designed for multithreaded parallel computing.On July 31, 2009, Cilk Arts, producers of the Cilk++ programming language, announced that its products and engineering team were now part of Intel Corp...
. Intel Cilk Plus adopts simplifications, proposed by Cilk Arts in Cilk++, to eliminate the need for several of the original Cilk keywords while adding the ability to spawn functions and to deal with variables involved in reduction operations. Intel Cilk Plus differs from Cilk and Cilk++ by adding array extensions, being incorporated in a commercial compiler (from Intel), and compatibility with existing debuggers.
Intel has stated its desire to refine Cilk Plus and to enable it to be implemented by other compilers to gain industry wide adoption. Work has started on a gcc implementation based in part on Intel's run time code contributed to open source by Intel which includes code they acquired from Cilk Arts plus additions by Intel and former Cilk Arts employees.
Re-licencing
Intel has announced that it is maintaining Cilk Plus as a branch of GCCGNU Compiler Collection
The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...
4.7. The runtime is available dual-licenced, including BSD-3.
External links
- Cilk Project website at MIT
- Cilk Arts, Inc.
- Cilk++ Technology Overview
- Cilk++ Product Demonstration
- Multicore Programming blog
- e-Book on Multicore Programming e-Book outlining multicore programming challenges, and the leading programming approaches to deal with them.
- Whitepaper on Multicore Programming in the Java Virtual Machine (JVM)