Icon programming language
Icon is a very high-level programming language featuring goal-directed execution and many facilities for managing strings and textual patterns. It is related to SNOBOL and SL5, both string processing languages. Icon is not object-oriented, but an object-oriented extension called Idol was developed in 1996, which eventually became Unicon.
Basic syntax
The Icon language is derived from the ALGOL class of structured programming languages, and thus has syntax similar to C or Pascal. Icon is most similar to Pascal, using := syntax for assignments, the procedure keyword and similar syntax. On the other hand, Icon uses C-style braces for structuring execution groups, and programs start by running a procedure called "main".

In many ways Icon also shares features with most scripting languages (as well as SNOBOL and SL5, from which they were taken): variables do not have to be declared, types are converted automatically, and numbers can be converted to strings and back automatically. Another feature common to many scripting languages, but not all, is the lack of a required line-ending character; in Icon, a line not ended by a semicolon is ended by an implied semicolon if it makes sense.

Procedures are the basic building blocks of Icon programs. Although they use Pascal naming, they work more like C functions and can return values; there is no function keyword in Icon.
procedure doSomething(aString)
   write(aString)
end
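To illustrate that a procedure can hand a value back to its caller, here is a minimal sketch. The procedure name larger and its parameters are hypothetical, chosen only for this example.

```icon
# Hypothetical helper: returns the larger of its two arguments.
procedure larger(a, b)
   if a > b then return a
   return b
end

# Execution starts in main.
procedure main()
   write(larger(3, 7))     # writes 7
end
```

The call to larger is used directly as the argument to write, much as a C function call would be.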
Goal-directed execution
One of Icon's key concepts is that control structures are based on the "success" or "failure" of expressions, rather than on boolean logic, as in most other programming languages. Under this model, a simple comparison like if a < b does not mean "if the operations to the right evaluate to true" as it would under most languages; instead it means something more like "if the operations to the right succeed". In this case the < operator succeeds if the comparison is true, so the end result is the same. In addition, the < operator returns its second argument if it succeeds, allowing things like if a < b < c, a common type of comparison that in most languages must be written as a conjunction of two inequalities like if a < b && b < c.

The utility of this concept becomes much clearer when you consider real-world examples. Since Icon uses success or failure for all flow control, this simple code:
if a := read() then write(a)
will copy one line of the standard input to standard output. What's interesting about this example is that the code will work even if the read causes an error, for instance, if the file does not exist. In that case the statement
a := read()
will fail, and write will simply not be called.

Success and failure are passed "up" through functions, meaning that a failure inside a nested function call will cause the functions calling it to fail as well. For instance, we can write a program to copy an entire input file to output in a single line:
while write(read())
When the read command fails, at the end of file for instance, the failure will be passed up the chain and write will fail as well. The while, being a control structure, stops on failure, meaning it stops when the input is exhausted. For comparison, consider a similar example written in Java-based pseudocode:
try {
    while ((a = read()) != EOF) {
        write(a);
    }
} catch (Exception e) {
    // do nothing, exit the loop
}
In this case there are two comparisons needed, one for end of file (EOF) and another for all other errors. Since Java does not allow errors to be compared as logic elements, as under Icon, the lengthy try/catch syntax must be used instead. Try blocks may also impose a performance penalty simply for being used, even if no error occurs, a distributed cost that Icon avoids.

Icon refers to this concept as goal-directed execution, referring to the way that execution continues until some goal is reached. In the example above the goal is to read the entire file; the read command continues to succeed while there is more information to be read, and fails when there isn't. The goal is thus coded directly in the language, instead of using statements checking return codes or similar constructs.
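The behaviour of comparisons under success and failure can be sketched as follows. The variable names and values are arbitrary and chosen only for illustration.

```icon
procedure main()
   a := 1; b := 5; c := 10

   # a < b succeeds and produces b, which is then compared with c.
   if a < b < c then write("b is between a and c")

   # A successful comparison produces its right operand.
   write(a < b)            # writes 5

   # A failing comparison produces no value, so write is never called.
   write(b < a)

   # Failure is not an error; execution simply continues.
   write("done")
end
```

This writes "b is between a and c", then 5, then "done". The line containing b < a produces no output at all.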
Generators
Expressions in Icon often return a single value; for instance, x < 5 will evaluate and succeed if the value of x is less than 5, or fail otherwise. However, several of the examples below rely on the fact that many expressions do not immediately return success or failure, returning values in the meantime. This drives the examples with every and to; every causes to to continue to return values until it fails.

This is a key concept in Icon, known as generators. Generators drive much of the loop functionality in the language, but do so more directly; the programmer does not write a loop and then pull out and compare values, as Icon does all of this automatically.
Within the parlance of Icon, the evaluation of an expression or function results in a result sequence. A result sequence contains all the possible values that can be generated by the expression or function. When the result sequence is exhausted (i.e. there are no more values within the result sequence), the expression or function fails. Iteration over the result sequence is achieved either implicitly via Icon's goal-directed evaluation or explicitly via the every clause.
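The two ways of iterating over a result sequence can be sketched as follows, using the to generator mentioned above. The bounds and the variable name i are arbitrary.

```icon
procedure main()
   # Explicit iteration: every resumes (1 to 3) until it is exhausted.
   every write(1 to 3)                     # writes 1, 2, 3

   # Implicit iteration: the comparison fails for 1 through 4, so
   # goal-directed evaluation resumes the generator until i is 5.
   if (i := 1 to 10) > 4 then write(i)     # writes 5
end
```

In the second statement no loop is written at all; the search for a value that satisfies the comparison is carried out by the evaluation mechanism itself.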
Icon includes several generator-builders. The alternator syntax allows a series of items to be generated in sequence:
1 | "hello" | x < 5
can generate "1", "hello", and "5" if x is less than 5. Alternators can be read as "or" in many cases, for instance:
if y < (x | 5) then write("y=", y)
will write out the value of y if it is smaller than x or 5. Internally Icon checks every value from left to right until one succeeds, or until the list is exhausted and the expression fails. Remember that a function is not called if the evaluation of any of its arguments fails, so this example can be shortened to:
write("y=", (x | 5) > y)
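A small runnable sketch of alternation, with values for x and y picked only for this example:

```icon
procedure main()
   x := 3
   y := 4

   # every resumes the alternation, so each alternative is produced in turn.
   every write(1 | "hello" | x < 5)        # writes 1, hello, 5

   # y < x fails, so the alternation is resumed and y < 5 is tried next.
   if y < (x | 5) then write("y=", y)      # writes y=4

   # Shortened form: the comparison produces its right operand, y.
   write("y=", (x | 5) > y)                # writes y=4
end
```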
Another simple generator is to, which generates a sequence of integers; every write(1 to 10) will do exactly what it seems to. The bang syntax generates every item of a list or string; every write(!aString) will output each character of aString on a new line.

To demonstrate the power of this concept, consider string operations. Most languages include a function known as find or indexOf that returns the location of a string within another. Consider:
s = "All the world's a stage. And all the men and women merely players";
i = indexOf("the", s)
This code will return 4, the position of the first occurrence of the word "the". To get the next instance of "the" an alternate form must be used,
i = indexOf("the", s, 5)
the 5 at the end saying it should look from position 5 on. In order to extract all the occurrences of "the", a loop must be used...
s = "All the world's a stage. And all the men and women merely players";
i = indexOf("the", s)
while i != -1 {
    write(i);
    i = indexOf("the", s, i+1);
}
Under Icon the find function is a generator, and will return the next instance of the string each time it is resumed before finally failing after it passes the end of the string. The same code under Icon can be written:
s := "All the world's a stage. And all the men and women merely players"
every write(find("the",s))
find will return the index of the next instance of "the" each time it is resumed by every, eventually passing the end of the string and failing. As in the prior example, this will cause write to fail, and the (one-line) every loop to exit.
Of course there are times where you deliberately want to find a string after some point in input, for instance, you might be scanning a text file containing data in multiple columns. Goal-directed execution works here as well, and can be used this way:
write(5 < find("the", s))
The position will only be returned if "the" appears after position 5; the comparison will fail otherwise, passing that failure to write as before. There is one small "trick" to this code that needs to be considered: comparisons return the right hand result, so it is important to put the find on the right hand side of the comparison. If the 5 were placed on the right, 5 would be written.
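Putting the find examples together gives the following sketch. Icon string positions start at 1, so the two occurrences of "the" in this sentence are at positions 5 and 34.

```icon
procedure main()
   s := "All the world's a stage. And all the men and women merely players"

   # find is resumed by every until it runs out of matches.
   every write(find("the", s))     # writes 5, then 34

   # 5 < 5 fails, find is resumed, and 5 < 34 produces 34.
   write(5 < find("the", s))       # writes 34

   # With the operands swapped, the right operand 5 is produced instead.
   write(find("the", s) > 5)       # writes 5
end
```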
Icon adds several control structures for looping through generators. The every control structure is similar to while, looping through every item returned by a generator and exiting on failure:
every k := i to j do
   write(someFunction(k))
Why use every instead of a while loop in this case? Because while re-evaluates its expression from the start on each iteration and so only ever sees the first result, whereas every resumes the generator and produces all of its results.

The every syntax actually injects values into the function in a fashion similar to blocks under Smalltalk. For instance, the above loop can be re-written this way:
every write(someFunction(i to j))
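The difference between every and while can be seen in a short sketch. The counter named count is an assumption added here purely so that the while loop terminates; without it the loop would run forever.

```icon
procedure main()
   # every resumes the generator (1 to 3) until it is exhausted.
   every i := 1 to 3 do
      write("every: ", i)           # writes every: 1, every: 2, every: 3

   # while evaluates (1 to 3) afresh on each iteration, so i is always 1.
   count := 0
   while i := 1 to 3 do {
      write("while: ", i)           # writes while: 1 (twice)
      if (count +:= 1) >= 2 then break
   }
end
```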
Users can build new generators easily using the suspend keyword:
procedure findOnlyOdd(pattern, theString)
   every i := find(pattern, theString) do
      if i % 2 = 1 then suspend i
end
This example loops over theString using find to look for pattern. When one is found, and the position is odd, the location is returned from the function with suspend. Unlike return, suspend writes down where it is in the internal generators as well, allowing it to pick up where it left off on the next iteration.
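A user-defined generator is driven in the same way as a built-in one. The sketch below repeats findOnlyOdd so that it is self-contained; the test strings are chosen only for illustration.

```icon
procedure findOnlyOdd(pattern, theString)
   every i := find(pattern, theString) do
      if i % 2 = 1 then suspend i
end

procedure main()
   s := "All the world's a stage. And all the men and women merely players"

   # "the" occurs at positions 5 and 34; only the odd one is suspended.
   every write(findOnlyOdd("the", s))          # writes 5

   # "na" occurs at positions 3 and 5 in "banana", both odd.
   every write(findOnlyOdd("na", "banana"))    # writes 3, then 5
end
```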
Strings
In keeping with its script-like functionality, Icon adds a number of features to make working with strings easier. Most notable among these is the scanning system, which repeatedly calls functions on a string:
s ? write(find("the"))
is a short form of the examples shown earlier. In this case the subject of the find function is placed outside the parameters in front of the question mark. Icon functions are deliberately (as opposed to automatically) written to identify the subject in parameter lists and allow them to be pulled out in this fashion.

Substrings can be extracted from a string by using a range specification within brackets. A range specification can refer to a single character or to a slice of the string. Strings can be indexed from either the right or the left. It is important to note that positions within a string lie between the characters, as in 1A2B3C4, and can also be specified from the right, as in -3A-2B-1C0.
For example:
"Wikipedia"[1]     ==> "W"
"Wikipedia"[3]     ==> "k"
"Wikipedia"[-1]    ==> "a"
"Wikipedia"[1:3]   ==> "Wi"
"Wikipedia"[-2:0]  ==> "ia"
"Wikipedia"[2+:3]  ==> "iki"
The last example shows the use of a length instead of an ending position.
The subscripting specification can be used as an lvalue within an expression. This can be used to insert strings into another string or delete parts of a string. For example,
s := "abc"
s[2] := "123"
# s now has a value of "a123c"

s := "abcdefg"
s[3:5] := "ABCD"
# s now has a value of "abABCDefg"

s := "abcdefg"
s[3:5] := ""
# s now has a value of "abefg"
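Subscripting and substring assignment can be combined in one short program. This is a sketch only; the replacement text is arbitrary.

```icon
procedure main()
   s := "Wikipedia"

   write(s[1:3])            # writes Wi
   write(s[-2:0])           # writes ia
   write(s[2+:3])           # writes iki

   # Assigning to a slice replaces it; the string grows as needed.
   s[1:5] := "Encyclo"      # replaces "Wiki"
   write(s)                 # writes Encyclopedia
end
```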
Other structures
Icon also allows the user to easily construct their own lists (or arrays):
aCat := ["muffins", "tabby", 2002, 8]
The items within a list can be of any sort, including other structures. To quickly build larger lists, Icon includes the list function; i := list(10, "word") generates a list containing 10 copies of "word".

Like arrays in other languages, Icon allows items to be looked up by position, e.g., weight := aCat[4]. The bang syntax, e.g., every write(!aCat), will print out four lines, each with one element.

Icon includes stack-like functions, push and pop, which allow lists to form the basis of stacks and queues.

Icon also includes functionality for sets and tables (known as hashes, associative arrays, dictionaries, etc.):
symbols := table(0)
symbols["there"] := 1
symbols["here"] := 2
This code creates a table that will use zero as the default value of any unknown key. It then adds two items into it, with the keys "there" and "here", and values 1 and 2.
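The list and table operations described in this section can be sketched together. The variable names stack and counts, and the sample words, are assumptions made for this example.

```icon
procedure main()
   # A list used as a stack: push and pop both work on the left end.
   stack := []
   push(stack, "a")
   push(stack, "b")
   write(pop(stack))                # writes b
   write(*stack)                    # writes 1, the remaining size

   # A table with default value 0, used to count words.
   counts := table(0)
   every word := !["here", "there", "here"] do
      counts[word] +:= 1
   write(counts["here"])            # writes 2
   write(counts["there"])           # writes 1
   write(counts["nowhere"])         # writes 0, the default value
end
```

Because unknown keys read as the default value, the counting loop needs no special case for a word seen for the first time.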
String scanning
One of the powerful features of Icon is string scanning. The string scanning operator, ?, saves the current string scanning environment and creates a new string scanning environment. The string scanning environment consists of two keyword variables, &subject and &pos, where &subject is the string being scanned and &pos is the cursor or current position within the subject string.

For example:
s := "this is a string"
s ? write("subject=[",&subject,"] pos=[",&pos,"]")
would produce
subject=[this is a string] pos=[1]
Built-in and user defined functions can be used to move around within the string being scanned. Many of the built in functions will default to &subject and &pos (for example the find function). The following, for example, will write all blank delimited "words" in a string.
s := "this is a string"
s ? {                               # Establish string scanning environment
   while not pos(0) do {            # Test for end of string
      tab(many(' '))                # Skip past any blanks
      word := tab(upto(' ') | 0)    # The next word is up to the next blank -or- the end of the line
      write(word)                   # Write the word
   }
}
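The same scanning functions can split a string around a delimiter. The following is a minimal sketch; the input string and the variable names line, key and value are hypothetical.

```icon
procedure main()
   line := "name=Icon"
   line ? {
      key := tab(upto('='))     # text before the "=", moving &pos to it
      move(1)                   # step over the "="
      value := tab(0)           # the rest of the subject
   }
   write(key, " -> ", value)    # writes name -> Icon
end
```

Each call to tab returns the substring between the old and new positions, so the scan reads left to right without any explicit index arithmetic.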
A more complicated example demonstrates the integration of generators and string scanning within the language.
procedure main()
   s := "Mon Dec 8"
   s ? write(Mdate() | "not a valid date")
end

# Define a matching function that returns
# a string that matches a day month dayofmonth
procedure Mdate()
   # Define some initial values
   static dates
   static days
   initial {
      days := ["Mon","Tue","Wed","Thu","Fri","Sat","Sun"]
      dates := ["Jan","Feb","Mar","Apr","May","Jun",
                "Jul","Aug","Sep","Oct","Nov","Dec"]
   }
   every suspend (retval <- tab(match(!days)) ||    # Match a day
                            =" " ||                 # Followed by a blank
                            tab(match(!dates)) ||   # Followed by the month
                            =" " ||                 # Followed by a blank
                            matchdigits(2)          # Followed by at most 2 digits
                 ) &
                 (=" " | pos(0) ) &                 # Either a blank or the end of the string
                 retval                             # And finally return the string
end

# Matching function that returns a string of at most n digits
procedure matchdigits(n)
   suspend (v := tab(many(&digits)) & *v <= n) & v
end
The idiom expr1 & expr2 & expr3 returns the value of the last expression.

External links
- Icon homepage
- Oral history interview with Stephen Wampler, Charles Babbage Institute, University of Minnesota. Wampler discusses his work on the development of the Icon programming language in the late 1970s at the University of Arizona under Ralph Griswold.
- Oral history interview with Robert Goldberg, Charles Babbage Institute, University of Minnesota. Goldberg discusses his interaction with Ralph Griswold when working on the Icon programming language.
- Oral history interview with Kenneth Walker, Charles Babbage Institute, University of Minnesota. Walker describes the work environment of the Icon project, his interactions with Ralph Griswold, and his own work on an Icon compiler.
- Oral history interview with Robert Goldberg, Charles Babbage Institute, University of Minnesota. Goldberg describes his use of Icon in the classroom at Illinois Institute of Technology.
- The Icon Programming Language page on the Rosetta Code comparative programming tasks project site: http://rosettacode.org/wiki/Category:Icon