C preprocessor
The C preprocessor is the preprocessor
for the C and C++ computer programming languages. The preprocessor handles directives for source file inclusion (#include), macro definitions (#define), and conditional inclusion (#if).
In many C implementations, it is a separate program invoked by the compiler as the first part of translation.
The language of preprocessor directives is agnostic to the grammar of C, so the C preprocessor can also be used independently to process other kinds of text files.
Phases
The first four (of eight) phases of translation specified in the C Standard are:
- Trigraph replacement: The preprocessor replaces trigraph sequences with the characters they represent.
- Line splicing: Physical source lines that are continued with escaped newline sequences are spliced to form logical lines.
- Tokenization: The preprocessor breaks the result into preprocessing tokens and whitespace. It replaces comments with whitespace.
- Macro expansion and directive handling: Preprocessing directive lines, including file inclusion and conditional compilation, are executed. The preprocessor simultaneously expands macros and, in the 1999 version of the C standard, handles _Pragma operators.
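The line-splicing and tokenization phases listed above can be seen in a minimal sketch. The macro SUM_OF_THREE and the function sum_of_three are hypothetical names chosen only for illustration. Trigraphs are left out because support for them varies between implementations.

```c
/* A backslash placed directly before the newline splices the next
   physical line onto this one, so the definition is one logical line. */
#define SUM_OF_THREE \
    (1 + 2 /* a comment is replaced by a single space */ + 3)

/* Hypothetical helper that exposes the expanded value. */
int sum_of_three(void)
{
    return SUM_OF_THREE;
}
```

After these phases the compiler proper sees only the expression (1 + 2 + 3) in the function body.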
Including files
One of the most common uses of the preprocessor is to include another file. The preprocessor replaces the line #include <stdio.h> with the system header file of that name, which declares the printf function among other things. More precisely, the entire text of the file 'stdio.h' replaces the #include directive.
This can also be written using double quotes, e.g. #include "stdio.h". If the filename is enclosed within angle brackets, the file is searched for in the standard compiler include paths. If the filename is enclosed within double quotes, the search path is expanded to include the current source directory. C compilers and programming environments all have a facility which allows the programmer to define where include files can be found. This can be introduced through a command line flag, which can be parameterized using a makefile, so that a different set of include files can be swapped in for different operating systems, for instance.
By convention, include files are given a .h extension, and files not included by others are given a .c extension. However, there is no requirement that this be observed. Occasionally, files with other extensions are included: files with a .def extension may denote files designed to be included multiple times, each time expanding the same repetitive content; #include "icon.xbm" is likely to refer to an XBM image file (which is at the same time a C source file).
#include often compels the use of #include guards or #pragma once to prevent double inclusion.
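The effect of an #include guard can be sketched in a single file. Because #include pastes the text of the header in place, the guarded text below is simply written twice to simulate a header being included twice. The header name "point.h", the guard macro POINT_H and the function point_sum are assumptions made for this illustration.

```c
/* The guarded text stands in for a hypothetical header "point.h". */

/* ---- first inclusion: POINT_H is not yet defined ---- */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif /* POINT_H */

/* ---- second inclusion: POINT_H is defined, so the body is skipped ---- */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif /* POINT_H */

/* Without the guard, the second struct definition would be rejected
   as a redefinition. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```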
Conditional compilation
The #if, #ifdef, #ifndef, #else, #elif and #endif directives can be used for conditional compilation.
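A minimal sketch of these directives follows. VERBOSE is a hypothetical configuration macro, and verbosity_level is a hypothetical function. The default of 1 applies only when the macro has not already been defined, for example on the compiler's command line with -DVERBOSE=2.

```c
/* Provide a default only if VERBOSE was not defined elsewhere. */
#ifndef VERBOSE
#define VERBOSE 1
#endif

/* Exactly one of the three return statements survives preprocessing. */
int verbosity_level(void)
{
#if VERBOSE >= 2
    return 2;   /* trace output requested */
#elif VERBOSE == 1
    return 1;   /* normal output */
#else
    return 0;   /* quiet */
#endif
}
```

The branches that are not selected are removed before compilation, so they cost nothing at run time.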
Most compilers targeting Microsoft Windows implicitly define _WIN32. This allows code, including preprocessor commands, to compile only when targeting Windows systems. A few compilers define WIN32 instead. For such compilers that do not implicitly define the _WIN32 macro, it can be specified on the compiler's command line, using -D_WIN32.
A typical example tests whether the macro __unix__ is defined and, if it is, includes the file <unistd.h>. Otherwise, it tests whether the macro _WIN32 is defined and, if it is, includes the file <windows.h>.
A more complex #if directive can use operators in its controlling expression.
Translation can also be caused to fail by using the #error directive.
Macro definition and expansion
There are two types of macros, object-like and function-like. Object-like macros do not take parameters; function-like macros do. The generic syntax for declaring an identifier as a macro of each type is, respectively, #define <identifier> <replacement token list> and #define <identifier>(<parameter list>) <replacement token list>. The function-like macro declaration must not have any whitespace between the identifier and the first, opening, parenthesis. If whitespace is present, the macro will be interpreted as object-like with everything starting from the first parenthesis added to the token list.
Whenever the identifier appears in the source code it is replaced with the replacement token list, which can be empty. For an identifier declared to be a function-like macro, it is only replaced when the following token is also a left parenthesis that begins the argument list of the macro invocation. The exact procedure followed for expansion of function-like macros with arguments is subtle.
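The two kinds of macro, and the whitespace pitfall described above, can be sketched as follows. The names PI, DOUBLE_WRONG, half_turn_in_degrees and sum_in_degrees are illustrative assumptions. RADTODEG is written here as one plausible radians-to-degrees definition, with 57.29578 approximating 180 divided by pi.

```c
/* Object-like macro: a symbolic name for a constant. */
#define PI 3.14159

/* Function-like macro: no whitespace between the name and '('.
   The parentheses around x keep an argument such as a + b from
   being split apart by operator precedence. */
#define RADTODEG(x) ((x) * 57.29578)

/* Pitfall: with a space before '(' this is an OBJECT-like macro whose
   replacement list is "(x) ((x) * 2)". It is deliberately left unused,
   because expanding it would not compile. */
#define DOUBLE_WRONG (x) ((x) * 2)

double half_turn_in_degrees(void)
{
    return RADTODEG(PI);     /* expands to ((3.14159) * 57.29578) */
}

double sum_in_degrees(double a, double b)
{
    return RADTODEG(a + b);  /* expands to ((a + b) * 57.29578) */
}
```

Had the replacement list been written as x * 57.29578 without the inner parentheses, RADTODEG(a + b) would expand to a + b * 57.29578 and only b would be scaled.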
Object-like macros were conventionally used as part of good programming practice to create symbolic names for constants, instead of hard-coding the numbers throughout the code. An alternative in both C and C++, especially in situations in which a pointer to the number is required, is to apply the const qualifier to a global variable. This causes the value to be stored in memory, instead of being substituted by the preprocessor.
An example of a function-like macro is RADTODEG(x), which defines a radians-to-degrees conversion by multiplying its argument by a constant. It can be inserted in the code where required, e.g., RADTODEG(34). This is expanded in-place, so that repeated multiplication by the constant is not shown throughout the code. The macro is written as all uppercase to emphasize that it is a macro, not a compiled function. In the replacement list, x is enclosed in its own pair of parentheses, avoiding calculations in the wrong order if an expression instead of a single value is passed.
Certain symbols are required to be defined by an implementation during preprocessing. These include __FILE__ and __LINE__, predefined by the preprocessor itself, which expand into the current file and line number. For instance, a debugging macro can print the value of a variable x to the error stream, preceded by the file and line number, allowing quick access to which line the message was produced on. In such a macro, a string-literal macro such as WHERESTR is concatenated with the string literal following it.
The first C Standard specified that the macro __STDC__ be defined to 1 if the implementation conforms to the ISO Standard and 0 otherwise, and the macro __STDC_VERSION__ be defined as a numeric literal specifying the version of the Standard supported by the implementation. Standard C++ compilers support the __cplusplus macro. Compilers running in non-standard mode must not set these macros, or must define others to signal the differences.
Other Standard macros include __DATE__, the current date, and __TIME__, the current time.
The second edition of the C Standard, C99, added support for __func__, which contains the name of the function definition within which it is contained, but because the preprocessor is agnostic to the grammar of C, this must be done in the compiler itself using a variable local to the function.
Macros that can take a varying number of arguments (variadic macros) are not allowed in C89, but were introduced by a number of compilers and standardised in C99. Variadic macros are particularly useful when writing wrappers to functions taking a variable number of parameters, such as printf, for example when logging warnings and errors.
One little-known usage pattern of the C preprocessor is known as "X-Macros". An X-Macro is a header file. Commonly these use the extension ".def" instead of the traditional ".h". This file contains a list of similar macro calls, which can be referred to as "component macros". The include file is then referenced repeatedly.
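The X-Macro idea can be sketched in a single file. This is a variant, not the header-file form described above: the list is held in a macro (COLOR_LIST) instead of a separate ".def" file, so that the example is self-contained. All names, including the file name "colors.def" mentioned in the comment, are hypothetical. The sketch also relies on the standard # (stringize) and ## (token-pasting) operators.

```c
#include <stddef.h>

/* In the header-file form this list would live in a separate file
   (hypothetically "colors.def") and be pulled in with #include each
   time it is needed. Each X(...) line is a component macro call. */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

/* First expansion: generate the enumerators COLOR_RED, COLOR_GREEN, ... */
#define X(name) COLOR_##name,
enum color { COLOR_LIST COLOR_COUNT };
#undef X

/* Second expansion: generate a matching table of names. */
#define X(name) #name,
static const char *const color_names[] = { COLOR_LIST };
#undef X

const char *color_name(enum color c)
{
    return ((size_t)c < (size_t)COLOR_COUNT) ? color_names[c] : "unknown";
}
```

Adding a new entry to the list updates the enumeration and the name table together, which is the main attraction of the pattern.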
Compiler-specific predefined macros are usually listed in the compiler documentation, although this is often incomplete.
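Predefined macros, whether standard ones such as __FILE__ and __LINE__ or compiler-specific ones, are typically consumed through #if tests and wrapper macros. The following is a minimal sketch under stated assumptions: COMPILER_NAME, compiler_name, format_at and FORMAT_HERE are hypothetical names, while __clang__, __GNUC__ and _MSC_VER are macros documented by Clang, GCC and Microsoft's compiler respectively.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Probe a few compiler-specific macros. __clang__ is tested first
   because Clang also defines __GNUC__. */
#if defined(__clang__)
#define COMPILER_NAME "clang"
#elif defined(__GNUC__)
#define COMPILER_NAME "gcc"
#elif defined(_MSC_VER)
#define COMPILER_NAME "msvc"
#else
#define COMPILER_NAME "unknown"
#endif

const char *compiler_name(void)
{
    return COMPILER_NAME;
}

/* Hypothetical helper: writes "[file:line] " followed by the formatted
   message into buf. Returns the length the full message needs, or -1
   on error or if the prefix alone does not fit. */
static int format_at(char *buf, size_t size, const char *file, int line,
                     const char *fmt, ...)
{
    va_list ap;
    int prefix = snprintf(buf, size, "[%s:%d] ", file, line);
    int body;

    if (prefix < 0 || (size_t)prefix >= size)
        return -1;
    va_start(ap, fmt);
    body = vsnprintf(buf + prefix, size - (size_t)prefix, fmt, ap);
    va_end(ap);
    return body < 0 ? -1 : prefix + body;
}

/* Variadic macro (C99): __FILE__ and __LINE__ expand at the call site,
   and __VA_ARGS__ carries the format string and its arguments. */
#define FORMAT_HERE(buf, size, ...) \
    format_at((buf), (size), __FILE__, __LINE__, __VA_ARGS__)
```

A call such as FORMAT_HERE(buf, sizeof buf, "x=%d", x) records the location of the call itself, which a plain function could not do without being passed the file and line explicitly.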
Some compilers can be made to dump at least some of their useful predefined macros, for example:
- GNU C Compiler: gcc -dM -E - < /dev/null
- HP-UX ANSI C compiler: cc -v fred.c (where fred.c is a simple test file)
- SCO OpenServer C compiler: cc -## fred.c (where fred.c is a simple test file)
- Sun Studio C/C++ compiler: cc -## fred.c (where fred.c is a simple test file)
- IBM AIX XL C/C++ compiler: cc -qshowmacros -E fred.c (where fred.c is a simple test file)
User-defined compilation errors and warnings
The #error directive outputs a message through the error stream.
Compiler-specific preprocessor features
The #pragma directive is a compiler-specific directive which compiler vendors may use for their own purposes. For instance, a #pragma is often used to allow suppression of specific error messages, manage heap and stack debugging and so on.
C99 introduced a few standard #pragma directives, taking the form #pragma STDC ..., which are used to control the floating-point implementation.
- Many implementations do not support trigraphs or do not replace them by default.
- Many implementations (including, e.g., the C compilers by GNU, Intel, and IBM) provide a non-standard #warning directive to print out a warning message in the output, but not stop the compilation process. A typical use is to warn about the usage of some old code, which is now deprecated and only included for compatibility reasons.
- Some Unix preprocessors traditionally provided "assertions", which have little similarity to assertions used in programming.
- GCC provides #include_next for chaining headers of the same name.
- Objective-C preprocessors have #import, which is like #include but only includes the file once.
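The #error and #warning directives described in this and the preceding section are usually placed inside a conditional so that they fire only for a bad configuration. In the sketch below, BUFFER_SIZE, USE_LEGACY_API and buffer_size are hypothetical names. The #warning line sits in a branch that is skipped here, so it is never executed, which matters because the directive is described above as non-standard.

```c
/* BUFFER_SIZE and USE_LEGACY_API are hypothetical configuration macros. */
#define BUFFER_SIZE 128

/* #error stops translation with the given message when its branch is
   taken. With BUFFER_SIZE set to 128 the branch is skipped. */
#if BUFFER_SIZE < 64
#error BUFFER_SIZE must be at least 64
#endif

/* #warning only reports a message and lets translation continue.
   USE_LEGACY_API is not defined here, so this branch is skipped too. */
#ifdef USE_LEGACY_API
#warning USE_LEGACY_API is deprecated and kept only for compatibility
#endif

int buffer_size(void)
{
    return BUFFER_SIZE;
}
```

Changing BUFFER_SIZE to a value below 64 would make translation fail at the #error line with the message shown.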
As a general-purpose preprocessor (GPP)
Since the C preprocessor can be invoked independently to process files other than those containing to-be-compiled source code, it can also be used as a "general purpose preprocessor" (GPP) for other types of text processing. One particularly notable example is the now-deprecated imake system.
GPP does work acceptably with most assembly languages. GNU mentions assembly as one of the target languages among C, C++ and Objective-C in the documentation of its implementation of the preprocessor. This requires that the assembler syntax not conflict with GPP syntax, which means no lines starting with # and that double quotes, which cpp interprets as string literals and thus ignores, have no syntactical meaning other than that.
However, since the C preprocessor does not have features of other preprocessors, such as recursive macros, selective expansion according to quoting, string evaluation in conditionals, and Turing completeness, it is very limited in comparison to a more modern, true GPP such as m4. For instance, the inability to define macros using other macros requires code to be broken into more sections than would otherwise be required.
See also
- C syntax
- Make
- Preprocessor
- m4 (computer language)
External links
- ISO/IEC 9899. The official C:1999 standard, along with defect reports and a rationale. As of 2005 the latest version is ISO/IEC 9899:TC2.
- GNU CPP online manual
- Visual Studio .NET preprocessor reference
- Pre-defined C/C++ Compiler Macros project: lists "various pre-defined compiler macros that can be used to identify standards, compilers, operating systems, hardware architectures, and even basic run-time libraries at compile-time"