Heredoc
A here document is a way of specifying a string literal
in command line shells including Unix shells (sh, csh, ksh, Bash and zsh) and in programming or scripting languages such as Perl, PHP, Python, PowerShell and Ruby. It preserves the line breaks and other whitespace (including indentation) in the text. Some languages allow variable substitution and command substitution inside the string.
The most common syntax for here documents is << followed by a delimiting identifier, followed, starting on the next line, by the text to be quoted, and then closed by the same identifier on its own line. Under the Unix shells, here documents are generally used as a way of providing input to commands.
Specific implementations
The following sections provide examples of specific implementations in different programming languages and environments. The most common syntax is substantially similar to that used by the Unix shell, but some tools provide similar functionality by means of different conventions, and under different names.
Unix Shells
In the following example, text is passed to the tr command using a here document.
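A minimal sketch of such an invocation follows. The lowercase input lines and the a-z to A-Z character sets are assumptions inferred from the output described in the text, and the `upper` variable is introduced only so the result can be inspected.

```shell
# Pass two lines of text to tr on its standard input via a here document.
# END_TEXT is the delimiting identifier; it must appear alone on the closing line.
upper=$(tr 'a-z' 'A-Z' <<END_TEXT
one two three
uno dos tres
END_TEXT
)

# Prints:
# ONE TWO THREE
# UNO DOS TRES
printf '%s\n' "$upper"
```

Typed interactively, the same command can be entered without the surrounding `$(...)`; tr then writes the converted text straight to the terminal.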
END_TEXT was used as the delimiting identifier. It specified the start and end of the here document. ONE TWO THREE and UNO DOS TRES are the output of tr after execution.
Appending a minus sign to the << has the effect that leading tabs are ignored. This allows here documents in shell scripts to be indented without changing their value. (Note that you will probably need to type CTRL-V, TAB to actually enter a TAB character on the command line.)
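The tab-stripping behaviour can be sketched as follows. Because literal TAB characters are easily lost when text is copied, this illustration generates them with printf and feeds the resulting script to a child shell; the `script` and `stripped` variable names are hypothetical.

```shell
# Build a small script whose here document body and closing delimiter
# are both indented with a real TAB character (written as \t for printf).
script=$(printf 'cat <<-EOF\n\tindented with a tab\n\tEOF\n')

# Run the script in a child shell; <<- removes the leading tabs from
# the body lines and from the line holding the closing delimiter.
stripped=$(printf '%s\n' "$script" | sh)

# Prints: indented with a tab
printf '%s\n' "$stripped"
```

Only leading tabs are removed; leading spaces would be kept as part of the text.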
By default variables and also commands in backticks are evaluated:
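As an illustration, with a hypothetical variable `name` and an `expr` call standing in for an arbitrary command:

```shell
name="world"

# With an unquoted delimiter, $name is expanded and the command in
# backticks is executed before the text reaches cat.
greeting=$(cat <<EOF
Hello, $name
Sum: `expr 2 + 3`
EOF
)

# Prints:
# Hello, world
# Sum: 5
printf '%s\n' "$greeting"
```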
This can be disabled by quoting any part of the label. For example by setting it in single or double quotes:
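A short sketch of the quoted form, again using a hypothetical variable `name`; the only difference from an unquoted here document is the single quotes around the delimiter:

```shell
name="world"

# Quoting the delimiter ('EOF') turns off expansion, so the body is
# passed to cat exactly as written.
literal=$(cat <<'EOF'
Hello, $name
Sum: `expr 2 + 3`
EOF
)

# Prints the text unchanged, including $name and the backticks.
printf '%s\n' "$literal"
```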
Also you can use a here-string in bash, ksh or zsh:
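For instance, assuming bash (the `<<<` operator is not available in a plain POSIX sh), a single string can be supplied as standard input like this:

```shell
#!/usr/bin/env bash
# A here-string passes one word, after the usual expansions, to the
# command's standard input with a trailing newline added.
upper=$(tr 'a-z' 'A-Z' <<< 'one two three')

# Prints: ONE TWO THREE
printf '%s\n' "$upper"
```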
Windows PowerShell
In Windows PowerShell, here documents are referred to as here-strings. A here-string is a string which starts with an open delimiter (@" or @') and ends with a close delimiter ("@ or '@) on a line by itself, which terminates the string. All characters between the open and close delimiter are considered the string literal.
Using a here-string with double quotes allows variables to be interpolated; using single quotes does not.
Variable interpolation occurs with simple variables (e.g. $x but NOT $x.y or $x[0]).
You can execute a set of statements by putting them in $() (e.g. $($x.y) or $(Get-Process | Out-String)).
In the following PowerShell code, text is passed to a function using a here-string.
The function ConvertTo-UpperCase is defined as follows:
PS> function ConvertTo-UpperCase($string) { $string.ToUpper() }
PS> ConvertTo-UpperCase @'
>> one two three
>> eins zwei drei
>> '@
>>
ONE TWO THREE
EINS ZWEI DREI
Here is an example that demonstrates variable interpolation and statement execution using a here-string with double quotes:
$doc, $marty = 'Dr. Emmett Brown', 'Marty McFly'
$time = [DateTime]'Friday, October 25, 1985 8:00:00 AM'
$diff = New-TimeSpan -Minutes 25
@"
$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!
"@
Output:
Dr. Emmett Brown : Are those my clocks I hear?
Marty McFly : Yeah! Uh, it's 8 o'clock!
Dr. Emmett Brown : Perfect! My experiment worked! They're all exactly 25 minutes slow.
Marty McFly : Wait a minute. Wait a minute. Doc... Are you telling me that it's 08:25?
Dr. Emmett Brown : Precisely.
Marty McFly : Damn! I'm late for school!
Using a here-string with single quotes instead, the output would look like this:
$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!
D
Since version 2.0, D has support for delimiter strings using the 'q' prefix character. These strings begin and end with a delimiter character (any of (), <>, {} or []), or an identifier immediately followed by a newline.
The following D code shows 2 examples using delimiter characters and an identifier.
Using an identifier.
Perl
In Perl there are several different ways to invoke here docs. The delimiter around the tag has the same effect on the here doc as the delimiter on a regular string literal: double quotes around the tag allow variables to be interpolated, single quotes do not, and a tag without either behaves as if it were double-quoted. Using backticks as the delimiter runs the contents of the heredoc as a shell script. The end tag must be at the beginning of the line or it will not be recognized by the interpreter.
Note that the here doc does not start at the tag but on the next line, so the statement containing the tag continues after the tag.
Here is an example with double quotes:
Output:
Dear Spike,
I wish you to leave Sunnydale and never return.
Not Quite Love,
Buffy the Vampire Slayer
Here is an example with single quotes:
Output:
Dear $recipient,
I wish you to leave Sunnydale and never return.
Not Quite Love,
$sender
And an example with backticks (may not be portable):
PHP
In PHP, here documents are referred to as heredocs.
Outputs
This is a heredoc section.
For more information talk to Joe Smith, your local Programmer.
Thanks!
hey joe smith! you can actually assign the heredoc section to a variable!
The line containing the closing identifier must not contain any other characters, except an optional ending semicolon. Otherwise, it will not be considered to be a closing identifier, and PHP will continue looking for one. If a proper closing identifier is not found, a parse error will result with the line number being at the end of the script.
In PHP 5.3 and later, like Perl, it is possible to not interpolate variables by surrounding the tag with single quotes; this is called a nowdoc:
In PHP 5.3+ it is also possible to surround the tag with double quotes, which like Perl has the same effect as not surrounding the tag with anything at all.
Python
Python supports string literals delimited by single or double quotes repeated three times (i.e. ''' or """). These string literals can span multiple lines and support the functionality of here documents.
A simple Python 3 compatible example that yields the same result as the first Perl example above, is:
Replace the call to the print function with the keyword print in Python versions older than 3.0. The Template class described in PEP 292 (Simpler String Substitutions) provides similar functionality for variable interpolation and may be used in combination with the Python triple-quotes syntax.
Racket
Racket's here strings start with #<< followed by characters that define a terminator for the string.
The content of the string includes all characters between the #<< line and a line whose only content is the specified terminator. More precisely, the content of the string starts after a newline following #<<, and it ends before a newline that is followed by the terminator.
Outputs:
This is a simple here string in Racket.
* One
* Two
* Three
No escape sequences are recognized between the starting and terminating lines; all characters are included in the string (and terminator) literally.
Outputs:
This string spans for multiple lines
and can contain any Unicode symbol.
So things like λ, ☠, α, β, are all fine.
In the next line comes the terminator. It can contain any Unicode symbol as well, even spaces and smileys!
Here strings can be used in any context where normal strings can:
Outputs:
Dear Isaac,
Thanks for the insightful conversation yesterday.
Carl
An interesting alternative is to use the language extension at-exp to write @-expressions.
They look like this:
Outputs:
This is a long string,
very convenient when a
long chunk of text is
needed.
No worries about escaping
"quotes". It's also okay
to have λ, γ, θ, ...
Embed code: 7
An @-expression is neither specific nor restricted to strings; it is a syntax form that can be composed with the rest of the language.
Ruby
The following Ruby code displays a grocery list by using a here document.
The result:
$ ruby grocery-list.rb
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
* Organic
Writing to a file with a here document involves a profusion of chevron ('<') symbols:
Ruby also allows the delimiting identifier not to start in the first column of a line, if the start of the here document is marked with the slightly different starter "<<-".
In addition, Ruby treats here documents as double-quoted strings, so it is possible to use the #{} construct to interpolate code.
The following example illustrates both of these features:
Ruby also allows for starting multiple here documents in one line:
Tcl
Tcl has no special syntax for heredocs, because the ordinary string syntaxes already allow embedded newlines and preserve indentation. Brace-delimited strings have no substitution (interpolation):
Quote-delimited strings are substituted at runtime:
In brace-delimited strings, there is the restriction that they must be balanced with respect to unescaped braces. In quote-delimited strings, braces can be unbalanced but backslashes, dollar signs, and left brackets all trigger substitution, and the first unescaped double quote terminates the string.
Note that both of the above strings have a newline as their first and last character, since that is what comes immediately after the opening delimiter and immediately before the closing one.
string trim can be used to remove these if they are unwanted:
Similarly, string map can be used to effectively set up variant syntaxes, e.g. undoing a certain indentation or introducing nonstandard escape sequences to achieve unbalanced braces.
Microsoft NMAKE
In Microsoft NMAKE, here documents are referred to as inline files. Inline files are referenced as << or <<pathname: the first notation creates a temporary file, the second notation creates (or overwrites) the file with the specified pathname.
An inline file is terminated with << on a line by itself, optionally followed by the (case-insensitive) keyword KEEP or NOKEEP to indicate whether the created file should be kept.
See also
- tr (program) for information about tr(1)
- Pipeline (Unix) for information about pipes
- String literal
- Docstring
External links
- Here document. Link to Rosetta Code task with examples of here documents in over 15 languages.