Data Interchange Format
Data Interchange Format (DIF) is a text file format used to import and export single spreadsheets between spreadsheet programs (OpenOffice.org Calc, Excel, Gnumeric, StarCalc, Lotus 1-2-3, FileMaker, dBase, Framework, Multiplan, etc.). It is also known as "Navy DIF". One limitation is that the DIF format cannot handle multiple spreadsheets in a single workbook.
History
DIF was developed by Software Arts, Inc. (the developers of the VisiCalc program) in the early 1980s. The specification was included in many copies of VisiCalc and published in Byte Magazine. Bob Frankston developed the format, with input from others, including Mitch Kapor, who helped so that it could work with his VisiPlot program. (Kapor later went on to found Lotus and design Lotus 1-2-3.) The specification was copyrighted in 1981.
DIF was a registered trademark of Software Arts Products Corp. (a legal name for Software Arts at the time).
Syntax
DIF stores everything in an ASCII text file, which mitigated many of the cross-platform issues of its day. Modern spreadsheet software, e.g. OpenOffice.org Calc and Gnumeric, offers a choice of character encodings for export and import. The file is divided into two sections: header and data. Everything in DIF is represented by a two- or three-line chunk: header chunks have three lines and data chunks have two. A header chunk starts with a text identifier that is all caps, contains only alphabetic characters, and is fewer than 32 letters long. The following line must be a pair of numbers, and the third line must be a quoted string. A data chunk, by contrast, starts with a number pair, and the next line is a quoted string or a keyword.
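The chunk rules above can be checked mechanically. The following is a minimal sketch, not part of any DIF specification: the function names and regular expressions are hypothetical, and accepting a decimal second number in the pair is an assumption.

```python
import re

# Identifier: capital letters only, fewer than 32 of them.
IDENTIFIER = re.compile(r"[A-Z]{1,31}")
# Number pair such as "0,1" or "-1,0". Allowing a decimal part is an assumption.
NUMBER_PAIR = re.compile(r"-?\d+,-?\d+(?:\.\d+)?")
# Quoted string; an embedded double quote appears doubled, as in the example file.
QUOTED = re.compile(r'"(?:[^"]|"")*"')
# Keyword such as BOT, EOD, or V.
KEYWORD = re.compile(r"[A-Z]+")


def is_header_chunk(chunk):
    """True if three lines look like: identifier, number pair, quoted string."""
    if len(chunk) != 3:
        return False
    name, pair, text = (line.strip() for line in chunk)
    return bool(
        IDENTIFIER.fullmatch(name)
        and NUMBER_PAIR.fullmatch(pair)
        and QUOTED.fullmatch(text)
    )


def is_data_chunk(chunk):
    """True if two lines look like: number pair, then quoted string or keyword."""
    if len(chunk) != 2:
        return False
    pair, second = (line.strip() for line in chunk)
    return bool(
        NUMBER_PAIR.fullmatch(pair)
        and (QUOTED.fullmatch(second) or KEYWORD.fullmatch(second))
    )
```

These checks look only at the shape of each line; whether a keyword or identifier is meaningful depends on where the chunk appears in the file.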
Values
A value occupies two lines, the first a pair of numbers and the second either a string or a keyword. The first number of the pair indicates the type:
- −1 – directive type; the second number is ignored, and the following line is one of these keywords:
  - BOT – beginning of tuple (start of row)
  - EOD – end of data
- 0 – numeric type; the value is the second number, and the following line is one of these keywords:
  - V – valid
  - NA – not available
  - ERROR – error
  - TRUE – true boolean value
  - FALSE – false boolean value
- 1 – string type; the second number is ignored, and the following line is the string in double quotes
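As an illustration of the type codes listed above, a two-line value can be decoded as follows. This is a sketch under stated assumptions: `parse_value` and its return tags are hypothetical names, numbers are read as floats, and the undoubling of embedded quotes follows the example file rather than a quoted specification.

```python
def parse_value(pair_line, second_line):
    """Decode one two-line DIF value into a (kind, payload) tuple."""
    type_text, number_text = pair_line.strip().split(",", 1)
    value_type = int(type_text)
    second = second_line.strip()

    if value_type == -1:
        # Directive: the number is ignored; the keyword is BOT or EOD.
        return ("directive", second)

    if value_type == 0:
        # Numeric: the second number carries the value, the keyword qualifies it.
        if second == "V":
            return ("number", float(number_text))
        if second in ("TRUE", "FALSE"):
            return ("boolean", second == "TRUE")
        return ("special", second)  # NA or ERROR

    if value_type == 1:
        # String: drop the surrounding quotes and undouble embedded ones
        # (assumption based on the example file).
        return ("string", second[1:-1].replace('""', '"'))

    raise ValueError(f"unknown DIF value type: {value_type}")
```

For instance, `parse_value("0,-3", "V")` yields a number and `parse_value("-1,0", "BOT")` yields a directive marking the start of a row.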
Header chunk
A header chunk is composed of an identifier line followed by the two lines of a value.
- TABLE - a numeric value giving the version follows; the string line of the value, otherwise unused, contains a generator comment
- VECTORS - the number of columns follows as a numeric value
- TUPLES - the number of rows follows as a numeric value
- DATA - after a dummy 0 numeric value, the data for the table follow, each row preceded by a BOT value, the entire table terminated by an EOD value
The numeric values in header chunks use just an empty string instead of the validity keywords.
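The header items and the BOT/EOD framing described above can be sketched as a small writer. Everything here is illustrative: `write_dif` and `format_cell` are hypothetical names, the version number 1 is copied from the example file, and writing `None` as `0,0` with `NA` and booleans as `0,1`/`0,0` with `TRUE`/`FALSE` is an assumption.

```python
def format_cell(cell):
    """Return the two lines of a DIF value for one cell."""
    if cell is None:
        return ["0,0", "NA"]  # assumption: missing cells become NA
    if isinstance(cell, bool):  # checked before numbers: bool is an int subclass
        return ["0,1", "TRUE"] if cell else ["0,0", "FALSE"]
    if isinstance(cell, (int, float)):
        return [f"0,{cell}", "V"]
    # Strings: double any embedded quote, as in the example file.
    escaped = str(cell).replace('"', '""')
    return ["1,0", f'"{escaped}"']


def write_dif(rows, generator=""):
    """Serialize a rectangular list of rows to DIF text."""
    columns = len(rows[0]) if rows else 0
    lines = [
        "TABLE", "0,1", f'"{generator}"',   # version, generator comment
        "VECTORS", f"0,{columns}", '""',    # number of columns
        "TUPLES", f"0,{len(rows)}", '""',   # number of rows
        "DATA", "0,0", '""',                # dummy value before the data
    ]
    for row in rows:
        lines += ["-1,0", "BOT"]            # each row starts with BOT
        for cell in row:
            lines += format_cell(cell)
    lines += ["-1,0", "EOD"]                # the table ends with EOD
    return "\n".join(lines) + "\n"
```

Because some implementations swap the meaning of VECTORS and TUPLES, a writer aimed at such a program would need to exchange the two counts.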
Discrepancies in implementations
Some implementations (notably those of older Microsoft products) swapped the meaning of VECTORS and TUPLES. Some implementations are insensitive to errors in the dimensions of the table as written in the header and simply use the layout in the DATA section.
Example
For example, assume we have two columns with one column header row and two data rows:
Text | Number |
---|---|
hello | 1 |
has a double quote " in text | -3 |
In a .dif file, this would be:
TABLE
0,1
"EXCEL"
VECTORS
0,2
""
TUPLES
0,3
""
DATA
0,0
""
-1,0
BOT
1,0
"Text"
1,0
"Number"
-1,0
BOT
1,0
"hello"
0,1
V
-1,0
BOT
1,0
"has a double quote "" in text"
0,-3
V
-1,0
EOD
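The file above can be read back with a short parser. This is a minimal sketch that assumes a well-formed file: `read_dif` is a hypothetical name, numbers are returned as floats, NA and ERROR become `None`, and the reader ignores the VECTORS and TUPLES counts in favour of the layout of the DATA section, as the lenient implementations mentioned earlier do.

```python
EXAMPLE = '''TABLE
0,1
"EXCEL"
VECTORS
0,2
""
TUPLES
0,3
""
DATA
0,0
""
-1,0
BOT
1,0
"Text"
1,0
"Number"
-1,0
BOT
1,0
"hello"
0,1
V
-1,0
BOT
1,0
"has a double quote "" in text"
0,-3
V
-1,0
EOD
'''


def read_dif(text):
    """Parse DIF text into a list of rows, trusting the DATA layout over the header."""
    lines = text.splitlines()
    # Skip the three-line header chunks; the DATA chunk closes the header.
    i = 0
    while i < len(lines) and lines[i].strip() != "DATA":
        i += 3
    i += 3  # step over the DATA chunk itself

    rows = []
    while i + 1 < len(lines):
        type_text, number_text = lines[i].strip().split(",", 1)
        second = lines[i + 1].strip()
        i += 2
        kind = int(type_text)
        if kind == -1:
            if second == "EOD":
                break
            if second == "BOT":
                rows.append([])  # a new row begins
        elif kind == 0:
            if second == "V":
                rows[-1].append(float(number_text))
            elif second in ("TRUE", "FALSE"):
                rows[-1].append(second == "TRUE")
            else:
                rows[-1].append(None)  # NA or ERROR
        elif kind == 1:
            # Drop the surrounding quotes and undouble embedded ones.
            rows[-1].append(second[1:-1].replace('""', '"'))
    return rows
```

Calling `read_dif(EXAMPLE)` returns `[['Text', 'Number'], ['hello', 1.0], ['has a double quote " in text', -3.0]]`, which matches the table shown before the file listing.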
External links
- Article on Navy DIF
- Announcement of DIF Clearinghouse by Software Arts Products Corp.