Open Packaging Conventions
The Open Packaging Conventions (OPC) is a container-file technology initially created by Microsoft to store a combination of XML and non-XML files that together form a single entity, such as an Open XML Paper Specification (OpenXPS) document. OPC-based file formats combine two advantages: the independent files embedded in the document are left intact, and the resulting files are much smaller than with normal use of XML.
Specifications
The OPC is specified in Part 2 of the Office Open XML standards ISO/IEC 29500:2008 and ECMA-376.
The ISO/IEC 29500-2:2008 specification (and the 2nd edition of ECMA-376) makes a normative reference to PKWARE Inc.'s ".ZIP File Format Specification" version 6.2.0 (2004) and supplements it with a normative set of clarifications. The older 1st edition of ECMA-376 instead makes an informative (i.e. non-normative) reference to the newer version 6.2.1 (2005) of the same PKWARE specification. The ZIP format is not specified by any international standard, but has widespread community and developer acceptance.
Usage
Both the XML Paper Specification (XPS) and Office Open XML (OOXML) use Open Packaging Conventions (OPC), which provide a profile of the common ZIP format. In addition to XML data and documents, the ZIP package can include other text and binary files in formats such as PNG, BMP, AVI, PDF, RTF, or even an already packaged ODF file. OPC also defines some naming conventions and an indirection method to allow position independence of binary and XML files in the ZIP archive.
OPC files can be opened using common ZIP utilities.
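Because an OPC package is a ZIP archive, a general-purpose ZIP library is enough to inspect one. The following is a minimal sketch using Python's standard `zipfile` module. The helper names `build_demo_package` and `list_parts`, the placeholder XML bodies, and the part paths are illustrative assumptions, not taken from any real document.

```python
import io
import zipfile


def build_demo_package() -> bytes:
    """Build a tiny in-memory ZIP that merely resembles an OPC layout.

    The XML bodies are placeholders, not valid OPC metadata.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("Documents/1/page.xml", "<Page/>")
    return buf.getvalue()


def list_parts(package_bytes: bytes) -> list[str]:
    """Return every ZIP entry name, written with a leading slash."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        # ZIP entry names carry no leading slash; OPC part names do.
        return sorted("/" + name for name in zf.namelist())


demo_parts = list_parts(build_demo_package())
print(demo_parts)
```

To inspect a file on disk, pass its path to `zipfile.ZipFile` instead of an in-memory buffer.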
File formats using the OPC
The OPC is the foundation technology for many new file formats:
File format | Filename extension | Content | Standard |
---|---|---|---|
Autodesk AutoCAD DWFX file format | .dwfx | CAD design data (2D/3D computer graphics and technical drawings) | |
Family.Show file format | .familyx | genealogical family data, stories, and photos | |
Field Device Integration FDI Packages | .fdix | | |
Microsoft Semblio file format | .semblio | interactive learning material, such as e-books containing images, audio, and video | |
Microsoft Visual Studio 2010 Extensions file format | .vsix | integrated development environment extension | |
Microsoft Windows 8 App Package | .appx | software package for the planned Windows Store | |
Microsoft Windows Azure C# Package | .cspkg | cloud platform data | |
Microsoft XML Paper Specification | .xps | fixed document for document exchange | |
NuGet Package | .nupkg | software package for a package management system | |
Office Open XML Document | .docx | word processing document | ECMA-376, ISO/IEC 29500:2008 |
Office Open XML Presentation | .pptx | presentation | ECMA-376, ISO/IEC 29500:2008 |
Office Open XML Workbook | .xlsx | spreadsheet workbook | ECMA-376, ISO/IEC 29500:2008 |
Open XML Paper Specification | .oxps | fixed document for document exchange | ECMA-388 |
Siemens PLM Software file format | .jtx | | |
SMPTE Media Package | .smpk | storage format for distribution and playback of multimedia video and audio files | SMPTE ST 2053-2011 |
Package, Parts, and Relationships
In OPC terminology, the term package corresponds to a ZIP archive and the term part corresponds to a file stored within the ZIP. Every part in a package has a unique URI-compliant part name along with a specified content-type expressed in the form of a MIME media type. A part's content-type explicitly defines the type of data stored in the part, and reduces duplication and ambiguity issues inherent with file extensions.
OPC packages can also include relationships that define associations between the package, parts, and external resources. In addition to a hierarchy of directories and parts, OPC packages commonly use relationships to access content through a directed graph of relationship associations. Relationships are composed of four elements:
- an identifier (ID)
- an optional source (the package or a part within the package)
- a relationship type (a URI-style expression that defines the type of the relationship)
- a target (a URI to another part within the package or to an external resource)
OPC packages can store parts that contain any type of data (text, images, XML, binary, and so on). The extension ".rels", however, is reserved for storing relationships metadata within "/_rels" subfolders. The subfolder name "_rels", the file extension ".rels" within such a directory, and the filename "[Content_Types].xml" in any folder are the only three reserved names for files stored in an OPC package.
- /[Content_Types].xml file
- This file defines the MIME media types for all the parts stored in the package. The "/[Content_Types].xml" file defines default mappings based on file extensions, along with overrides for specific parts whose content-types differ from the file extension defaults.
- /_rels
- The root-level "/_rels" folder stores the relationships for the package as a whole. The "/_rels" folder normally contains a file named ".rels". "/_rels/.rels" is an XML file where the starting package-level relationships are stored. When opening an OPC-based file, applications normally start by reading the "/_rels/.rels" file to obtain the starting package-level relationships.
- [partname].rels
- Each part may have its own relationships. The "_rels" folders hold the relationships for any given part within the package. To find the relationships for a specific part, look in the "_rels" folder that is a sibling of that part: if the part has relationships, the "_rels" folder contains a file named after the original part with ".rels" appended. For example, if the content types part had any relationships, there would be a file called "[Content_Types].xml.rels" inside the "/_rels" folder.
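The two lookups described in this list can be sketched in Python as follows. The sample content-types document is invented for illustration: the `application/x-example-page+xml` type and the part paths are hypothetical, and the helper names `content_type_for` and `rels_part_name` are assumptions. Only the XML namespace and the `Default`/`Override` element names come from the packaging specification. The sketch also assumes part names are compared case-insensitively.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Hypothetical content-types document, for illustration only.
SAMPLE_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Default Extension="png" ContentType="image/png"/>
  <Override PartName="/Documents/1/page.xml" ContentType="application/x-example-page+xml"/>
</Types>"""


def content_type_for(part_name: str, types_xml: str) -> Optional[str]:
    """Resolve a part's content type: overrides first, then extension defaults."""
    root = ET.fromstring(types_xml)
    for override in root.findall(CT_NS + "Override"):
        if override.get("PartName", "").lower() == part_name.lower():
            return override.get("ContentType")
    # Take the extension from the last path segment (".rels" -> "rels").
    last_segment = part_name.rsplit("/", 1)[-1]
    extension = last_segment.rsplit(".", 1)[-1].lower() if "." in last_segment else ""
    for default in root.findall(CT_NS + "Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


def rels_part_name(part_name: str) -> str:
    """Name of the relationships file for a part: sibling "_rels" folder plus ".rels"."""
    folder, _, name = part_name.rpartition("/")
    return f"{folder}/_rels/{name}.rels"


print(content_type_for("/Documents/1/page.xml", SAMPLE_TYPES))
print(rels_part_name("/[Content_Types].xml"))
```

Passing `"/"` (the package root) to `rels_part_name` yields `/_rels/.rels`, which matches the package-level relationships file described above.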
All relationships (including the relationships associated with the root package) are represented as XML files. If you open a ".rels" file in a text editor, you can view the actual XML markup that defines all the relationships whose source is that part. A typical relationships file for the root package of an early Microsoft XPS document (before it was standardized as Open XML Paper Specification within the openxmlformats collection) defines two relationships: the first points to the part treated as the root of the document, and the second references an alternate form, here a thumbnail image rendering the first page of the document.
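As a rough illustration of such a file and how it might be read, the sketch below parses a two-entry relationships document with Python's standard XML library. The sample markup is an assumption modelled on the description above: the `Id` values and `Target` paths are hypothetical, and the `Relationship` dataclass and `parse_relationships` helper are names introduced here. The source of each relationship is not stored in the markup; it is implied by which ".rels" file contains it.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

# Illustrative package-level relationships; Ids and Targets are made up.
SAMPLE_RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="R0" Target="/FixedDocSeq.fdseq"
      Type="http://schemas.microsoft.com/xps/2005/06/fixedrepresentation"/>
  <Relationship Id="R1" Target="/Documents/1/Metadata/Page1_Thumbnail.JPG"
      Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/thumbnail"/>
</Relationships>"""


@dataclass(frozen=True)
class Relationship:
    rel_id: str      # identifier, unique within one .rels file
    rel_type: str    # URI-style expression naming the kind of relationship
    target: str      # URI of a part in the package or of an external resource
    external: bool   # True when TargetMode="External"


def parse_relationships(rels_xml: str) -> list[Relationship]:
    """Turn the markup of a .rels file into Relationship records."""
    root = ET.fromstring(rels_xml)
    return [
        Relationship(
            rel_id=el.get("Id", ""),
            rel_type=el.get("Type", ""),
            target=el.get("Target", ""),
            # TargetMode is optional and defaults to "Internal".
            external=el.get("TargetMode", "Internal") == "External",
        )
        for el in root.findall(REL_NS + "Relationship")
    ]


for rel in parse_relationships(SAMPLE_RELS):
    print(rel.rel_id, rel.target)
```

An application would typically select a relationship by its type rather than by its identifier or target path, since only the type is meant to be stable across packages.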
The main parts of the embedded documents are often stored within a folder named "/Document" (which may itself contain subdirectories if the file contains several related documents, each with various parts), and the optional metadata parts that are not needed for processing the main parts of the document are stored in a folder named "/Metadata". These two folder names are not required, however: the actual locations are specified within the XML-formatted data in the "[partname].rels" relationship files, and the OPC specification allows any folder organisation that is convenient for the application.
In 2006, Microsoft submitted a draft to the Internet Engineering Task Force for a "pack" URI scheme (pack://) to be used for URI references to OPC-based packages. The draft was last revised in February 2009.
Chunking
OPC encourages documents to be split into small chunks. This reduces the effect of file corruption and improves data access: for example, all the style information can be kept in one XML part, and each worksheet or table in its own part. This allows faster access and less object creation for clients, and makes it easier for multiple processes to work on the same document.
Relative indirection
In the Open Packaging Conventions, each part that holds references has its own ".rels" file in a "_rels" folder, containing its indirection list. In some cases this makes it easier to cut and paste information together with all its associated resources, and it provides name scoping that removes the chance of name clashes between files.
Programming
OPC is natively supported in Microsoft .NET Framework 3.0 by the System.IO.Packaging namespace. Open source libraries exist for other languages.
Alternatively, ZIP libraries can be used to create and open OPC files, as long as the correct files are included in the ZIP and the conventions are followed.
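A minimal sketch of the ZIP-library approach, using only Python's standard library, might look like this. It writes the content-types file, the package-level relationships file, and one part, then reads the part back by following the root relationship. The relationship type `http://example.com/relationships/main`, the part path `/Document/main.xml`, and the helper names are hypothetical choices for illustration. A real format would define its own relationship types and would need to satisfy the rest of the specification, which this sketch does not attempt to validate.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

CONTENT_TYPES = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    "</Types>"
)

# Hypothetical relationship type and target, for illustration only.
ROOT_RELS = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="R0" Type="http://example.com/relationships/main" '
    'Target="/Document/main.xml"/>'
    "</Relationships>"
)


def write_package() -> bytes:
    """Write the reserved files plus one part into an in-memory ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("Document/main.xml", "<doc>Hello</doc>")
    return buf.getvalue()


def read_main_part(package: bytes) -> bytes:
    """Follow the first package-level relationship and return its target's bytes."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        rels = ET.fromstring(zf.read("_rels/.rels"))
        target = rels.find(REL_NS + "Relationship").get("Target")
        # Part names start with "/", ZIP entry names do not.
        return zf.read(target.lstrip("/"))


print(read_main_part(write_package()))
```

Reading the part through the relationship, rather than by a hard-coded path, is what lets an application reorganise folders inside the package without breaking consumers.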