DirectShow
DirectShow, codenamed Quartz, is a multimedia framework and API produced by Microsoft for software developers to perform various operations with media files or streams. It is the replacement for Microsoft's earlier Video for Windows technology. Based on the Microsoft Windows Component Object Model (COM) framework, DirectShow provides a common interface for media across various programming languages, and is an extensible, filter-based framework that can render or record media files on demand at the request of the user or developer. The DirectShow development tools and documentation were originally distributed as part of the DirectX SDK. Currently, they are distributed as part of the Windows SDK (formerly known as the Platform SDK).
DirectShow's counterparts on other platforms include Apple's QuickTime framework and various Linux multimedia frameworks such as GStreamer or Xine. Microsoft plans to gradually replace DirectShow with Media Foundation in future Windows versions. Windows Vista and Windows 7 applications use Media Foundation instead of DirectShow for several media-related tasks.
The amount of work required to implement a filter graph depends on several factors. In the simplest case, DirectShow can create a filter graph automatically from a source such as a file or URL. If this is not possible, the developer may be able to manually create a filter graph from a source file, possibly with the addition of a custom filter, and then let DirectShow complete the filter graph by connecting the filters together. At the next level, the developer must build the filter graph from scratch by manually adding and connecting each desired filter. Finally, in cases where an essential filter is unavailable, the developer must create a custom filter before a filter graph can be built.
Unlike the main C API of QuickTime where it is necessary to call MoviesTask in a loop to load a media file, DirectShow handles all of this in a transparent way. It creates several background threads that smoothly play the requested file or URL without much work required from the programmer. Also in contrast to QuickTime, nothing special is required for loading a URL instead of a local file on disk – DirectShow's filter graph abstracts these details from the programmer, although recent developments in QuickTime (including an ActiveX control) have reduced this disparity.
DirectShow Editing Services is an API targeted at video editing tasks and built on top of the core DirectShow architecture. It was introduced for Microsoft's Windows Movie Maker. It includes APIs for timeline and switching services, resizing, cropping, video and audio effects, as well as transitions, keying, automatic frame rate and sample rate conversion, and other features used in non-linear video editing, allowing the creation of composite media out of a number of source audio and video streams. DirectShow Editing Services allows higher-level run-time compositing, seeking support, and graph management, while still allowing applications to access lower-level DirectShow functions.
While the original API is in C++, DirectShow Editing Services is accessible in any Microsoft .NET compatible language, including Microsoft Visual C# and Microsoft Visual Basic, by using a third-party code library called "DirectShowNet Library". Alternatively, the entire DirectShow API, including DirectShow Editing Services, can be accessed from Borland Delphi 5, 6 and 7, C++ Builder 6, and from later versions with a few minor modifications, using a third-party software library called "DSPack".
Originally, DirectShow used the Video Renderer filter. This drew the images using DirectDraw 3, but could also fall back to GDI or overlay drawing modes in some circumstances (depending upon the visibility of the video window and the video card's capabilities). It had limited access to the video window. Video for Windows had been plagued with deadlocks caused by applications' incorrect handling of the video windows, so in early DirectShow releases, the handle to the playback window was hidden from applications. There was also no reliable way to draw caption text or graphics on top of the video.
DirectShow 6.0, released as part of DirectX Media, introduced the Overlay Mixer renderer, designed for DVD playback and broadcast video streams with closed captioning and subtitles. The Overlay Mixer uses DirectDraw 5 for rendering. Downstream connection with the Video Renderer is required for window management. Overlay Mixer also supports Video Port Extensions (VPE), enabling it to work with analog TV tuners with overlay capability (sending video directly to a video card via an analog link rather than via the PCI bus). Overlay Mixer also supports DXVA connections. Because it always renders in overlay, full-screen video to TV-out is always activated.
Windows XP introduced a new filter called the Video Mixing Renderer 7 (VMR-7, sometimes referred to simply as VMR). The number 7 reflects the fact that VMR-7 used only DirectDraw version 7 to render the video and did not have the option to use GDI drawing. The main new feature of VMR-7 was the ability to mix multiple streams and graphics with alpha blending, allowing applications to draw text and graphics over the video and support custom effects. It also featured a "windowless mode" (access to the composited image before it is rendered), which fixed the problems with access to the window handle. VMR-7 was only officially released for Windows XP.
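The alpha blending mentioned above can be illustrated with the standard constant-alpha mixing formula. This is a minimal sketch in Python, not the VMR-7 implementation; the function name `alpha_blend` and the RGB tuple representation are assumptions made for the example.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255

def alpha_blend(overlay: Pixel, video: Pixel, alpha: float) -> Pixel:
    """Mix an overlay pixel onto a video pixel.

    alpha = 1.0 shows only the overlay, alpha = 0.0 only the video.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    # Per-channel weighted sum: alpha * overlay + (1 - alpha) * video.
    return tuple(
        round(alpha * o + (1.0 - alpha) * v)
        for o, v in zip(overlay, video)
    )

# Half-transparent caption pixel drawn over a video pixel.
mixed = alpha_blend((200, 100, 0), (0, 100, 200), 0.5)
```

A mixing renderer would apply an operation of this kind to every pixel of each stream or bitmap it composites before presenting the frame.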
DirectX 9 included VMR-9. This version uses Direct3D 9 instead of DirectDraw, allowing developers to transform video images using Direct3D pixel shaders. It is available for all Windows platforms as part of the DirectX 9 redistributable. Like VMR-7, it provides a windowless mode. However, unlike the Overlay Mixer or VMR-7, it does not support video ports.
Windows Vista and Windows 7 ship with a new renderer, available as both a Media Foundation component and a DirectShow filter, called the Enhanced Video Renderer (EVR). EVR is designed to work with Desktop Window Manager and supports DXVA 2.0, which is available on Windows Vista and Windows 7. According to Microsoft, it offers better performance and better quality.
Developers rarely create DirectShow filters from scratch. Rather, they employ the DirectShow Base Classes. The Base Classes can often simplify development, allowing the programmer to bypass certain tasks. However, the process may remain relatively complex; the code found in the Base Classes is nearly half the size of the entire MFC library. As a result, even with the Base Classes, the number of COM objects that DirectShow contains often overwhelms developers. In some cases, DirectShow's API deviates from traditional COM rules, particularly with regard to the parameters used for methods. To overcome their difficulties with DirectShow's unique COM rules, developers often turn to a higher-level API that uses DirectShow, notably the Windows Media Player SDK, an API that provides the developer with an ActiveX control that has fewer COM interfaces to deal with.
Although DirectShow is capable of dynamically building a graph to render a given media type, developers cannot always rely on this functionality, because the resulting filter graph can change over time as new filters are installed on the computer. When a predictable graph is required, they need to resort to building the filter graph manually.
DirectShow has also been criticized in connection with digital rights management (DRM); however, DirectShow itself has minimal support for DRM in its API. The Windows Media Player SDK more significantly reflects Microsoft's adherence to DRM.
Codec hell (a term derived from DLL hell) is when multiple DirectShow filters conflict over performing the same task. A large number of companies now develop codecs in the form of DirectShow filters, resulting in the presence of several filters that can decode the same media type. This issue is further exacerbated by DirectShow's merit system, where filter implementations end up competing with one another by registering themselves with increasingly elevated priority.
Microsoft's Ted Youmans explained that "DirectShow was based on the merit system, with the idea being that, using a combination of the filter's merit and how specific the media type/sub type is, one could reasonably pick the right codec every time. It wasn't really designed for a competing merit nuclear arms race."
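The escalation described above can be illustrated with a toy model. The sketch below is written in Python and is not the DirectShow API: the `Decoder` record, the `pick_decoder` function, the vendor names and the merit numbers are all hypothetical, and the model ignores the media type specificity that the quote also mentions.

```python
from typing import List, NamedTuple, Optional

class Decoder(NamedTuple):
    name: str        # hypothetical vendor filter name
    media_type: str  # media type the filter claims to decode
    merit: int       # arbitrary priority value; higher wins

def pick_decoder(candidates: List[Decoder], media_type: str) -> Optional[Decoder]:
    """Return the highest-merit decoder registered for media_type, if any."""
    matching = [d for d in candidates if d.media_type == media_type]
    return max(matching, key=lambda d: d.merit, default=None)

installed = [
    Decoder("Vendor A decoder", "video/mpeg4", 100),
    Decoder("Vendor B decoder", "video/mpeg4", 200),
]
before = pick_decoder(installed, "video/mpeg4")

# A newly installed filter registers itself with a higher merit
# and silently takes over decoding for every application.
installed.append(Decoder("Vendor C decoder", "video/mpeg4", 300))
after = pick_decoder(installed, "video/mpeg4")
```

In this model the chosen decoder depends only on what else happens to be installed, which is why an application that needs a specific decoder has to build its graph manually or override the merits.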
A tool often referenced for troubleshooting "codec hell" issues is the GSpot Codec Information Appliance, which can be useful in determining what codec is used to render video files in AVI and other containers. GraphEdit can also help in understanding the sequence of filters that DirectShow is using to render the media file. Codec hell can be resolved by manually building filter graphs, using a media player that supports ignoring or overriding filter merits, or by using a filter manager that changes filter merits in the Windows Registry.
Other multimedia frameworks such as Video for Windows allow end-users to perform basic video-related tasks such as re-encoding using a different codec and editing files and streams. The convenience offered by an end-user GUI is apparent, since the AVI format and the codecs used by Video for Windows still remain in use, for example in VirtualDub.
History
The direct predecessor of DirectShow, ActiveMovie (codenamed Quartz), was originally chartered to provide MPEG-1 file playback support for Windows. It was also intended as a future replacement for media processing frameworks like Video for Windows, which had never been designed to handle codecs that put video frames into a different order during the compression process, and the Media Control Interface, which had never been fully ported to a 32-bit environment and did not utilize COM.
The Quartz team started with an existing project called Clockwork. Clockwork was a modular media processing framework in which semi-independent components worked together to process digital media streams, and had previously been used in several projects, including the Microsoft Interactive Television (MITV) project and another project named Tiger.
ActiveMovie was announced in March 1996, and released in May 1996, bundled with the beta version of Internet Explorer 3.0. In March 1997, Microsoft announced that ActiveMovie would become part of the DirectX 5 suite of technologies, and around July started referring to it as DirectShow, reflecting Microsoft's efforts at the time to consolidate technologies that worked directly with hardware under a common naming scheme. DirectShow became a standard component of all Windows operating systems starting with Windows 98; however, it is available on Windows 95 by installing the latest available DirectX redistributable. In DirectX version 8.0, DirectShow became part of the mainline distribution of the DirectX SDK and was placed alongside other DirectX APIs.
In October 2004, DirectShow was removed from the main DirectX distribution and relocated to the DirectX Extras download. In April 2005, DirectShow was removed entirely from DirectX and moved to the Windows Server 2003 SP1 version of the Microsoft Platform SDK. The DirectX SDK was, however, still required to build some of the DirectShow samples.
Since November 2007, the DirectShow APIs have been part of the Windows SDK, which includes several enhancements, codecs and filter updates such as the Enhanced Video Renderer (EVR) and DXVA 2.0 (DirectX Video Acceleration).
Architecture
DirectShow divides a complex multimedia task (e.g. video playback) into a sequence of fundamental processing steps known as filters. Each filter, which represents one stage in the processing of the data, has input and/or output pins that may be used to connect the filter to other filters. The generic nature of this connection mechanism enables filters to be connected in various ways so as to implement different complex functions. To implement a specific complex task, a developer must first build a filter graph by creating instances of the required filters, and then connecting the filters together.
There are three main types of filters:
- Source filters: These provide the source streams of data. For example, reading raw bytes from any media file.
- Transform filters: These transform data provided by another filter's output. For example, adding text on top of video or decompressing an MPEG frame.
- Renderer filters: These render the data. For example, sending audio to the sound card, drawing video on the screen or writing data to a file.
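The three roles can be sketched as a small pipeline. The following is a toy model in Python, not DirectShow code: the function names, the chunk size and the uppercase "transform" are assumptions chosen only to show how data flows from a source, through a transform, to a renderer.

```python
from typing import Callable, Iterable, Iterator, List

def source_filter(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Source: hand out the raw bytes of a 'file' in fixed-size chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]

def transform_filter(chunks: Iterable[bytes],
                     fn: Callable[[bytes], bytes]) -> Iterator[bytes]:
    """Transform: apply fn to every chunk coming from the upstream filter."""
    for chunk in chunks:
        yield fn(chunk)

def renderer_filter(chunks: Iterable[bytes], sink: List[bytes]) -> int:
    """Renderer: consume the stream (here, store it) and count the chunks."""
    count = 0
    for chunk in chunks:
        sink.append(chunk)
        count += 1
    return count

rendered: List[bytes] = []
n = renderer_filter(
    transform_filter(source_filter(b"abcdefgh", 3), bytes.upper),
    rendered,
)
```

Here the renderer drives the pipeline by pulling chunks through the transform from the source; a real graph also needs connection and media type negotiation between the stages.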
During the rendering process, the filter graph searches the Windows Registry for registered filters and builds its graph of filters based on the locations provided. After this, it connects the filters together, and, at the developer's request, executes (i.e., plays, pauses, etc.) the created graph. DirectShow filter graphs are widely used in video playback (in which the filters implement functions such as file parsing, video and audio demultiplexing, decompressing and rendering) as well as for video and audio recording, editing, encoding, transcoding or network transmission of media. Interactive tasks such as DVD navigation may also be controlled by DirectShow.
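The idea of building a graph from a list of registered filters can be sketched as a search problem. The example below is a simplified model in Python and should not be read as the algorithm DirectShow actually uses: the `RegisteredFilter` record, the media type strings and the breadth-first search are all assumptions made for illustration.

```python
from collections import deque
from typing import List, NamedTuple, Optional

class RegisteredFilter(NamedTuple):
    name: str      # hypothetical filter name
    accepts: str   # media type on the input pin
    produces: str  # media type on the output pin

def build_chain(registry: List[RegisteredFilter],
                source_type: str,
                target_type: str) -> Optional[List[str]]:
    """Find the shortest chain of filters turning source_type into target_type."""
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, chain = queue.popleft()
        if current == target_type:
            return chain
        for f in registry:
            if f.accepts == current and f.produces not in seen:
                seen.add(f.produces)
                queue.append((f.produces, chain + [f.name]))
    return None  # no combination of registered filters can do the job

REGISTRY = [
    RegisteredFilter("MP3 splitter", "file/mp3", "audio/mp3"),
    RegisteredFilter("MP3 decoder", "audio/mp3", "audio/pcm"),
    RegisteredFilter("AVI splitter", "file/avi", "video/compressed"),
]

chain = build_chain(REGISTRY, "file/mp3", "audio/pcm")
missing = build_chain(REGISTRY, "file/avi", "audio/pcm")
```

In this model `chain` lists the filters to insert between the file and an audio renderer, while `missing` is `None` because no registered filter leads from the AVI splitter's output to PCM audio.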
In a typical MP3 playback graph, read from source to renderer, the graph contains a source filter to read an MP3 file, stream splitter and decoder filters to parse and decode the audio, and a rendering filter to play the raw audio samples. Each filter has one or more pins that can be used to connect that filter to other filters. Every pin functions as either an input or an output, so that data can flow from one filter to another. Depending on the filter, data is either "pulled" from an input pin or "pushed" to an output pin in order to transfer data between filters. Each pin can connect to only one other pin, and the two pins have to agree on what kind of data they are exchanging.
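The two connection rules just described (one peer per pin, and agreement on the kind of data) can be modelled in a few lines. This is a hedged sketch in Python rather than the COM pin interfaces: the `Pin` class, the `connect` function and the media type strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass(eq=False)
class Pin:
    name: str
    direction: str                        # "input" or "output"
    media_types: Set[str] = field(default_factory=set)
    peer: Optional["Pin"] = None          # the single pin this one is connected to
    agreed_type: Optional[str] = None     # media type chosen at connection time

def connect(out_pin: Pin, in_pin: Pin) -> str:
    """Connect an output pin to an input pin and return the agreed media type."""
    if out_pin.direction != "output" or in_pin.direction != "input":
        raise ValueError("connect an output pin to an input pin")
    if out_pin.peer is not None or in_pin.peer is not None:
        raise ValueError("a pin can connect to only one other pin")
    common = out_pin.media_types & in_pin.media_types
    if not common:
        raise ValueError("the pins share no media type")
    chosen = sorted(common)[0]            # deterministic pick for this sketch
    out_pin.peer, in_pin.peer = in_pin, out_pin
    out_pin.agreed_type = in_pin.agreed_type = chosen
    return chosen

decoder_out = Pin("decoder.out", "output", {"audio/pcm"})
renderer_in = Pin("renderer.in", "input", {"audio/pcm", "audio/float"})
agreed = connect(decoder_out, renderer_in)
```

Attempting to connect `decoder_out` a second time raises `ValueError`, mirroring the rule that each pin has at most one peer.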
Most filters are built using a set of C++ classes provided in the DirectShow SDK, called the DirectShow Base Classes. These handle much of the creation, registration and connection logic for the filter. For the filter graph to use filters automatically, they need to be registered in a separate DirectShow registry entry as well as being registered with COM. This registration can be managed by the DirectShow Base Classes. However, if the application adds the filters manually, they do not need to be registered at all.
Unfortunately, it is difficult to modify a graph that is already running. It is usually easier to stop the graph and create a new graph from scratch. Starting with DirectShow 8.0, dynamic graph building, dynamic reconnection, and filter chains were introduced to help alter the graph while it was running. However, many filter vendors ignore this feature, making graph modification problematic after a graph has begun processing.
Features
By default, DirectShow includes a number of filters for decoding some common media file formats such as MPEG-1, MP3, Windows Media Audio, Windows Media Video and MIDI, media containers such as AVI, ASF and WAV, some splitters/demultiplexers, multiplexers, source and sink filters, and some static image filters. Since the associated patented technologies are licensed in Windows, no license fees are required (e.g., to Fraunhofer, for MP3). Support for some codecs, such as MPEG-4 Advanced Simple Profile, AAC, H.264 and Vorbis, and for the MOV and MP4 containers, can be added through third-party filters. Incorporating support for additional codecs such as these can involve paying licensing fees to the codec technology developer or patent holder.
However, DirectShow's standard format repertoire can be easily expanded by means of a variety of filters. Such filters enable DirectShow to support virtually any container format and any audio or video codec. For example, filters have been developed for Ogg Vorbis, Musepack, and AC3. Finally, there are "bridge" filters that simultaneously support multiple formats, as well as functions like stream multiplexing, by exposing the functionality of underlying multimedia APIs such as VLC.
The amount of work required to implement a filter graph depends on several factors. In the simplest case, DirectShow can create a filter graph automatically from a source such as a file or URL. If this is not possible, the developer may be able to manually create a filter graph from a source file, possibly with the addition of a custom filter, and then let DirectShow complete the filter graph by connecting the filters together. At the next level, the developer must build the filter graph from scratch by manually adding and connecting each desired filter. Finally, in cases where an essential filter is unavailable, the developer must create a custom filter before a filter graph can be built.
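The automatic case can be pictured with a toy model. The sketch below is not DirectShow code and does not claim to reproduce its actual connection algorithm; it only assumes that each filter accepts one media type and produces another, and searches for a chain from a source to a renderer. The registry contents, the media type labels and the function name `complete_graph` are all hypothetical.

```python
from collections import deque

# Hypothetical filter registry: name -> (input media type, output media type).
# None as input marks a source filter; None as output marks a renderer.
FILTERS = {
    "File Source": (None, "avi-stream"),
    "AVI Splitter": ("avi-stream", "compressed-video"),
    "Video Decoder": ("compressed-video", "raw-video"),
    "Video Renderer": ("raw-video", None),
}


def complete_graph(source, filters):
    """Breadth-first search for a chain of filters from `source` to a renderer.

    Returns the filter names in connection order, or None when an essential
    filter is missing and no complete chain exists.
    """
    queue = deque([[source]])
    while queue:
        chain = queue.popleft()
        output_type = filters[chain[-1]][1]
        if output_type is None:  # reached a renderer: the graph is complete
            return chain
        for name, (input_type, _) in filters.items():
            if input_type == output_type and name not in chain:
                queue.append(chain + [name])
    return None


if __name__ == "__main__":
    print(complete_graph("File Source", FILTERS))
    # Simulate the last case in the text: an essential filter is unavailable.
    without_decoder = {k: v for k, v in FILTERS.items() if k != "Video Decoder"}
    print(complete_graph("File Source", without_decoder))
```

When the search returns None, the model is in the situation the paragraph ends with: no graph can be built until the missing filter is supplied.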
Unlike the main C API of QuickTime, where MoviesTask must be called in a loop to load a media file, DirectShow handles all of this transparently. It creates several background threads that smoothly play the requested file or URL without much work required from the programmer. Also in contrast to QuickTime, nothing special is required for loading a URL instead of a local file on disk – DirectShow's filter graph abstracts these details from the programmer, although recent developments in QuickTime (including an ActiveX control) have reduced this disparity.
DirectShow Editing Services
DirectShow Editing Services (DES), introduced in DirectX 8.0/Windows XP, is an API targeted at video editing tasks and built on top of the core DirectShow architecture. DirectShow Editing Services was introduced for Microsoft's Windows Movie Maker. It includes APIs for timeline and switching services, resizing, cropping, video and audio effects, as well as transitions, keying, automatic frame rate and sample rate conversion, and other features used in non-linear video editing, allowing the creation of composite media from a number of source audio and video streams. DirectShow Editing Services allows higher-level run-time compositing, seeking support, and graph management, while still allowing applications to access lower-level DirectShow functions.
While the original API is in C++, DirectShow Editing Services is accessible in any Microsoft .NET compatible language, including Microsoft Visual C# and Microsoft Visual Basic, by using a third-party code library called "DirectShowNet Library". Alternatively, the entire DirectShow API, including DirectShow Editing Services, can be accessed from Borland Delphi 5, 6 and 7, C++ Builder 6, and from later versions with a few minor modifications, using a third-party software library called "DSPack".
Video rendering filters
Originally, in Windows 9x, DirectShow used the Video Renderer filter. This drew the images using DirectDraw 3, but could also fall back to GDI or overlay drawing modes in some circumstances (depending upon the visibility of the video window and the video card's capabilities). It had limited access to the video window. Video for Windows had been plagued with deadlocks caused by applications' incorrect handling of the video windows, so in early DirectShow releases, the handle to the playback window was hidden from applications. There was also no reliable way to draw caption text or graphics on top of the video.
DirectShow 6.0, released as part of DirectX Media, introduced the Overlay Mixer renderer designed for DVD playback and broadcast video streams with closed captioning and subtitles. The Overlay Mixer uses DirectDraw 5 for rendering. Downstream connection with the Video Renderer is required for window management. Overlay Mixer also supports Video Port Extensions (VPE), enabling it to work with analog TV tuners with overlay capability (sending video directly to a video card via an analog link rather than via the PCI bus). Overlay Mixer also supports DXVA connections. Because it always renders in overlay, full-screen video to TV-out is always activated.
Windows XP introduced a new filter called the Video Mixing Renderer 7 (VMR-7, sometimes referred to simply as VMR). The number 7 reflects the fact that VMR-7 used only DirectDraw version 7 to render the video and did not have the option to use GDI drawing. The main new feature of VMR-7 was the ability to mix multiple streams and graphics with alpha blending, allowing applications to draw text and graphics over the video and support custom effects. It also featured a "windowless mode" (access to the composited image before it is rendered), which fixed the problems with access to the window handle. VMR-7 was only officially released for Windows XP.
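As a brief illustration of what alpha blending means for a single pixel, the following sketch applies the standard per-channel formula, result = alpha × overlay + (1 − alpha) × video. It is a generic illustration, not VMR-7 code; the function name `alpha_blend` and the tuple-based pixel representation are assumptions made for the example.

```python
def alpha_blend(overlay, video, alpha):
    """Blend one RGB overlay pixel onto one RGB video pixel.

    alpha is the overlay opacity in [0.0, 1.0]: 0 keeps the video pixel,
    1 replaces it with the overlay pixel.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return tuple(
        round(alpha * o + (1.0 - alpha) * v) for o, v in zip(overlay, video)
    )


if __name__ == "__main__":
    # A caption pixel drawn at 25% opacity over a video pixel.
    print(alpha_blend((200, 100, 0), (0, 100, 200), 0.25))
```

A renderer that mixes several streams would apply the same formula per pixel for each layer in turn, from the bottom layer upward.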
DirectX 9 included VMR-9. This version uses Direct3D 9 instead of DirectDraw, allowing developers to transform video images using Direct3D pixel shaders. It is available for all Windows platforms as part of the DirectX 9 redistributable. Like VMR-7, it provides a windowless mode. However, unlike the Overlay Mixer or VMR-7, it does not support video ports.
Windows Vista and Windows 7 ship with a new renderer, available as both a Media Foundation component and a DirectShow filter, called the Enhanced Video Renderer (EVR). EVR is designed to work with Desktop Window Manager and supports DXVA 2.0, which is available on Windows Vista and Windows 7. According to Microsoft, it offers better performance and better quality.
Awards
On January 8, 2007, Microsoft received the Emmy award for Streaming Media Architectures and Components at the 58th Annual Technology & Engineering EMMY Awards.
Simplicity
Commanding DirectShow to play a file is a relatively simple task. However, while programming more advanced customizations, such as commanding DirectShow to display certain window messages from the video window or creating custom filters, many developers complain of difficulties. It is regarded as one of Microsoft's most complex development libraries/APIs.
Developers rarely create DirectShow filters from scratch. Rather, they employ the DirectShow Base Classes. The Base Classes can often simplify development, allowing the programmer to bypass certain tasks. However, the process may remain relatively complex; the code found in the Base Classes is nearly half the size of the entire MFC library. As a result, even with the Base Classes, the number of COM objects that DirectShow contains often overwhelms developers. In some cases, DirectShow's API deviates from traditional COM rules, particularly with regard to the parameters used for methods. To overcome their difficulties with DirectShow's unique COM rules, developers often turn to a higher-level API that uses DirectShow, notably the Windows Media Player SDK, an API that provides the developer with an ActiveX control that has fewer COM interfaces to deal with.
Although DirectShow is capable of dynamically building a graph to render a given media type, developers cannot always rely on this functionality: when the resulting filter graph is variable, they need to resort to building filter graphs manually. Filter graphs can change over time as new filters are installed on the computer.
Digital rights management
DirectShow has also been criticized for its support of digital rights management (DRM); however, DirectShow itself has minimal support for DRM in its API. The Windows Media Player SDK more significantly reflects Microsoft's adherence to DRM.
Codec hell
Codec hell (a term derived from DLL hell) occurs when multiple DirectShow filters conflict over performing the same task. A large number of companies now develop codecs in the form of DirectShow filters, resulting in the presence of several filters that can decode the same media type. This issue is further exacerbated by DirectShow's merit system, where filter implementations end up competing with one another by registering themselves with increasingly elevated priority.
Microsoft's Ted Youmans explained that "DirectShow was based on the merit system, with the idea being that, using a combination of the filter’s merit and how specific the media type/sub type is, one could reasonably pick the right codec every time. It wasn't really designed for a competing merit nuclear arms race."
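The escalation described above can be sketched with a simplified model of merit-based selection. The four merit constants are the standard values from the Windows SDK headers; everything else (the vendor names, the `pick_decoder` function, and the decision to ignore media type specificity) is an assumption made for illustration, not DirectShow's actual selection logic.

```python
# Standard DirectShow merit values from the Windows SDK headers.
MERIT_PREFERRED = 0x800000
MERIT_NORMAL = 0x600000
MERIT_UNLIKELY = 0x400000
MERIT_DO_NOT_USE = 0x200000


def pick_decoder(registered, media_type):
    """Return the name of the highest-merit filter that accepts media_type.

    `registered` is a list of (name, merit, accepted_types) tuples. Filters at
    or below MERIT_DO_NOT_USE are never picked automatically.
    """
    candidates = [
        (merit, name)
        for name, merit, accepted in registered
        if media_type in accepted and merit > MERIT_DO_NOT_USE
    ]
    if not candidates:
        return None
    return max(candidates)[1]


if __name__ == "__main__":
    # Hypothetical vendors competing for the same media type.
    filters = [
        ("Vendor A Decoder", MERIT_NORMAL, {"MPEG-2"}),
        ("Vendor B Decoder", MERIT_PREFERRED, {"MPEG-2"}),
    ]
    print(pick_decoder(filters, "MPEG-2"))
    # A third vendor registers one step above "preferred" to win the race.
    filters.append(("Vendor C Decoder", MERIT_PREFERRED + 1, {"MPEG-2"}))
    print(pick_decoder(filters, "MPEG-2"))
```

In this model, installing one more decoder silently changes which filter is chosen, which is the behaviour users experience as codec hell.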
A commonly referenced tool for troubleshooting "codec hell" issues is the GSpot Codec Information Appliance, which can be useful in determining what codec is used to render video files in AVI and other containers. GraphEdit can also help in understanding the sequence of filters that DirectShow is using to render the media file. Codec hell can be resolved by manually building filter graphs, using a media player that supports ignoring or overriding filter merits, or by using a filter manager that changes filter merits in the Windows Registry.
End-user tools
DirectShow, being a developer-centric framework and API, does not directly offer end-user control over encoding content, nor does it incorporate a user interface for encoding using installed codecs or to different formats; instead, it relies on developers to develop software using the API. In contrast, other multimedia frameworks such as QuickTime or Video for Windows allow end-users to perform basic video-related tasks such as re-encoding using a different codec and editing files and streams. The convenience offered by an end-user GUI is apparent since the AVI format and codecs used by Video for Windows still remain in use, for example in VirtualDub.
See also
- GraphStudio – open source GraphEdit project
- DirectX Media Objects
- DirectX plugins
- DirectX Video Acceleration
- DirectShowPlayer
External links
- DirectShow on MSDN – official documentation
- J. River DirectShow Playback Guide – tutorial on DirectShow with general-purpose information
- VideoLab – video processing library with DirectShow support (free for non commercial purposes)
- AC3 Directshow Filter – AC3 audio filters with DirectShow support