Media Resource Control Protocol
Media Resource Control Protocol (MRCP) is a communication protocol used by speech servers to provide various services (such as speech recognition and speech synthesis) to their clients. MRCP relies on another protocol, such as Real Time Streaming Protocol (RTSP) or Session Initiation Protocol (SIP), for establishing a control session and audio streams between the client and the server.
MRCP uses clear-text signaling similar in style to HTTP and many other Internet protocols. Each message contains three sections: a first line, a header, and a body. The first line indicates the type of message and carries information such as response codes. The header contains a number of lines, each in the format header-name: header-value. The body, whose length is specified by the header, contains the details of the message.
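The three-part layout described above can be illustrated with a small parser. This is a minimal sketch, not a conforming MRCP implementation: the sample message is only loosely modeled on an MRCP version 1 request, and the use of CRLF line endings, a blank line before the body, and a Content-Length header to give the body length are assumptions made for illustration. The function name parse_message is hypothetical.

```python
def parse_message(raw: bytes) -> dict:
    """Split an HTTP-style text message into first line, headers, and body."""
    # Assumption: the header section ends at the first blank line (CRLF CRLF).
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")

    first_line = lines[0]

    # Each remaining line is expected to look like "name: value".
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Assumption: the body length is carried in a Content-Length header.
    length = int(headers.get("content-length", "0"))
    body = rest[:length]

    return {"first_line": first_line, "headers": headers, "body": body}


# Illustrative message only; the method name and request ID are made up.
example = (
    b"SPEAK 543257 MRCP/1.0\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 12\r\n"
    b"\r\n"
    b"Hello world!"
)

msg = parse_message(example)
print(msg["first_line"])
print(msg["headers"])
print(msg["body"])
```

Header names are lowercased in the sketch so that lookups do not depend on how the sender capitalized them.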
Like HTTP, MRCP uses a request-response model, with requests usually issued by the client. Responses may simply acknowledge receipt of the request or give other information about its processing. For example, an MRCP client may ask to send audio data for processing (say, for speech recognition), and the server could respond with a message containing a suitable port number to which the data should be sent. MRCP itself does not carry audio data; that is handled by another protocol, such as Real-time Transport Protocol (RTP).
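The request-response exchange can be sketched as follows. The first-line layouts used here (method, request ID, and version for a request; version, request ID, status code, and state for a response) are assumptions loosely based on MRCP version 1, and the helper names build_request and parse_response_line are hypothetical. The sketch only shows how a client might pair a response with the request it sent; it does not open a network connection or handle audio.

```python
def build_request(method: str, request_id: int, headers: dict, body: bytes = b"") -> bytes:
    """Assemble a text request: first line, header lines, blank line, body."""
    lines = [f"{method} {request_id} MRCP/1.0"]
    all_headers = dict(headers)
    if body:
        # Assumption: the body length is announced in a Content-Length header.
        all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + body


def parse_response_line(line: str) -> dict:
    """Parse an assumed response first line: version, request ID, status, state."""
    version, request_id, status, state = line.split(" ", 3)
    return {
        "version": version,
        "request_id": int(request_id),
        "status": int(status),
        "state": state,
    }


request = build_request("SPEAK", 543257, {"Content-Type": "text/plain"}, b"Hello world!")

# A made-up server reply acknowledging the request above.
response = parse_response_line("MRCP/1.0 543257 200 IN-PROGRESS")

# The client matches the reply to its request by ID and checks for a 2xx status.
acknowledged = response["request_id"] == 543257 and 200 <= response["status"] < 300
print(acknowledged)
```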
MRCP version 2 was first submitted as an Internet Draft and has since been published as RFC 6787. Version 2 uses SIP for managing sessions and audio streams between the server and the clients, whereas version 1 did not specify the underlying protocol.
MRCP has been adopted by a wide range of commercial voice applications, such as IBM WebSphere Voice Server, Microsoft Speech Server, LumenVox Speech Engine, Nuance Recognizer, and Nuance Vocalizer.
External links
- RFC 4463, A Media Resource Control Protocol (MRCP)
- UniMRCP, An open source cross-platform MRCP implementation