Thrift (protocol)
Thrift is an interface definition language that is used to define and create services for numerous languages. It is used as a remote procedure call (RPC) framework and was developed at Facebook for "scalable cross-language services development". It combines a software stack with a code generation engine to build services that work seamlessly, and with varying degrees of efficiency, between C#, C++ (on POSIX-compliant systems), Cappuccino, Cocoa, Erlang, Go, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, and Smalltalk. Although developed at Facebook, it is now an open source project in the Apache Software Foundation. The implementation was described in an April 2007 technical paper released by Facebook, now hosted on Apache. To put it simply, Apache Thrift is a binary communication protocol.
Architecture
Thrift includes a complete stack for creating clients and servers. The top part is code generated from the Thrift definition file. From this file, each service yields generated client and processor code and, in contrast to built-in types, user-defined data structures also result in generated code. The protocol and transport layers are part of the runtime library. With Thrift, it is possible to define a service and change the protocol and transport without recompiling your code. Besides the client part, Thrift includes server infrastructure to tie protocols and transports together, such as blocking, non-blocking, and multi-threaded servers. The underlying I/O part of the stack is implemented differently for different languages.
Thrift supports a number of protocols:
- TBinaryProtocol – A straightforward binary format that encodes numeric values as binary. It is faster than the text protocol but more difficult to debug.
- TCompactProtocol – Very efficient, dense encoding of data.
- TDebugProtocol – Uses a human-readable text format to aid in debugging.
- TDenseProtocol – Similar to TCompactProtocol, but strips the meta information from what is transmitted.
- TJSONProtocol – Uses JSON for encoding of data.
- TSimpleJSONProtocol – A write-only protocol using JSON. Suitable for parsing by scripting languages.
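The size difference between a plain binary encoding and a compact one can be illustrated with a minimal sketch. It assumes that the plain case writes a 32-bit integer as four big-endian bytes and that the compact case writes it as a zigzag-mapped variable-length integer. The function names are illustrative and are not part of the Thrift library.

```python
import struct


def encode_i32_fixed(value: int) -> bytes:
    """Fixed-width encoding: always 4 bytes, big-endian, signed."""
    return struct.pack(">i", value)


def zigzag32(value: int) -> int:
    """Map a signed 32-bit int to unsigned so small magnitudes stay small."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set if more follow."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_i32_compact(value: int) -> bytes:
    """Compact encoding (assumed scheme): zigzag mapping, then varint."""
    return encode_varint(zigzag32(value))


if __name__ == "__main__":
    for n in (1, -1, 300, 2147483647):
        print(n, len(encode_i32_fixed(n)), len(encode_i32_compact(n)))
```

Small values take one or two bytes in the compact form instead of four, at the cost of up to five bytes for values near the 32-bit limits.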
The supported transports are:
- TFileTransport – This transport writes to a file.
- TFramedTransport – This transport is required when using a non-blocking server. It sends data in frames, where each frame is preceded by its length.
- TMemoryTransport – Uses memory for I/O. The Java implementation uses a simple ByteArrayOutputStream internally.
- TSocket – Uses blocking socket I/O for transport.
- TZlibTransport – Performs compression using zlib. Used in conjunction with another transport. Not available in the Java implementation.
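Length-prefixed framing, as used by a framed transport, can be sketched as follows. The sketch assumes a 4-byte big-endian length prefix and uses an in-memory buffer in place of a socket. The helper names are hypothetical and do not correspond to Thrift library calls.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write the payload preceded by its length (assumed 4 bytes, big-endian)."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frame(stream) -> bytes:
    """Read one complete frame, or raise EOFError if the stream is truncated."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended before a complete frame header")
    (length,) = struct.unpack(">I", header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("stream ended before the frame was complete")
    return payload


# Two messages written back to back can be separated again by the reader.
buffer = io.BytesIO()
write_frame(buffer, b"ping")
write_frame(buffer, b"a longer message")
buffer.seek(0)
first = read_frame(buffer)
second = read_frame(buffer)
```

Because the reader knows the size of each message in advance, it can wait until a whole frame has arrived before handing it on, which is convenient for non-blocking I/O.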
Thrift also provides a number of servers, which are
- TNonblockingServer – A multi-threaded server using non-blocking I/O (the Java implementation uses NIO channels). TFramedTransport must be used with this server.
- TSimpleServer – A single-threaded server using standard blocking I/O. Useful for testing.
- TThreadPoolServer – A multi-threaded server using standard blocking I/O.
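The layering described in this section, in which application code stays the same while the protocol or transport is swapped, can be sketched in a few lines. All class names below are hypothetical stand-ins written for this illustration. They are not the Thrift runtime classes, and the payload is limited to a list of integers.

```python
import json
import struct


class MemoryTransport:
    """Hypothetical transport: moves raw bytes through an in-memory buffer."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer += data

    def read_all(self) -> bytes:
        data = bytes(self._buffer)
        self._buffer.clear()
        return data


class BinaryProtocol:
    """Hypothetical protocol: a list of ints as big-endian 32-bit values."""

    def encode(self, numbers):
        return struct.pack(f">{len(numbers)}i", *numbers)

    def decode(self, data):
        return list(struct.unpack(f">{len(data) // 4}i", data))


class JsonProtocol:
    """Hypothetical protocol: the same list of ints as JSON text."""

    def encode(self, numbers):
        return json.dumps(numbers).encode("utf-8")

    def decode(self, data):
        return json.loads(data.decode("utf-8"))


class Client:
    """Application code depends only on the protocol and transport interfaces."""

    def __init__(self, protocol, transport):
        self.protocol = protocol
        self.transport = transport

    def send(self, numbers):
        self.transport.write(self.protocol.encode(numbers))

    def receive(self):
        return self.protocol.decode(self.transport.read_all())


def round_trip(protocol, numbers):
    """Send and receive through a fresh in-memory transport."""
    client = Client(protocol, MemoryTransport())
    client.send(numbers)
    return client.receive()
```

Calling round_trip with either protocol object returns the original list, while the bytes that pass through the transport differ. The Client class is unchanged in both cases.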
Benefits
Some stated benefits of Thrift include:
- Cross-language serialization with lower overhead than alternatives such as SOAP, due to the use of a binary format.
- A lean and clean library. No framework to code to. No XML configuration files.
- The language bindings feel natural. For example, Java uses ArrayList and C++ uses std::vector.
- The application-level wire format and the serialization-level wire format are cleanly separated. They can be modified independently.
- The predefined serialization styles include binary, HTTP-friendly, and compact binary.
- Doubles as cross-language file serialization.
- Soft versioning of the protocol. Thrift does not require a centralized and explicit mechanism like major-version/minor-version. Loosely coupled teams can freely evolve RPC calls.
- No build dependencies or non-standard software. No mix of incompatible software licenses.
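One common way to obtain soft versioning is to tag every field with a numeric identifier, so that a reader can skip fields it does not recognise. The sketch below assumes such a scheme and uses a deliberately simplified layout (a 2-byte field identifier and a 4-byte length before each value). This layout is not Thrift's actual wire format, and the function names and field numbers are hypothetical.

```python
import struct


def encode_fields(fields: dict) -> bytes:
    """Encode {field_id: bytes} as repeated (id, length, value) entries."""
    out = bytearray()
    for field_id, value in fields.items():
        out += struct.pack(">hI", field_id, len(value))
        out += value
    return bytes(out)


def decode_known_fields(data: bytes, known_ids: set) -> dict:
    """Decode entries, keeping known field ids and skipping all others."""
    result = {}
    offset = 0
    while offset < len(data):
        field_id, length = struct.unpack_from(">hI", data, offset)
        offset += 6  # size of the (id, length) header
        if field_id in known_ids:
            result[field_id] = data[offset:offset + length]
        offset += length  # advance past the value whether or not it was kept
    return result


# A newer writer adds field 3; an older reader that knows only 1 and 2 still works.
new_message = encode_fields({1: b"alice", 2: b"555-0100", 3: b"new-field"})
old_reader_view = decode_known_fields(new_message, {1, 2})

# An older writer omits field 3; a newer reader simply finds it absent.
old_message = encode_fields({1: b"alice"})
new_reader_view = decode_known_fields(old_message, {1, 2, 3})
```

Neither side needs to agree on a version number. Each reader relies only on the field identifiers it already knows.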
Creating a Thrift service
Thrift is written in C++, but can create code for a number of languages. To create a Thrift service, one has to write Thrift files that describe it, generate the code in the destination language, and write some code to start the server and call it from the client. Thrift generates the code from this descriptive information. For instance, given a description file that defines PhoneType and Phone, the generated Java code makes PhoneType a simple enum inside the POJO for the Phone class.
See also
- Apache Avro
- Abstract Syntax Notation One (ASN.1)
- Caucho's Hessian
- Cisco's Etch
- Google's Protocol Buffers
- Microsoft's "M"
- ZeroC's Internet Communications Engine (ICE)