Cloud database
A cloud database is a database that runs on a cloud computing platform, such as Amazon EC2, GoGrid and Rackspace. There are two common deployment models: users can run databases on the cloud independently, using a virtual machine image, or they can purchase access to a database service maintained by a cloud database provider. Of the databases available on the cloud, some are SQL-based and some use a NoSQL data model.
A third option is managed database hosting on the cloud, where the database is not offered as a service, but the cloud provider hosts the database and manages it on the application owner's behalf. For example, cloud provider Rackspace offers managed hosting for MySQL databases.
Deployment Model
There are two primary methods to run a database on the cloud:
- Virtual machine image - cloud platforms allow users to purchase virtual machine instances for a limited time. It is possible to run a database on these virtual machines. Users can either upload their own machine image with a database installed on it, or use ready-made machine images that already include an optimized installation of a database. For example, Oracle provides a ready-made machine image with an installation of Oracle Database 11g Enterprise Edition on Amazon EC2.
- Database as a service - some cloud platforms offer options for using a database as a service, without physically launching a virtual machine instance for the database. In this configuration, application owners do not have to install and maintain the database on their own. Instead, the database service provider takes responsibility for installing and maintaining the database, and application owners pay according to their usage. For example, Amazon Web Services provides two database services as part of its cloud offering: SimpleDB, which is a NoSQL key-value store, and Amazon Relational Database Service, which is an SQL-based database service with a MySQL interface.
A third option is managed database hosting on the cloud, where the database is not offered as a service, but the cloud provider hosts the database and manages it on the application owner's behalf. For example, cloud provider Rackspace offers managed hosting for MySQL databases.
Architecture and Common Characteristics of Cloud Database Services
- Most database services offer web-based consoles, which the end user can use to provision and configure database instances. For example, the Amazon Web Services web console enables users to launch database instances, create snapshots (similar to backups) of databases, and monitor database statistics.
- Database services consist of a database manager component, which controls the underlying database instances using a service API. The service API is exposed to the end user, and permits users to perform maintenance and scaling operations on their database instances. For example, the Amazon Relational Database Service's service API enables creating a database instance, modifying the resources available to a database instance, deleting a database instance, creating a snapshot (similar to a backup) of a database, and restoring a database from a snapshot.
- Database services take care of scalability and high availability of the database. Scalability features differ between vendors: some offer auto-scaling, while others let the user scale up through an API but do not scale automatically. Providers typically commit to a certain level of availability (e.g. 99.9% or 99.99%).
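The service API operations listed above (create, modify, delete, snapshot, restore) can be illustrated with a small in-memory model. This is only a sketch: the `DatabaseManager` and `DatabaseInstance` classes and all method names are hypothetical and do not correspond to any vendor's real API. It assumes that a snapshot is a point-in-time copy and that scaling is requested explicitly by the user.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DatabaseInstance:
    """Hypothetical record for one managed database instance."""
    name: str
    storage_gb: int
    data: Dict[str, str] = field(default_factory=dict)


class DatabaseManager:
    """Toy, in-memory stand-in for a database manager component."""

    def __init__(self) -> None:
        self.instances: Dict[str, DatabaseInstance] = {}
        self.snapshots: Dict[str, Dict[str, str]] = {}

    def create_instance(self, name: str, storage_gb: int) -> DatabaseInstance:
        if name in self.instances:
            raise ValueError(f"instance {name!r} already exists")
        self.instances[name] = DatabaseInstance(name, storage_gb)
        return self.instances[name]

    def modify_instance(self, name: str, storage_gb: int) -> None:
        # Scaling is user-initiated here; nothing scales automatically.
        self.instances[name].storage_gb = storage_gb

    def delete_instance(self, name: str) -> None:
        del self.instances[name]

    def create_snapshot(self, name: str, snapshot_id: str) -> None:
        # Copy the data so later writes do not alter the snapshot.
        self.snapshots[snapshot_id] = dict(self.instances[name].data)

    def restore_from_snapshot(
        self, snapshot_id: str, new_name: str, storage_gb: int
    ) -> DatabaseInstance:
        # Restoring creates a new instance populated from the snapshot.
        instance = self.create_instance(new_name, storage_gb)
        instance.data = dict(self.snapshots[snapshot_id])
        return instance


manager = DatabaseManager()
db = manager.create_instance("orders", storage_gb=10)
db.data["order:1"] = "pending"
manager.create_snapshot("orders", "orders-snap-1")
db.data["order:1"] = "shipped"  # written after the snapshot was taken
manager.modify_instance("orders", storage_gb=20)
restored = manager.restore_from_snapshot("orders-snap-1", "orders-restored", 20)
manager.delete_instance("orders")

print(restored.data["order:1"])   # pending
print(sorted(manager.instances))  # ['orders-restored']
```

The restored instance holds the value from the time of the snapshot, not the later write, which is the property that makes a snapshot comparable to a backup.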
Data Model
It is also important to differentiate between cloud databases that are relational and those that are non-relational, or NoSQL:
- SQL databases, such as Oracle Database, Microsoft SQL Server and MySQL, are one type of database which can be run on the cloud (either as a virtual machine image or as a service, depending on the vendor). SQL databases are difficult to scale, meaning they are not natively suited to a cloud environment, although cloud database services based on SQL are attempting to address this challenge.
- NoSQL databases, such as Apache Cassandra, CouchDB and MongoDB, are another type of database which can run on the cloud. NoSQL databases are built to service heavy read/write loads and are able to scale up and down easily, and therefore they are more natively suited to running on the cloud. However, most contemporary applications are built around an SQL data model, so working with NoSQL databases often requires a complete rewrite of application code.
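The difference in application code between the two data models can be sketched as follows. The example uses Python's built-in `sqlite3` module as a stand-in for an SQL database and a plain dictionary of JSON documents as a stand-in for a NoSQL key-value or document store. The table names, keys, and data are invented for illustration, and a real NoSQL product would expose its own client API.

```python
import json
import sqlite3

# Relational model: a fixed schema, with data split across tables
# and recombined at query time by a join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, "disk"), (2, 1, "cable")],
)
sql_items = [
    row[0]
    for row in conn.execute(
        "SELECT o.item FROM orders o "
        "JOIN customers c ON c.id = o.customer_id "
        "WHERE c.name = ? ORDER BY o.id",
        ("Ada",),
    )
]

# Key-value / document model: no schema and no join. The application
# stores one self-contained document per key and reads it back whole.
store = {}
store["customer:1"] = json.dumps({"name": "Ada", "orders": ["disk", "cable"]})
doc_items = json.loads(store["customer:1"])["orders"]

print(sql_items)
print(doc_items)
```

Both halves return the same list of items, but the query logic lives in SQL in the first case and in application code and key design in the second, which is why moving an existing SQL-based application to a NoSQL store tends to require rewriting its data access code.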
Cloud Database Vendors
The following table lists the main database vendors with a cloud database offering, organized by deployment model (virtual machine image vs. database as a service) and by data model (SQL vs. NoSQL). See the references next to the vendor names for more information.

| | Virtual Machine Deployment | Database as a Service |
|---|---|---|
| SQL Data Model | | Amazon Relational Database Service (MySQL); Heroku PostgreSQL as a Service (shared and dedicated database options); Xeround Cloud Database (MySQL front-end) |
| NoSQL Data Model | CouchDB on Amazon EC2; Hadoop on Amazon EC2; Neo4j on Amazon EC2 or Microsoft Azure; MongoDB on Amazon EC2 or Microsoft Azure | Amazon SimpleDB; Google App Engine Datastore; CouchDB Hosted Database; MongoDB Database as a Service (several options) |