Database journalism
Database journalism or structured journalism is a principle in information management
whereby news content is organized around structured pieces of data, as opposed to news stories.
Communication scholar Wiebke Loosen defines database journalism as "supplying databases with raw material - articles, photos and other content - by using medium-agnostic publishing systems and then making it available for different devices."
Some argue that such organization allows for a more efficient workflow. Reginald Chua, editor of data and innovation at Thomson Reuters, talks of structured journalism as a means to "maximize the shelf-life of news content" and of "extracting more value" out of content.
History and development of database journalism
Computer programmer Adrian Holovaty wrote what is now considered the manifesto of database journalism in September 2006. In this article, Holovaty explained that most material collected by journalists is "structured information: the type of information that can be sliced-and-diced, in an automated fashion, by computers". For him, a key difference between database journalism and traditional journalism is that the latter produces articles as the final product while the former produces databases of facts that are continually maintained and improved.
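The idea of structured information that computers can slice and dice can be illustrated with a minimal sketch. The record type, field names and sample entries below are hypothetical and made up for illustration; they are not taken from Holovaty's article or from any real project. The sketch only shows that facts stored as fields, rather than as prose, can be regrouped automatically.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Incident:
    """One structured fact. Every field name here is a hypothetical example."""
    kind: str
    block: str
    occurred_on: date


# Made-up sample records, not real data.
INCIDENTS = [
    Incident("theft", "100 N Example St", date(2006, 9, 1)),
    Incident("theft", "200 S Sample Ave", date(2006, 9, 2)),
    Incident("vandalism", "100 N Example St", date(2006, 9, 2)),
]


def count_by(records, field):
    """Count records grouped by the value of one field ("slicing")."""
    return Counter(getattr(record, field) for record in records)


def on_block(records, block):
    """Return only the records filed under one block ("dicing")."""
    return [record for record in records if record.block == block]


if __name__ == "__main__":
    print(count_by(INCIDENTS, "kind"))
    print(on_block(INCIDENTS, "100 N Example St"))
```

The same list of records answers both questions without any article being rewritten, which is the contrast with the article-as-final-product model described above.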
2007 saw rapid development in database journalism. A December 2007 investigation by The Washington Post (Fixing DC's schools) aggregated dozens of items about more than 135 schools in a database that distributed content on a map, on individual webpages or within articles.
The importance of database journalism was highlighted when the Knight Foundation awarded $1,100,000 to Adrian Holovaty's EveryBlock project, which offers local news at the level of the city block, drawing from existing data. The Pulitzer Prize received by the St. Petersburg Times' PolitiFact in April 2009 has been considered a Color of Money moment by Aron Pilhofer, head of the New York Times technology team, hinting that database journalism has been accepted by the trade and will develop, much like computer-assisted reporting (CAR) did in the early 1990s.
Seeing journalistic content as data has pushed several news organizations to release APIs, including the BBC, the Guardian, the New York Times and the American National Public Radio. By doing so, they let others aggregate the data they have collected and organized. In other words, they acknowledge that the core of their activity is not story-writing, but data gathering and data distribution.
Beginning in the early years of the 21st century, some researchers expanded the conceptual dimension of databases in journalism, and in digital journalism or cyberjournalism. This conceptual approach considers databases a defining feature of digital journalism, expanding their meaning and identifying them with a specific code. It contrasts with the earlier approach, reflected in some of the systematized studies of the 1990s, which perceived databases as sources for the production of journalistic stories, that is, as tools.
Difference with data-driven journalism
Data-driven journalism is a process whereby journalists build stories using numerical data or databases as a primary material. In contrast, database journalism is an organizational structure for content. It focuses on the constitution and maintenance of the database upon which web or mobile applications can be built, and from which journalists can extract data to carry out data-driven stories.
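The distinction can be sketched in code, under stated assumptions. In the example below, the table name, columns and rows are hypothetical and invented for illustration. One maintained database feeds two consumers: an application-style view that renders a single record, and a query a journalist might run as the starting point of a data-driven story.

```python
import sqlite3


def build_database():
    """Create an in-memory database with made-up inspection records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inspections ("
        "venue TEXT PRIMARY KEY, district INTEGER, violations INTEGER)"
    )
    conn.executemany(
        "INSERT INTO inspections VALUES (?, ?, ?)",
        [
            ("Example Diner", 1, 12),
            ("Sample Cafe", 1, 3),
            ("Placeholder Grill", 2, 7),
        ],
    )
    conn.commit()
    return conn


def venue_page(conn, venue):
    """Application view: render one record as a line of page text."""
    row = conn.execute(
        "SELECT venue, district, violations FROM inspections WHERE venue = ?",
        (venue,),
    ).fetchone()
    if row is None:
        return None
    return f"{row[0]} (district {row[1]}): {row[2]} violations"


def violations_by_district(conn):
    """Story query: aggregate the same records by district."""
    return conn.execute(
        "SELECT district, SUM(violations) FROM inspections "
        "GROUP BY district ORDER BY district"
    ).fetchall()


if __name__ == "__main__":
    db = build_database()
    print(venue_page(db, "Sample Cafe"))
    print(violations_by_district(db))
```

In this sketch the database is the maintained asset, which corresponds to database journalism, while the aggregate query is the kind of extraction that data-driven journalism would turn into a story.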
Examples of database journalism
Early projects in this new database journalism were mySociety in the UK, launched in 2004, and Adrian Holovaty's chicagocrime.org, released in 2005.
As of 2011, several databases could be considered journalistic in themselves. They include EveryBlock, OpenCorporates and VoteWatch.eu.