TeraGrid
TeraGrid was an e-Science grid computing infrastructure combining resources at eleven partner sites. The project started in 2001 and operated from 2004 through 2011.
The TeraGrid integrated high-performance computers, data resources and tools, and experimental facilities. Resources included more than a petaflop of computing capability and more than 30 petabytes of online and archival data storage, with rapid access and retrieval over high-performance computer network connections. Researchers could also access more than 100 discipline-specific databases.
TeraGrid was coordinated through the Grid Infrastructure Group (GIG) at the University of Chicago, working in partnership with the resource provider sites in the United States.
History
The US National Science Foundation (NSF) issued a solicitation, from program director Richard L. Hilderbrandt, asking for a "distributed terascale facility".
The TeraGrid project was launched in August 2001 with $53 million in funding to four sites: the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign; the San Diego Supercomputer Center (SDSC) at the University of California, San Diego; the University of Chicago Argonne National Laboratory; and the Center for Advanced Computing Research (CACR) at the California Institute of Technology in Pasadena, California.
The design was meant to be an extensible distributed open system from the start.
In October 2002, the Pittsburgh Supercomputing Center (PSC) at Carnegie Mellon University and the University of Pittsburgh joined the TeraGrid as major new partners when NSF announced $35 million in supplementary funding. The TeraGrid network was transformed through the ETF project from a 4-site mesh to a dual-hub backbone network with connection points in Los Angeles and at the Starlight facilities in Chicago.
In October 2003, NSF awarded $10 million to add four sites to TeraGrid as well as to establish a third network hub, in Atlanta. These new sites were Oak Ridge National Laboratory (ORNL), Purdue University, Indiana University, and the Texas Advanced Computing Center (TACC) at The University of Texas at Austin.
TeraGrid construction was also made possible through corporate partnerships with Sun Microsystems, IBM, Intel Corporation, Qwest Communications, Juniper Networks, Myricom, Hewlett-Packard Company, and Oracle Corporation.
TeraGrid construction was completed in October 2004, at which time the TeraGrid facility began full production.
Operation
In August 2005, NSF's newly created Office of Cyberinfrastructure extended support for another five years with a $150 million set of awards. These included $48 million for coordination and user support to the Grid Infrastructure Group at the University of Chicago, led by Charlie Catlett.
Using high-performance network connections, the TeraGrid featured high-performance computers, data resources and tools, and high-end experimental facilities around the USA. The work supported by the project is sometimes called e-Science.
In 2006, the University of Michigan's School of Information began a study of TeraGrid.
In May 2007, TeraGrid integrated resources included more than 250 teraflops of computing capability and more than 30 petabytes (quadrillions of bytes) of online and archival data storage with rapid access and retrieval over high-performance networks. Researchers could access more than 100 discipline-specific databases. In mid-2009, NSF extended the operation of TeraGrid to 2011. By late 2009, the TeraGrid resources had grown to 2 petaflops of computing capability and more than 60 petabytes of storage.
Transition to XSEDE
A follow-on project was approved in May 2011. In July 2011, a partnership of 17 institutions announced the Extreme Science and Engineering Discovery Environment (XSEDE). NSF announced funding for the XSEDE project for five years, at $121 million.
The new group was to be led by John Towns at the University of Illinois's National Center for Supercomputing Applications.
Architecture
TeraGrid resources were integrated through a service-oriented architecture, in that each resource provided a "service" defined in terms of interface and operation. Computational resources ran a set of software packages called "Coordinated TeraGrid Software and Services" (CTSS). CTSS provided a familiar user environment on all TeraGrid systems, allowing scientists to more easily port code from one system to another. CTSS also provided integrative functions such as single sign-on, remote job submission, workflow support, and data movement tools. CTSS included the Globus Toolkit, Condor, distributed accounting and account management software, verification and validation software, and a set of compilers, programming tools, and environment variables.
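As a rough illustration of what "a service defined in terms of interface and operation" can mean in practice, the sketch below models two resources that expose the same job-submission interface. It is not TeraGrid or CTSS code: the names `ResourceService`, `ClusterA`, `ClusterB`, and `run_everywhere` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Protocol


class ResourceService(Protocol):
    """Hypothetical interface: what a resource exposes, not how it is built."""

    name: str

    def submit(self, job: str) -> str:
        """Accept a job description and return a job identifier."""
        ...


@dataclass
class ClusterA:
    name: str = "cluster-a"  # illustrative name, not a real TeraGrid system

    def submit(self, job: str) -> str:
        return f"{self.name}:{job}"


@dataclass
class ClusterB:
    name: str = "cluster-b"  # illustrative name, not a real TeraGrid system

    def submit(self, job: str) -> str:
        return f"{self.name}:{job}"


def run_everywhere(services: list[ResourceService], job: str) -> list[str]:
    # The caller depends only on the shared interface, so the same code
    # works against any resource that implements it.
    return [service.submit(job) for service in services]


job_ids = run_everywhere([ClusterA(), ClusterB()], "simulate")
print(job_ids)
```

The point of the design is that user code written against the common interface does not change when it moves from one resource to another, which is the portability benefit the paragraph above attributes to a common software environment.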
TeraGrid used a dedicated 10 gigabits per second fiber-optic backbone network, with hubs in Chicago, Denver, and Los Angeles. All resource provider sites connected to a backbone node at 10 gigabits per second. Users accessed the facility through national research networks such as the Internet2 Abilene backbone and National LambdaRail.
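To give a sense of scale for a 10 gigabits per second link, the following sketch computes an idealized transfer time. The 1 terabyte data set is an assumed example size, and the calculation ignores protocol overhead, contention, and storage speed, so real transfers would take longer.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealized transfer time: total bits divided by the link rate."""
    return size_bytes * 8 / link_bits_per_second


LINK_RATE = 10e9      # 10 gigabits per second, the backbone rate stated above
ONE_TERABYTE = 1e12   # decimal terabyte; an assumed example size

seconds = transfer_seconds(ONE_TERABYTE, LINK_RATE)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Under these assumptions, a terabyte takes 800 seconds, or a little over 13 minutes, at full line rate.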
Usage
TeraGrid users primarily came from U.S. universities. There were roughly 4,000 users at over 200 universities. Academic researchers in the United States could obtain exploratory or development allocations (measured roughly in "CPU hours") based on an abstract describing the work to be done. More extensive allocations involved a proposal reviewed during a quarterly peer-review process. All allocation proposals were handled through the TeraGrid website. Proposers selected the scientific discipline that most closely described their work, which enabled reporting on the allocation and use of TeraGrid by scientific discipline. As of July 2006, the scientific profile of TeraGrid allocations and usage was:
Allocated (%) | Used (%) | Scientific Discipline |
---|---|---|
19 | 23 | Molecular Biosciences |
17 | 23 | Physics |
14 | 10 | Astronomical Sciences |
12 | 21 | Chemistry |
10 | 4 | Materials Research |
8 | 6 | Chemical, Thermal Systems |
7 | 7 | Atmospheric Sciences |
3 | 2 | Advanced Scientific Computing |
2 | 0.5 | Earth Sciences |
2 | 0.5 | Biological and Critical Systems |
1 | 0.5 | Ocean Sciences |
1 | 0.5 | Cross-Disciplinary Activities |
1 | 0.5 | Computer and Computation Research |
0.5 | 0.25 | Integrative Biology and Neuroscience |
0.5 | 0.25 | Mechanical and Structural Systems |
0.5 | 0.25 | Mathematical Sciences |
0.5 | 0.25 | Electrical and Communication Systems |
0.5 | 0.25 | Design and Manufacturing Systems |
0.5 | 0.25 | Environmental Biology |
Each of these discipline categories corresponds to a specific program area of the National Science Foundation.
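One way to read the July 2006 table is to compare each discipline's share of usage with its share of allocations. The sketch below transcribes the table rows and lists the disciplines whose usage share exceeded their allocation share. The variable names are illustrative, and the comparison is only a reading aid, not an analysis performed by TeraGrid.

```python
# (allocated %, used %, discipline), transcribed from the July 2006 table
PROFILE = [
    (19, 23, "Molecular Biosciences"),
    (17, 23, "Physics"),
    (14, 10, "Astronomical Sciences"),
    (12, 21, "Chemistry"),
    (10, 4, "Materials Research"),
    (8, 6, "Chemical, Thermal Systems"),
    (7, 7, "Atmospheric Sciences"),
    (3, 2, "Advanced Scientific Computing"),
    (2, 0.5, "Earth Sciences"),
    (2, 0.5, "Biological and Critical Systems"),
    (1, 0.5, "Ocean Sciences"),
    (1, 0.5, "Cross-Disciplinary Activities"),
    (1, 0.5, "Computer and Computation Research"),
    (0.5, 0.25, "Integrative Biology and Neuroscience"),
    (0.5, 0.25, "Mechanical and Structural Systems"),
    (0.5, 0.25, "Mathematical Sciences"),
    (0.5, 0.25, "Electrical and Communication Systems"),
    (0.5, 0.25, "Design and Manufacturing Systems"),
    (0.5, 0.25, "Environmental Biology"),
]

total_allocated = sum(allocated for allocated, _, _ in PROFILE)
total_used = sum(used for _, used, _ in PROFILE)

# Disciplines that consumed a larger share of usage than of allocations.
heavy_users = [name for allocated, used, name in PROFILE if used > allocated]

print(total_allocated, total_used)
print(heavy_users)
```

Both columns sum to 100, and three disciplines (Molecular Biosciences, Physics, and Chemistry) account for a larger share of usage than of allocations.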
Starting in 2006, TeraGrid provided application-specific services to Science Gateway partners, who served discipline-specific scientific and education communities, generally via a web portal. Through the Science Gateways program, TeraGrid aimed to increase by at least an order of magnitude the number of scientists, students, and educators able to use TeraGrid.
Resource providers
- Argonne National Laboratory (ANL), operated by the University of Chicago and the Department of Energy
- Indiana University
- Louisiana Optical Network Initiative (LONI)
- National Center for Atmospheric Research (NCAR)
- National Center for Supercomputing Applications (NCSA)
- National Institute for Computational Sciences (NICS), operated by the University of Tennessee at Oak Ridge National Laboratory
- Oak Ridge National Laboratory (ORNL)
- Pittsburgh Supercomputing Center (PSC), operated by the University of Pittsburgh and Carnegie Mellon University
- Purdue University
- San Diego Supercomputer Center (SDSC)
- Texas Advanced Computing Center (TACC)
Similar projects
- Distributed European Infrastructure for Supercomputing Applications (DEISA), integrating eleven European supercomputing centers
- Enabling Grids for E-sciencE (EGEE)
- National Research Grid Initiative (NAREGI), involving several supercomputer centers in Japan from 2003
- Open Science Grid - a distributed computing infrastructure for scientific research
- Extreme Science and Engineering Discovery Environment (XSEDE) - the TeraGrid successor