National Science Foundation

Mercury, the first TeraGrid cluster, deployed at the National Center for Supercomputing Applications (NCSA), runs on Intel's Itanium architecture. The 512-processor cluster has a peak performance of 2.7 teraflops (trillions of calculations per second).

Credit: NCSA


From Supercomputing to the TeraGrid

Image of a Cray supercomputer.
Credit: NCSA, University of Illinois at Urbana-Champaign

This history points out some of the landmarks along the route the National Science Foundation (NSF) has traveled to bring high-performance computing to the nation’s researchers.

The Early History: 1960s-1980s. NSF's investment in the nation's computational infrastructure began modestly in the 1960s, when NSF funded a number of campus computing centers. That support was short-lived, however, and by the early 1980s, several reports from the scientific community noted a dramatic lack of advanced computing resources available to researchers at American universities. The most influential of these was a joint agency study edited by Peter Lax and released in December 1982.

The Lax Report led to significant new NSF support for high-end computing, which in turn led directly to the Supercomputer Centers program.

The Supercomputer Centers: 1985-1997. NSF established five of these centers in 1985 and 1986:

  • The Cornell Theory Center (CTC) at Cornell University
  • The John von Neumann Center (JVNC) in Princeton, New Jersey
  • The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign
  • The Pittsburgh Supercomputing Center (PSC), operated jointly by Carnegie Mellon University, the University of Pittsburgh and Westinghouse Electric Corporation
  • The San Diego Supercomputer Center (SDSC) at the University of California, San Diego

For the next 12 years, these centers served as cornerstones of the nation's high-performance computing and communications strategy. They helped push the limits of advanced computing hardware and software while providing supercomputer access to a broad cross-section of academic researchers, regardless of discipline or funding agency. The centers were also instrumental in advancing network infrastructure. In 1986, the centers and the NSF-supported National Center for Atmospheric Research (NCAR) in Colorado became the first nodes on the NSFNET backbone. From 1989 to 1995, the Illinois, Pittsburgh and San Diego centers helped push the frontiers of high-speed networking as participants in the then-bleeding-edge Gigabit Network Testbed Projects, supported by NSF and the Defense Advanced Research Projects Agency (DARPA). In 1995, after NSFNET was decommissioned, the centers became the first nodes on NSF's very-high-performance Backbone Network Service (vBNS) for research and education. (See "A Brief History of NSF and the Internet" and "Networking for Tomorrow" for more details.)

In 1990, following a review of the supercomputer centers program, NSF extended support for CTC, NCSA, PSC and SDSC through 1995. In 1994, that support was extended again for another two years, through 1997, while a task force chaired by Edward Hayes considered the future of the program.

Out of the recommendations of the Hayes Report came a new program designed to build on and replace the centers: the Partnerships for Advanced Computational Infrastructure (PACI).

Partnerships for Advanced Computational Infrastructure: 1997-2004. The National Science Board announced the two PACI awardees in March 1997:

  • The National Computational Science Alliance: a consortium led by NCSA, with participation by Partners for Advanced Computational Services at Boston University, the University of Kentucky, the Ohio Supercomputer Center, the University of New Mexico, and the University of Wisconsin.
  • The National Partnership for Advanced Computational Infrastructure (NPACI): a consortium led by SDSC, with participation by mid-range computing centers at Caltech, the University of Michigan, and the Texas Advanced Computing Center at the University of Texas at Austin.

In addition to leading-edge and mid-range sites, the PACI partnerships involved nearly 100 sites across the country in efforts to make more efficient use of high-end computing in all areas of science and engineering. The partnerships also collaborated on the Education, Outreach and Training (EOT) PACI.

The Alliance and NPACI continued to give academic researchers access to the most powerful computing resources available. These resources included the first academic teraflops system, a computer capable of one trillion operations per second, and some of the first large-scale Linux clusters for academia. At the same time, the partnerships were instrumental in fostering the maturation of grid computing and its widespread adoption by the scientific community and industry. Grid computing connects separate computing resources in order to apply their collective power to solve computationally intensive problems.
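
The core idea is easy to sketch. The short Python fragment below is a toy illustration only, not the middleware the PACI partners or the TeraGrid actually ran: the site names are borrowed from this article, and run_on_site and run_on_grid are hypothetical helpers that farm independent work units out to separate resource pools and gather the results.

    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical resource pools standing in for geographically
    # separate computing sites on a grid (names borrowed from the text).
    SITES = ["NCSA", "SDSC", "PSC", "ANL"]

    def run_on_site(site, work_unit):
        # In a real grid, middleware would stage data to the remote site,
        # submit the job to its scheduler and fetch the output. Here we
        # just compute locally to show the division of labor.
        return site, sum(i * i for i in range(work_unit))

    def run_on_grid(work_units):
        # Assign each independent work unit to a site, round-robin,
        # and collect the results as they complete.
        with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
            futures = [
                pool.submit(run_on_site, SITES[i % len(SITES)], w)
                for i, w in enumerate(work_units)
            ]
            return [f.result() for f in futures]

    if __name__ == "__main__":
        for site, result in run_on_grid([10_000, 20_000, 30_000, 40_000]):
            print(site, result)

The essential property is that the work units are independent of one another, so adding sites adds usable capacity.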

The PACI partners were involved in virtually every major grid-computing initiative, from the Grid Physics Network to the National Virtual Observatory to the George E. Brown, Jr. Network for Earthquake Engineering Simulation. The PACI partners were also driving forces in recognizing both the critical scientific importance of massive data collections and the technical challenges of accessing them. Following the sunset of the PACI program, NSF also continued core support for NCSA and SDSC to make more large-scale HPC resources available and to stimulate the expansion of cyberinfrastructure capabilities for the nation's scientists and engineers.

Terascale Initiatives: 2000-2004. In response to a 1999 report by the President's Information Technology Advisory Committee, NSF embarked on a series of "terascale" initiatives to acquire: (1) computers capable of trillions of operations per second (teraflops); (2) disk-based storage systems with capacities measured in trillions of bytes (terabytes); and (3) networks with bandwidths of billions of bits (gigabits) per second.
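
As a rough aid to the prefixes, the lines below (plain Python, illustrative arithmetic only) write the three terascale targets out in raw digits.

    # Illustrative arithmetic only: the three "terascale" targets
    # from the text, written out in raw digits.
    TERA = 10**12                  # trillion
    GIGA = 10**9                   # billion

    teraflops = 1 * TERA           # operations per second
    terabyte  = 1 * TERA           # bytes of disk
    gigabit   = 1 * GIGA           # bits per second of bandwidth

    # A machine sustaining one teraflops for a full day performs:
    print(f"{teraflops * 86_400:.2e} operations")   # 8.64e+16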

In 2000, the $36 million Terascale Computing System award to PSC supported the deployment of a computer (named LeMieux) capable of six trillion operations per second. When LeMieux went online in 2001, it was the most powerful U.S. system committed to general academic research. Five years later, it remains a highly productive system.

In 2001, NSF awarded $45 million to NCSA, SDSC, Argonne National Laboratory and the Center for Advanced Computing Research (CACR) at the California Institute of Technology to establish a Distributed Terascale Facility (DTF). Aptly named the TeraGrid, this multi-year effort aimed to build and deploy the world's largest, fastest and most comprehensive distributed infrastructure for general scientific research.

The initial TeraGrid specifications called for computers capable of 11.6 teraflops, disk-storage systems holding more than 450 terabytes of data, visualization systems and data collections, all integrated via grid middleware and linked through a 40-gigabit-per-second optical network.
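
A back-of-the-envelope calculation, shown below, suggests why a 40-gigabit-per-second network mattered at this scale. It is an idealized estimate that assumes the link sustains its full rated speed with no protocol overhead.

    # Back-of-the-envelope check on the initial TeraGrid specs.
    # Idealized: assumes the 40 Gb/s optical network sustains its
    # full rated speed with no protocol overhead.
    storage_bytes   = 450 * 10**12      # 450 terabytes of disk
    link_bits_per_s = 40 * 10**9        # 40 gigabits per second

    seconds = storage_bytes * 8 / link_bits_per_s
    print(f"{seconds:,.0f} s, about {seconds / 3600:.0f} hours")
    # 90,000 s, roughly 25 hours to move the whole store once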

In 2002, NSF made a $35 million Extensible Terascale Facility (ETF) award to expand the initial TeraGrid to include PSC and integrate PSC's LeMieux system. Resources in the ETF gave the national research community more than 20 teraflops of computing power distributed among the five sites and nearly one petabyte (one quadrillion bytes) of disk storage capacity.

To further expand the TeraGrid's capabilities, NSF made three Terascale Extensions awards totaling $10 million in 2003. The new awards funded high-speed networking connections to link the TeraGrid with resources at Indiana and Purdue Universities, Oak Ridge National Laboratory, and the Texas Advanced Computing Center at the University of Texas at Austin. Through these awards, the TeraGrid put neutron-scattering instruments, large data collections and other unique resources, as well as additional computing and visualization resources, within reach of the nation's research and education community.

In 2004, as a culmination of the DTF and ETF programs, the TeraGrid entered full production mode, providing coordinated, comprehensive services for general U.S. academic research.

Illustration of the TeraGrid Extensible Terascale Facility.
Credit: Nicolle Rager-Fuller, National Science Foundation

The TeraGrid: 2005-2010. In August 2005, NSF's newly created Office of Cyberinfrastructure extended support for the TeraGrid with a $150 million set of awards for operation, user support and enhancement of the TeraGrid facility over the next five years. Using high-performance network connections, the TeraGrid now integrates high-performance computers, data resources and tools, and high-end experimental facilities around the country. In early 2006, these integrated resources included more than 102 teraflops of computing capability and more than 15 petabytes (quadrillions of bytes) of online and archival data storage with rapid access and retrieval over high-performance networks. Through the TeraGrid, researchers can access more than 100 discipline-specific databases. With this combination of resources, the TeraGrid is the world's largest, most comprehensive distributed cyberinfrastructure for open scientific research.

Related Websites:
Lax Report (1982): http://www.pnl.gov/scales/archives.stm
Hayes Report (1995): http://www.nsf.gov/pubsys/ods/getpub.cfm?nsf9646
TeraGrid: http://teragrid.org/
NSF Office of Cyberinfrastructure: http://www.nsf.gov/dir/index.jsp?org=OCI
National Center for Atmospheric Research: http://www.ncar.ucar.edu/
Grid Physics Network: http://www.griphyn.org/
National Virtual Observatory: http://www.us-vo.org/

Cyberinfrastructure: A Special Report