Hearing on Supercomputing: Is the U.S. on the Right Path?
Dr. Peter A. Freeman
Good morning, Mr. Chairman and members of the Committee. I am Dr. Peter Freeman, Assistant Director of the National Science Foundation (NSF) for Computer and Information Science and Engineering (CISE).
I am delighted to have the opportunity to testify before you this morning and to discuss the topic "Supercomputing: Is the U.S. on the Right Path?" Supercomputers are an extremely important national resource in many sectors of our society, and for decades these resources have enabled astounding scientific breakthroughs. Supercomputing is a field that NSF has championed and supported for many years, and one in which we will continue to lead the way.
There seems to be some confusion in the scientific community about NSF's commitment to High-End Computing (HEC), the current term for "supercomputing." I want to clear the air this morning. Before I briefly summarize my written testimony, let me state unequivocally that NSF remains absolutely committed to providing researchers with the most advanced computing equipment available and to sponsoring research that will help create future generations of computational infrastructure, including supercomputers.
At the same time, we are committed to realizing the compelling vision described in the report of the NSF Advisory Panel on Cyberinfrastructure, commonly known as the Atkins Committee - that "a new age has dawned in scientific and engineering research, pushed by continuing progress in computing, information and communications technology." This cyberinfrastructure includes, and I quote, "not only high-performance computational services, but also integrated services for knowledge management, observation and measurement, visualization and collaboration."
The scientific opportunities that lie before us in many fields can only be realized with such a cyberinfrastructure. Just as supercomputing promised to revolutionize the conduct of science and engineering research several decades ago, and we are seeing the results of that promise today, so does an advanced cyberinfrastructure promise to revolutionize the conduct of science and engineering research and education in the 21st century. The opportunities that a balanced, state-of-the-art cyberinfrastructure promises must be exploited for the benefit of all of our citizens - for their continuing health, security, education, and wealth.
To be clear, we are committed to what the Atkins Report and many others in the community, both formally and informally, are saying: That NSF, in partnership with other public and private organizations, must make investments in the creation, deployment and application of cyberinfrastructure in ways that radically empower all science and engineering research and allied education... thereby empowering what the Atkins Report defines as "a revolution."
Cyberinfrastructure - with HEC as an essential component - can bring about this true revolution in science and engineering. It promises great advances for all of the areas of our society served by science and engineering, but it will be realized only if we stay focused on the value of all components of cyberinfrastructure.
Supercomputers have been one of the main drivers of this revolution up to now because of the continuing evolution of computing technology. Computers were initially developed to handle pressing numerical computations, and such computations will continue to be extremely important. In recent years, however, many scientific advances have been enabled not only by computers but also by the great expansion in the capacity of computer storage devices and communication networks, coupled now with rapidly improving sensors.
There are now many examples of revolutionary scientific advances that can only be brought about by utilizing other components of cyberinfrastructure in combination with HEC. This necessary convergence means that we must maintain a broad discourse about cyberinfrastructure. As we set about building and deploying an advanced cyberinfrastructure, we will ensure that the HEC portion remains an extremely important component in it.
History of NSF Support for Supercomputing
NSF has supported high-performance computation in the form of centers since the establishment of the first Academic Computing Centers in the 1960s. As computers became increasingly powerful, the most capable machines came to be designated "supercomputers." In the mid-1980s, NSF created the first supercomputing centers for the open science community. A decade later, support was provided through the Partnerships for Advanced Computational Infrastructure (PACI) program. In 2000, a parallel activity now known as the Extensible Terascale Facility was initiated. (More detail on this history is available on request.) Beginning in FY 2005, support will be provided through cyberinfrastructure programs currently under development.
Over time, technological innovation has led to a move away from the term "supercomputing centers," since it inadequately describes the full measure and promise of what is being done at such centers, and what is at stake. The idea of the "supercomputer" lives on as a legacy; a more accurate title for this kind of infrastructure, however, would be High-performance Information Technology (HIT) Centers.
NSF currently supports three major HIT Centers: the San Diego Supercomputer Center (SDSC), the National Center for Supercomputing Applications (NCSA), and the Pittsburgh Supercomputing Center (PSC). These centers have participated in this enterprise since the days of the first supercomputer centers in the mid-1980s. They have evolved steadily over nearly two decades and now represent special centers of talent and capability for the science community broadly and the nation at large.
In the last six years, NSF's support for these HIT Centers has been provided predominantly through the PACI program. More recently, a new consortium of HIT Centers has emerged around the new grid-enabled concept of the Extensible Terascale Facility (ETF).
In order to describe NSF's current activities in the area of supercomputing, I'd like to respond directly to the following questions formulated by Chairman Boehlert and the Committee on Science. (Note: Italicized and numbered statements are drawn verbatim from Chairman Boehlert's letter of invitation.)
1. Some researchers within the computer science community have suggested that the NSF may be reducing its commitment to the supercomputer centers. Is this the case?

NSF is most definitely not reducing its commitment to supercomputing.

For several decades the agency has invested millions of taxpayer dollars in the development and deployment of a high-end computational infrastructure, and these resources are made widely available to the science and engineering research and education community. Leading-edge supercomputing capabilities are an essential component of the cyberinfrastructure and, in line with the recommendations of the Advisory Panel on Cyberinfrastructure, we are committed to expanding such capabilities in the future.

- In its report, the Panel recommended "a 2-year extension of the current PACI co-operative agreements". The National Science Board approved the second 1-year extension of the PACI cooperative agreements at its May 2003 meeting.
- The Panel also recommended that "...the new separately peer-reviewed enabling and application infrastructure would begin in 2004 or 2005, after the 2-year extensions of the current cooperative agreements." The President requested $20 million for NSF in FY 2004 for activities focused on the development of cyberinfrastructure, including enabling infrastructure (also known as enabling technology). This increased investment in enabling technology will strengthen the agency's portfolio of existing awards, and, as the Panel recommended, awards will be identified through merit-review competition.
- Finally, the Panel's report recommends that "After these two years, until the end of the original 10-year lifetime of the PACI program, the panel believes that" NCSA, SDSC and PSC "should continue to be assured of stable, protected funding to provide the highest-end computing resources."
1. (cont'd) To what extent does the focus on grid computing represent a move away from providing researchers with access to the most advanced computing equipment?
The term "grid computing" is ambiguous and often misused. It is sometimes used to signify a single computational facility composed of widely separated elements (nodes), interconnected by high-performance networks, that operate in such a way that the user sees a single "computer." This is a special case of the more general concept: a set of widely separated computational resources of different types that can be accessed and utilized as their particular capabilities are needed.
While still experimental at this stage, grid computing promises to become the dominant modality of high-performance IT (and, eventually, of commodity computing). One need only think about the World Wide Web (WWW) to understand its compelling importance. On the WWW, one is already able, from a single terminal (typically a desktop PC), to access many different databases, on-line services, and even computational engines. Imagine now that the nodes accessible in this way are high-end computers with all of their computational power; or massive data stores that can be manipulated and analyzed; or sophisticated scientific instruments that acquire data; or any of a dozen other foreseen and unforeseen tools. With this vision in mind, one can understand the promise of grid computing.
NSF's recent investments in grid computing, through the ETF, should not be seen as a reduction in the agency's commitment to HEC. Rather, they underscore the importance of HEC integrated into a broad cyberinfrastructure. Indeed, the first increment of ETF funding was for a HEC machine that, at the time it was installed at the PSC, was the most powerful open-access machine in the research world (and, three years later, it still ranks ninth on the Top500 list). This machine is one of the main resources on the ETF grid. While NSF may not always have the world's fastest machine, we will continue to provide a range of supercomputing systems that serve the ever-increasing and changing needs of science and engineering.
At the same time, the ETF investment in grid computing is not the ONLY investment the agency has recently made in HEC. In fact, HEC upgrades at NCSA and SDSC during FY 2003 and 2004 are expected to fund the acquisition of an additional 20 Teraflops of HEC capability.
NSF's unwavering commitment is to continuously advance the frontier, and grid computing is widely acknowledged to represent the next frontier in computing. In short, the ETF represents our commitment to innovation at the computing frontier.
2. What are the National Science Foundation's (NSF's) plans for funding the supercomputer centers beyond fiscal year 2004? To what extent will you be guided by the recommendation of the NSF Advisory Panel on Cyberinfrastructure to maintain the Partnerships for Advanced Computational Infrastructure, which currently support the supercomputer centers?
NSF's plans for funding supercomputer centers beyond FY 2004 are very much guided by the recommendations of the NSF Advisory Panel on Cyberinfrastructure as described below.
NSF will also increase its investments in applications infrastructure (also known as applications technology) by drawing upon interdisciplinary funds available in the FY 2004 ITR priority area activity. Again, the most promising proposals will be identified using NSF's rigorous merit-review process.
In planning how to provide such support while positioning the community to realize the promise of cyberinfrastructure, NSF has held a series of workshops and town hall meetings over the past two months to gather community input. Informed by that input, support for SDSC, NCSA and PSC will be provided through new awards effective at the beginning of FY 2005.
NSF has also committed to providing support for the management and operations of the Extensible Terascale Facility through FY 2009; this includes support for SDSC, NCSA and PSC, which are partners in the ETF.
3. To what extent will NSF be guided by the recommendations of the High-End Computing Revitalization Task Force? How will NSF contribute to the Office of Science and Technology Policy plan to revitalize high-end computing?
NSF has been an active participant in all four subgroups of the current OSTP planning activity, the High-End Computing Revitalization Task Force (HEC-RTF). The final report is not yet finished, but we continue to coordinate closely at all levels with OSTP and its sister agencies to make sure that the recommendations of the Task Force are carried forward in a productive and effective manner. NSF's traditional role as a leader in developing new HEC computational mechanisms and applications, and in ensuring that appropriate educational programs are in place to train scientists and engineers to use them, must be integral to any effort to increase HEC capabilities for the Nation.
We are now planning our request for the FY 2005 budget, and cyberinfrastructure, including HEC, is likely to be a major component in it. We intend to continue to have more than one high-end machine for the NSF community to access and to invest in needed HEC capabilities as noted above.
4. To what extent are the advanced computational needs of the scientific community and of the private sector diverging? What is the impact of any such divergence on the advanced computing programs at NSF?
I don't believe that the advanced computational "needs" of the science community and the private sector are diverging. In fact, I believe that the growing scientific use of massive amounts of data parallels what some sectors of industry already know. In terms of HEC, it is clear that both communities need significantly faster machines to address those problems that can only be solved by massive computations.
For several decades, NSF has encouraged its academic partners in supercomputing, including NCSA, SDSC and PSC, to develop strong relationships with industry. The initial emphasis was on supercomputing itself; many of the industrial partners at the centers learned about supercomputing in this way and then started their own internal supercomputer centers.
When Mosaic (the precursor to Netscape) was developed at NCSA, industry was able to rapidly learn about and exploit this new, revolutionary technology. As noted above, grid computing, which is being pioneered at NCSA, SDSC, PSC and their partners today, is already being picked up by industry as a promising approach to tomorrow's computing problems.
As researchers push the boundaries in their work, the results (and means of obtaining those results) are often quickly picked up by industry. Conversely, in some areas industry has had to tackle problems (such as data mining) first and now those techniques are becoming useful in scientific research. We intend to continue the close collaboration that has existed for many years.
Mr. Chairman, I hope that this testimony dispels any doubt about NSF's commitment to HEC and the HIT Centers that today provide significant value to the science and engineering community.
I hope that I have also been able to articulate that the cyberinfrastructure vision eloquently described in the Atkins report includes HEC and other advanced IT components. This cyberinfrastructure will enable a true revolution in science and engineering research and education that can bring unimagined benefits to our society.
NSF recognizes the importance of HEC to the advanced scientific computing infrastructure for the advancement of science and knowledge. We are committed to continuing investments in HEC and to developing new resources that will ensure that the United States maintains the best advanced computing facilities in the world. We look forward to working with you to ensure that these goals are fulfilled.
Thank you for the opportunity to appear before you this morning.