Title : NSB 93-205 -- NSF Blue Ribbon Panel on High Performance Computing
Type : Report
NSF Org: CISE
Date : October 19, 1993
File : nsb93205

From Desktop To Teraflop: Exploiting the U.S. Lead in High Performance Computing

NSF Blue Ribbon Panel on High Performance Computing
August 1993

Lewis Branscomb (Chairman)
Theodore Belytschko
Peter Bridenbaugh
Teresa Chay
Jeff Dozier
Gary S. Grest
Edward F. Hayes
Barry Honig
Neal Lane (resigned from Panel July 1993)
William A. Lester, Jr.
Gregory J. McRae
James A. Sethian
Burton Smith
Mary Vernon

"It is easier to invent the future than to predict it" - Alan Kay

Dedication

This report is dedicated to one of the nation's most distinguished computer scientists, a builder of important academic institutions, and a devoted and effective public servant: Professor Nico Habermann. Dr. Habermann took responsibility in organizing this Panel's work and saw it through to completion, but passed away just a few days before it was presented to the National Science Board. The members of the panel deeply feel the loss of his creativity, wisdom, and friendship.

EXECUTIVE SUMMARY

An Introductory Remark: Many reports are prepared for the National Science Board and the National Science Foundation that make an eloquent case for more resources for one discipline or another. This is not such a report. This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also giving a shot in the arm to the entire American economy as business firms also learn to exploit these new capabilities. The way much of science and engineering is practiced will be transformed, if our recommendations are implemented.

The National Science Board can take pride in the Foundation's accomplishments in the decade since it implemented the recommendations of the Peter Lax Report on high performance computing (HPC).
The Foundation's High Performance Computing Centers continue to play a central role in this successful strategy, creating an enthusiastic and demanding set of sophisticated users, who have acquired the specialized computational skills required to use the fast advancing but still immature high performance computing technology. Stimulated by this growing user community, the HPC industry finds itself in a state of excitement and transition. The very success of the NSF program, together with those of sister agencies, has given rise to a growing variety of new experimental computing environments, from massively parallel systems to networks of coupled workstations, that could, with the right research investments, produce entirely new levels of computing power, economy, and usability. The U.S. enjoys a substantial lead in computational science and in the emerging technology; it is urgent that the NSF capitalize on this lead, which not only offers scientific preeminence but also the industrial lead in a growing world market. The vision of the rapid advances in both science and technology that the new generation of supercomputers could make possible has been shown to be realistic. This very success, measured in terms of new discoveries, the thousands of researchers and engineers who have gained experience in HPC, and the extraordinary technical progress in realizing new computing environments, creates its own challenges. We invite the Board to consider four such challenges: Challenge 1: How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists and engineers? 
These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and implementation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and computational science, will be required.

Challenge 2: How can NSF provide scalable access to a pyramid of computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems? What balance among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance should NSF anticipate and encourage?

Challenge 3: The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines. This calls for expanded education and training, and participation by state-based and other HPC institutions.

Challenge 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.? What role should NSF play within the scope of the nationally coordinated HPCC program? What relationships should NSF's activities in HPC have to the activities of other federal agencies?

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities.
The cost estimates are only approximate, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified but are compatible with stated national plans, both in absolute amount and in their distribution. RECOMMENDATIONS: We have four sets of interdependent recommendations. The first implements a balanced pyramid of computing environments (see Figure A following this Summary). Each element in the pyramid supports the others; whatever resources are applied to the whole, the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies in the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for delivery of HPC capabilities, and consists itself of a pyramid (see Figure B following this Summary), of which the NSF Centers are an important part. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings, who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. At the next level are the NSF HPC Centers, which must continue to be providers of shared high capability computing systems and to provide aggregations of specialized capability for all aspects of use and advance of high performance computing. At the apex is the national teraflop-class facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC. A. 
CENTRAL GOAL FOR NSF HPC POLICY Recommendation A-1: The National Science Board should take the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the dynamically evolving pyramid of high performance computing capability for all sectors of the whole nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also includes a rapid expansion of capability in business and industry to use HPC profitably, and many operational uses of HPC in commercial and military activities. VISION OF THE HPC PYRAMID Recommendation A-2: At the apex of the pyramid is the need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware. A reasonable goal would be the design, development, and realization of a national teraflop-class capability, subject to the successful development of software and computational tools for such a large machine (recommendation B-1). NSF should initiate, through OSTP, an interagency plan to make this investment, anticipating multi-agency funding and usage. Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines. These mid-sized machines are the underfunded element of the pyramid today -- about 10% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed for both demanding science and engineering problems that do not require the very maximum in computing capacity, and for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with massively parallel processor architectures. 
Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) providing scientific and engineering workstations to its 20,000 principal investigators. Within 4 or 5 years workstations delivering up to 400 megaflops costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate. Recommendation A-5: We recommend that the NSF expand its New Technologies program to support expanded testing of the new parallel configurations for HPC applications. For example, the use of Gigabit local area networks to link workstations may meet a significant segment of mid-range HPC science and engineering applications. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient. B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in the Directorate for Computer and Information Science and Engineering, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC capability. The Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work, and recommendation A-3 provides the hardware support required for initial development of prototypes. Recommendation B-2: A significant barrier to rapid progress in HPC application lies in the formulation of the computational strategy for solving a scientific or engineering problem. 
In response to Challenge 1, the NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines. BALANCING THE PYRAMID OF HPC ACCESS Recommendation B-3: We recommend NSF set up a task force to develop a way to ameliorate the imbalance in the HPC "pyramid" -- the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources. C. THE NSF HPC CENTERS Recommendation C-1 : The Centers should be retained and their missions should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs -- HPC Centers, Engineering Research Centers, and Science & Technology Centers devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics -- all of which are essential elements of the HPC effort needed for the next decade. Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than was possible in the early years of the HPC initiative. The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. 
The division of the support budget into one component committed to the centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" a successful, changing venture, of which the Centers remain an important part. Furthermore, we recommend against re-competition of the four Centers at this time, favoring periodic performance evaluation and competition for some elements of their activities, both among Centers and when appropriate with other HPC Centers such as those operated by states (see Recommendation D-1).

Recommendation C-3: The mission of the Centers is to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in usability and economy of HPC and to diffuse HPC capability throughout the technical community, including industry. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers. The following additional components of the Center missions should be affirmed:

* Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.
* Fostering interdisciplinary collaboration -- across sciences and between sciences and computational science and computer science -- as in the Grand Challenge programs.
* Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with: computer and computational scientists, disciplinary scientists exploiting HPC resources, the HPC industry, and business firms exploring expanded use of HPC.
* Training and education, from post-docs and faculty specialists to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools and community colleges.

ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS

Recommendation C-4: The NSF should continue to monitor the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized. NSF should continue to provide HPC resources to the research community through allocation committees that competitively evaluate proposals for use of Center resources.

EDUCATION AND TRAINING

Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily by NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role.

D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH THE STATES

Recommendation D-1: We recommend that NSF urge OSTP to establish an advisory committee representing the states, HPC users, NSF Centers, computer manufacturers, computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing.
Figure A
PYRAMID OF HIGH PERFORMANCE COMPUTING ENVIRONMENTS
(levels listed from apex to base)

  Teraflop class
  Center supercomputers
  Mid-range parallel processors; Networked work stations
  High performance workstations

Figure B
PYRAMID OF HIGH PERFORMANCE COMPUTING INSTITUTIONS
(levels listed from apex to base)

  National Teraflop facility
  NSF HPC Centers
  Other agency, State Centers
  Departments, institutes, laboratories
  Subject specific, Computer Science, Computational Science and Engineering Groups
  Individual investigators and small groups

INTRODUCTION AND BACKGROUND

A revolution is underway in the practice of science and engineering, arising from advances in computational science and new models for scientific phenomena, and made possible by advances in computer science and technology. The importance of this revolution is not yet fully appreciated because of the limited fraction of the technical community that has developed the skills required and has access to high performance computational resources. These skill and access barriers can be dramatically lowered, and if they are, a new level of creativity and progress in science and engineering may be realized which will be quite different from that known in the past. This report is about that opportunity for all of science and engineering; it is not about the needs of one or two specialized disciplines.

A little over a decade ago, the National Science Board convened a panel chaired by Prof. Peter Lax to explore what NSF should do to exploit the potential for science and industry of the rapid advances in high performance computing./1 The actions taken by the NSF with the encouragement of the Board to implement the "Large Scale Computing in Science and Engineering" Report of 1982 have helped computing foster a revolution in science and engineering research and practice, in academic institutions and to a lesser extent in industrial applications.
At the time, centralized facilities were the only way to provide access to high performance computing, which compelled the Lax panel to recommend the establishment of NSF Supercomputer Centers interconnected by a high speed network. The new revolution is characterized both by advances in the power of supercomputers and by the diffusion throughout the nation of access to and experience with using high performance computing./2 This success has opened up a vast set of new research and applications problems amenable to solution through high levels of computational power and better computational tools. ---------- 1/Report of the Panel on Large Scale Computing in Science and Engineering, Peter Lax, chairman, commissioned by the National Science Board in cooperation with the U.S. Department of Defense, Department of Energy, and the National Aeronautics and Space Administration, December 26, 1982. 2/With every new generation of computing machines, the capability associated with "high performance computing" changes. High performance computing (HPC) may be defined as "a computation and communications capability that allows individuals and groups to extend their ability to solve research, design, and modelling problems substantially beyond that available to them before." This definition recognizes that HPC is a relative and changing concept. For the PC user a scientific workstation is high performance computing. For the technical people with specialized skill in computational science and access to high performance facilities, a reasonable level for 1992-1993 might be 1 Gflop for a vector machine and 2 Gflops for a MPP system. 
The key features of the new capabilities include:

* The power of the big, multiprocessing vector supercomputers, today's workhorses of supercomputing, has increased by a factor of 100 to 200 since the Lax Report./3

----------
3/As noted in Appendix C, the clock speed of a single vector processor has only increased by a factor of 5 to 6 since 1976, but a 16-way Cray C-90 with one additional vector pipe multiplies the effective speed by an estimated factor of a hundred or more.

* An exciting array of massively parallel processors (MPP) has appeared in the market, offering three possibilities: an acceleration in the rate of advance of peak processing power, an improvement in the ratio of performance to cost, and the option to grow the power of an installation incrementally as the need arises./4

----------
4/The promise (not yet realized) of massively parallel systems is a much higher degree of installed capacity expandability with minimal disruption to the user's programming.

* Switched networks based on high speed digital communications are extending access to major computational facilities, permitting the dynamic redeployment of computing power to suit the users' needs, and improving connectivity among collaborating users.
* Technical progress in computer science and microelectronics has transformed yesterday's supercomputers into today's emerging desktop workstations. These workstations offer more flexible tradeoffs between ease of access and inherent computing power and can be coupled to the largest supercomputers over a national network, used in locally-networked clusters, or used as stand-alone processors.
* Advances in computational mathematics, algorithmic modeling, and software, along with new computer architectures, are solving some of the most intractable but important scientific, technical, and economic problems facing our society.
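The factor of 100 to 200 cited in the first bullet and in footnote 3 above can be checked with rough arithmetic. What follows is only a minimal sketch: it assumes the three contributions simply multiply, and it reads "one additional vector pipe" as a doubling of per-processor throughput. That factor of 2 and the function name are illustrative assumptions, not figures taken from Appendix C.

```python
def effective_speedup(clock_gain, processors, pipe_gain):
    """Multiply independent speedup contributions.

    Idealized: ignores memory contention, synchronization, and any
    other overhead that would reduce the gain in practice.
    """
    return clock_gain * processors * pipe_gain


# Clock-speed gain of 5 to 6 since 1976 (footnote 3), a 16-way Cray C-90,
# and an ASSUMED 2x gain from one additional vector pipe.
low = effective_speedup(clock_gain=5, processors=16, pipe_gain=2)
high = effective_speedup(clock_gain=6, processors=16, pipe_gain=2)

print(f"Estimated overall gain: {low}x to {high}x")
```

Under these assumptions the estimate falls inside the 100 to 200 range quoted in the text, which suggests that most of the gain since the Lax Report came from multiprocessing rather than from faster clocks.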
To address these changes, the National Science Board charged this panel with taking a fresh look at the current situation and new directions that might be required. (See Appendix A for institutional identification of the panel membership and Appendix B for historical background leading to the present study and the Charge to the Panel.) To provide both direction and potential to exploit these advances, a leadership role for the NSF continues to be required. The goal of this report is to suggest how NSF should evolve its role in high performance computing. Our belief that NSF can and should continue to exert influence in these fields is based in part on its past successes achieved through the NSF Program in High Performance Computing and Communications.

Achievements Since the Lax Report

In the past 10 years, the NSF Program in High Performance Computing and Communications has:

* Facilitated many new scientific discoveries and new industrial processes. In Appendix E of this report several panel members describe examples of those accomplishments and suggest their personal visions for what may be even more dramatic progress in the future.
* Supported fundamental work in computer science and engineering which has led to advances in architectures, tools, and algorithms for computational science.
* Initiated collaborations with many companies to help them realize the economic and technological benefits of high performance computing. Caterpillar Inc. uses supercomputing to model diesel engines in an attempt to reduce emissions. Dow Chemical Company simulates and visualizes fluid flow in chemical processes to ensure complete mixing. USX has turned to supercomputing to improve the hot rolling process-control systems used in steel manufacturing. Solar Turbine, Inc.
applies computational finite-element methods to the design of very complex mechanical systems.
* Opened up supercomputer access to a wide range of researchers and industrial scientists and engineers. This was one of the key recommendations of the Lax Report. The establishment of the four NSF Supercomputer Centers (in addition to NCAR) has been extraordinarily successful. By providing network access, through the NSFNET and Internet linkages, NSF has put these computing resources at the fingertips of scientists, engineers, mathematicians and other professionals all over the nation. Users seldom need to go personally to these Centers; in fact, the distribution of computational cycles by the four NSF Supercomputer Centers shows surprisingly little geographic bias (Figure 1 in Appendix D shows users widely distributed across the United States). This extension of compute power, away from dedicated, on-site facilities and towards a seamless national computing environment, has been instrumental in creating the conditions required for advances on a broad front in science, engineering, and the tools of computational science.
* Educated literally thousands of scientists, engineers and students, as well as a new generation of researchers who now use computational science equally with theory and experiment. At the time of the Lax Report access to the most advanced facilities was restricted to a relatively small set of users. Furthermore, supercomputing was regarded by many scientists as either an inaccessible tool or an inelegantly brute force approach to science. The NSF program successfully inoculated virtually all of the disciplines with the realization that HPC is both a powerful and a practical tool for many purposes.
These NSF initiatives have not only pushed the technology and computational science ahead in sophistication and power, they have helped bring high performance computing to a large fraction of the technical community. There has been a 5-fold increase in the number of NSF funded scientists using HPC and a 5-fold increase in the ratio of graduate students to faculty using HPC through the NSF Supercomputer Centers. (See Figure 2 of Appendix D.)
* Provided the HPC industry a committed, enthusiastic, and dedicated class of expert users who share their experience and ideas with vendors, accelerating the evolutionary improvement in the technology and its software. One of the problems in the migration of new technologies from experimental environments to production modes is the inherent risk in committing substantial resources towards converting existing codes and developing software tools. The NSF Supercomputer Centers have provided a proving ground for these new technologies; various industrial players have entered into partnerships with the Centers aimed at accelerating this migration while maintaining solid and reliable underpinnings.
* Encouraged the Supercomputer Centers to leverage their relationship with HPC producers to reduce the cost of bringing innovation to the scientific and engineering communities. In recognition of Center activity in improving early versions of hardware and software for high performance computing systems, the computer industry has provided equipment at favorable prices and important technical support. This has allowed researchers earlier and more useful access to HPC facilities than might have been the case under commercial terms.
* Joined into successful partnerships with other agencies to make coordinated contributions to the U.S. capability in HPC. A decade ago the United States enjoyed a world-wide commercial lead in vector systems.
In part as the result of more recent development and procurement actions of the Advanced Research Projects Agency, the Department of Energy, and the National Science Foundation, the U.S. now has the dominant lead in providing new Massively Parallel Processing (MPP) systems./5 As an example, the NSF has enabled NSF Supercomputer Center acquisitions of scalable parallel systems first developed under seed money provided by ARPA, and thus has been instrumental in leveraging ARPA projects into the mainstream./6 (Figure 3 of Appendix D shows data on the uptake of advanced computing by sector across the world.)

----------
5/Massively parallel computers are constructed from large numbers of separate processors linked by high speed communications providing access to each other and to shared I/O devices and/or computer memory. There are many different architectural forms of MPP machines, but they have in common economies of scale from the use of microprocessors produced at high volumes and the ability to combine them at many levels of aggregation. The challenge in using such machines is to formulate the problem so that it can be decomposed and run efficiently on most or all of the processors concurrently. Some scientific problems lend themselves to parallel computation much more easily than others, suggesting that improved utility of MPP machines will not be realized in all fields of science at once.
6/Scalable parallel machines are those in which the number of processor nodes can be expanded over a wide range without substantial changes in either the shared hardware or the application interfaces of the operating system.

The Lax Report

All of these accomplishments have, in large part, arisen from the response by NSF to the recommendations of the 1982 Lax report "Large Scale Computing in Science and Engineering". These recommendations included:

* Increase access to regularly upgraded supercomputing facilities via high bandwidth networks.
* Increase research in computational mathematics, software, and algorithms.
* Train people in scientific computing.
* Invest in research on new supercomputer systems.

For several reasons, NSF's investment in computational research and training has been a startling success. First, there has been a widespread acceptance of computational science as a vital component and tool in scientific and technological understanding. Second, there have been revolutionary advances in computing technology in the past decade. And third, the demonstrated ability to solve key critical problems has advanced the progress of mathematics, science and engineering in many important ways, and has created great demand for additional HPC resources.

The New Opportunities in Science and in Industry

As discussed in detail in the Appendix E essays, the prospects are for dramatic progress in science and engineering and for rapid adoption of computational science in industry. The next major HPC revolution may well be in industry, which is still seriously under-utilizing HPC (with some exceptions such as aerospace, automotive, and microelectronics). The success of the chemical industry in designing and simulating pilot plants, of the aircraft industry in simulating wind tunnels and performing dynamic design evaluation, and of the electronics industry in designing integrated circuits and modelling the performance of computers and networks suggests the scale of available opportunities.

The most important requirements are (a) improving the usability and efficiency of high performance machines, and (b) training in HPC for people going into industry. The Supercomputer Centers have demonstrated they can introduce the commercial sector to HPC at little cost, and with high potential benefits to the economy (productivity of industry and stimulation of markets for U.S. HPC vendors).
Success in stimulating HPC usage in industry will also accelerate the need for HPC education and technology, thus exploiting the benefits of collaboration with universities and vendors. The Centers' role can be a catalytic one, but often rises to the level of a true collaborative partnership with industry, to the mutual advantage of the firm and the NSF Centers.

As industrial uses of HPC grow, the scientists, mathematicians, and engineers benefit from the falling costs and rising usability of the new equipment. In addition the technological uses of HPC spur new and interesting problems in science. The following chart indicates the increasing importance of advanced computing in industry.

Cray Research Inc. supercomputer sales
_________________________________________________________________
Era              Percent to     Percent to     Percent to
                 government     industry       universities
_________________________________________________________________
Early 1980s          70             25              5
Late 1980s           60             25             15
Today                40             40             20
_________________________________________________________________

The New Technology

Most HPC production work being done today uses big vector machines in single processor (or loosely coupled multiprocessor) mode. Vectorizing Fortran compilers and other software tools are well tested and many people have been trained in their use. These big shared memory machines will continue to be the mainstay of high performance computing, at least for the next 5 years or so, and perhaps beyond if the promise of massively parallel supercomputing is delayed longer than many expect.

New desktop computers have made extraordinary gains in cost-performance (driven by competition in commodity microprocessor production).
Justin Rattner of Intel estimated that in 1996 microprocessors with clock speeds of 200 MHz may power an 800 Mflops peak speed workstation./7 He, and others from the industry, predicted the convergence of the clock speeds of microprocessor chips and the large vector machines such as the Cray C90, perhaps as soon as 1995. They held out the likelihood that in 1997 microprocessors may be available at 1 gigaflop; a desktop PC might be available with this speed for $10,000 or less. Mid-range workstations will also show great growth in capacity; today one can purchase a mid-range workstation with a clock speed of 200 MHz for an entry price of $40,000 to $50,000.

----------
7/The instruction execution speeds of scientific computers are generally reckoned in the number of floating point instructions that can be executed in one second. Thus a 1 Megaflop machine executes 1 million floating point instructions per second, a Gigaflop would be one billion instructions per second, and a Teraflop 10^12 floating point instructions per second. Since different computer architectures may have quite different instruction sets one "flop" may not be the same as another, either in application power or in the number of machine cycles required. To avoid such difficulties, those who want to compare machines of different architecture generally use a benchmark suite of test cases to measure overall performance on each machine.

Thus a technical transition is underway from the world in which uniprocessor supercomputers were distinguished from desktop machines by having much faster cycle times, to a world in which cycle times converge and the highest levels of computer power will be delivered through parallelism, memory size and bandwidth, and I/O speed. The widespread availability of scientific workstations will accelerate the introduction of more scientists and engineers to high performance computing, resulting in a further acceleration of the need for higher performance machines.
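The peak-speed figures quoted above follow from multiplying a clock rate by the number of floating point operations completed per cycle. The short Python sketch below illustrates that arithmetic together with the unit prefixes defined in footnote 7. The figure of four floating point operations per cycle is an assumption inferred from the 200 MHz and 800 Mflops numbers, not something stated in the text, and the function name is purely illustrative.

```python
# Unit prefixes from footnote 7, in floating point operations per second.
MEGAFLOP = 1e6
GIGAFLOP = 1e9
TERAFLOP = 1e12


def peak_flops(clock_hz, flops_per_cycle):
    """Theoretical peak speed: clock rate times operations finished per cycle."""
    return clock_hz * flops_per_cycle


# Assumption: 4 operations per cycle, inferred from 200 MHz -> 800 Mflops.
workstation = peak_flops(200e6, 4)

print(workstation / MEGAFLOP)   # peak speed expressed in Mflops
print(TERAFLOP / workstation)   # number of such processors for 1 Tflop peak
```

As footnote 7 cautions, a peak figure of this kind says little about delivered performance on real applications, which is why benchmark suites are used for comparisons across architectures.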
Early exploration of message-passing distributed operating systems gives promise of loosely-coupled arrays of workstations being used to process large problems in the background and when the workstations are unused at night, as well as coupling the workstations (on which problems are initially designed and tested) to the supercomputers located at remote facilities.

Of course, the faster microprocessors also make possible new MPP machines of ever increasing peak processing speed. MPP is catching on fast, as researchers with sufficient expertise (and diligence) in computational science are solving a growing number of applications that lend themselves to highly parallel architectures. In some cases those investigators are realizing a ratio of sustained to theoretical peak performance approaching that achieved by vector machines, with significant cost-performance advantages. Efficient use of MPP on the broad range of scientific and engineering problems is still beyond the reach of most investigators, however, because of the expertise and effort required. Thus the first speculative phase of MPP HPC is coming to an end, but its ultimate potential is still uncertain and largely unrealized.

Limiting progress in all three of these technologies is a set of architecture and software issues that are discussed below in Recommendations B. Principal among them is the evolution of a programming model that can allow portability of applications software across architectures. These technical issues are discussed at greater length in Appendix C.

FOUR CHALLENGES FOR NSF

High performance computing is changing very fast, and NSF policy must chase a moving target. For that reason, the strategy adopted must be agile and flexible in order to capitalize on past investments and adapt to the emerging opportunities. The Board and the Foundation face four central challenges, on which we will make specific recommendations for policy and action.
These challenges are:

* Removing barriers to the rapid evolution of HPC capability
* Providing scalable access to all levels of HPC capability
* Finding the right incentives to promote access to all three levels of the computational power pyramid
* Creating NSF's intellectual and management leadership for the future of high performance computing in the U.S.

CHALLENGE NO. 1: Removing Barriers to the Rapid Evolution of HPC Capability

How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of High Performance Computing? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and exploitation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and computational science, will be required. Indeed, NSF's position as the leading provider of HPC capability to the nation's scientists and engineers will be strengthened if it commands a leadership role in technical advances in both areas, which will contribute to the nation's economic position as well as its position as a world leader in research.

Computer Science and Engineering. The first challenge is to accelerate the development of the technology underlying high performance computing. Among the largest barriers to effective use of the emerging HPC technologies are parallel architectures from which it is easy to extract peak performance, system software (operating systems, databases of massive size, compilers, and programming models) to take advantage of these architectures and provide portability of end-user applications, parallel algorithms, and advances in visualization techniques to aid in the interpretation of results.
The technical barriers to progress are discussed in Appendix C. What steps will most effectively reduce these barriers?

Computational Tools for Advancing Science and Engineering. Research in the development of computational models, the design of algorithmic techniques, and their accompanying mathematical and numerical analysis, is required in order to ensure the continued evolution of efficient and accurate computational algorithms designed to make optimal use of these emerging technologies. In the past ten years, exciting developments in computer architectures, hardware and software have come in tandem with stunning breakthroughs in computational techniques, mathematical analysis, and scientific models. For example, the potential of parallel machines has been realized in part through new versions of numerical linear algebra routines and multi-grid techniques; rethinking and reformulating algorithms for computational physics within the domain of parallel machines has posed significant and challenging research questions. Advances in such areas as N-body solvers, fast special function techniques, wavelets, high resolution fluid solvers, adaptive mesh techniques, and approximation theory have generated highly sophisticated algorithms to handle complex problems. At the same time, important theoretical advances in the modelling of underlying physical and engineering problems have led to new, efficient and accurate discretization techniques. Indeed, in the evolution to scalable computing across a range of levels, designing appropriate numerical and computational techniques is of paramount importance. The challenge facing NSF is to weave together existing work in these areas and to foster new bridges between pure, applied and computational techniques, engaging the talents of disciplinary scientists, engineers, and mathematicians.

CHALLENGE NO. 2: Providing scalable access to all levels of HPC capability

How can NSF provide scalable access to computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems?/8 What balance should NSF anticipate and encourage among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance?

----------
8/By scalable access we mean the ability to develop a problem on a workstation or intermediate sized machine and migrate the problem with relative efficiency to larger machines as increased complexity requires it. Scalable access implies scalable architectures and software.

Flexible strategy. NSF must ensure that adequate additional computational capacity is available to a steadily growing user community to solve the next generation of more complex science and engineering problems. A flexible and responsive strategy that can support the large number of evolving options for HPC and can adapt to the outcomes of major current development efforts (for example in MPP systems and in networked workstations) is required.

A pyramid of computational capability. There will continue to be an available spectrum spanning almost five orders of magnitude of computer capabilities and prices./9 NSF, as a leader in the national effort in high performance computing, should support a "pyramid" of computing capability. At the apex of the pyramid are the highest performance systems that affordable technology permits, established at national facilities. At the next level, every major research university should have access to one, or a few, intermediate-scale high-performance systems and/or aggregated workstation clusters./10 At the lowest level are workstations with visualization capabilities in sufficient numbers to support computational scientists and engineers.
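The span of the pyramid can be made concrete with a short calculation. The Python sketch below uses the two peak figures given in footnote 9 (a Paragon of 300 Gigaflops peak and a 3 megaflop entry workstation) and the footnote's rough factor-of-ten allowance for effective performance; the variable names are illustrative only.

```python
import math

paragon_peak = 300e9      # 300 Gigaflops peak (footnote 9)
workstation_peak = 3e6    # 3 megaflop entry workstation (footnote 9)

# Ratio of peak speeds and the corresponding number of orders of magnitude.
ratio = paragon_peak / workstation_peak
orders = math.log10(ratio)
print(ratio, orders)

# Footnote 9 suggests effective performance may be perhaps ten times lower
# than peak on most science applications.
effective_orders = math.log10(ratio / 10)
print(effective_orders)
```

Under these figures the peak ratio is a factor of 100,000, and the allowance for effective performance narrows the span by one order of magnitude.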
---------- 9/A Paragon machine of 300 Gigaflops peak performance would be five orders of magnitude faster than a 3 megaflop entry workstation. Effective performance in most science applications would, however, be perhaps a factor of ten lower. 10/As discussed in the recommendations, dedicated mid-range systems are required not only for science and engineering applications but also for research to improve HPC hardware and software, and for interactive usage. For science and engineering batch applications, networks of workstations will likely develop into an alternative. Mid-range computational requirements. Over the next five years, the middle range of scientific computing and computational engineering will be handled by an amazing variety of moderately parallel systems. In some cases, these will be scaled-down versions of the highest performance systems available; in other cases, they will be systems targeted at the midrange computing market. The architecture will vary from shared memory at one end of the spectrum to workstation networks at the other, depending on the types of parallelism in the local spectrum of applications. Loosely coupled networks of workstations will compete with mid-range systems for performance of production HPC work. At the same time autonomous mid-range systems are needed to support the development of next-generation architectures and software by computer science groups. The panel perceives that there are imbalances in access to the pyramid of HPC resources (see table below). The disciplinary NSF program offices have not been uniformly effective in responding to the need for a desktop environment for their supported researchers, and there is serious under-investment in the mid-sized machines. The distribution of investment tends to be bimodal, to the disadvantage of mid-range systems. The incentive structures internal to the Foundation do not address this distortion. 
NSF's HPCC coordinating mechanism needs to address this distortion in a more direct manner.

Computational Infrastructure at NSF (FY92, $ millions)
________________________________________________________________
                     Other NSF        ASC
----------------------------------------------------------------
Workstations            20.1          3.2
Small Parallel           2.1          0.5
Large Parallel           9.4          3.2
Mainframe                9.1         16.3
----------------------------------------------------------------
Total                   40.8         23.2
----------------------------------------------------------------

CHALLENGE NO. 3: The right incentives to promote access to all three levels of the computational institution pyramid

The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines.

Lax Report incentives. At the time of the Lax report, relatively few people were interested in HPC; even fewer had access to supercomputers. Some users were fortunate to have contacts with someone at one of a few select government laboratories where computer resources were available. Most, however, were less fortunate and were forced to carry out their research on small departmental machines. This severely limited the research that could be carried out to problems that would "fit" into available resources. NSF addressed this problem by concentrating supercomputer resources in Centers; by this means those in the academic community most prepared and motivated were provided with access to machine cycles.

Need for expanded scope of access. Now that these resources are available on a peer review basis to everyone no matter where they work, it is clear the research community cannot accept a return to the previous mode of operation. The high performance computing community has grown to depend on NSF to make the necessary resources available to continually upgrade the Supercomputer Centers in support of their computational science and engineering applications.
NSF needs to broaden the base of participation in HPC through NSF program offices as well as through the Supercomputer Centers. There is no question that HPC has broken out of its original narrow group of privileged HPC specialists. The SuperQuest competition for high school students already demonstrates how quickly young people can master the effective use of HPC facilities. Other agencies, states, and private HPC centers are springing up, making major contributions not only to science but to K-12 education and to regional economies. NSF's policies on expanding access and training must take advantage of the leverage these Supercomputer Centers can provide.

Allocation of HPC resources. There remains the question of the best way to allocate HPC resources. Should Supercomputer Centers continue to be funded to allocate HPC cycles competitively, or should NSF depend on the "market" of funded investigators for allocation of HPC resources? This question gets at two other issues: (a) the future role of the Centers and (b) the best means for ensuring adequate funding of workstations and other means of HPC access throughout the NSF.

The Centers have peer review committees which allocate HPC resources on the basis of competitive project selection. The Panel believes these allocations are fairly made and reflect solid professional evaluation of computational merit. The only remaining issue is whether there continues to be a need for protected funding for HPC access in NSF, including access to shared Supercomputer Centers facilities. We believe strongly that there is such a need. The panel does have suggestions for broadening the support for the remainder of the HPC pyramid; these are articulated in the recommendations below.

Education and training. A major requirement for education and training continues to exist.
Even though most disciplines have been inoculated with successful uses of HPC (see Appendix D essays), and even though graduate student and postdoctoral use of HPC resources is rising faster than faculty usage, only a minority of scientists have the training to allow them to overcome the initial barrier to proficiency, especially in the use of MPP machines, which require a high level of computational sophistication for most problems.

CHALLENGE NO. 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.?

What relationships should NSF's activities in HPC have to the activities of other federal agencies? NSF is a major player. What role should NSF play within the scope of the nationally coordinated HPCC program and budget, as indicated in the following chart?

HPCC Agency Budgets
_________________________________________________________________
Agency            FY92 Funding ($ millions)
-----------------------------------------------------------------
ARPA                   232.2
NSF                    200.9
DOE                     92.3
NASA                    71.2
HHS/NIH                 41.3
DOC/NOAA                 9.8
EPA                      5.0
DOC/NIST                 2.1
_________________________________________________________________

NSF leadership in HPCC. The voice of HPCC users needs to be more effectively felt in the national program; NSF has the best contact with this community. NSF has played, and continues to play, a leadership role in the NREN program and the evolution of the Internet. Its initiative in creating the "meta-center" concept establishes an NSF role in the sharing and coordination of resources (not only in NSF but in other cooperating agencies as well), and the concept can be usefully extended to cooperating facilities at the state level and in private firms. The question is, does the current structure in CISE, the HPCC coordination office, the Supercomputer Centers, and the science and engineering directorates constitute the most favorable arrangement for that leadership?
The panel does not attempt to suggest the best ways to manage the relationships among these important functions, but asks the NSF leadership to assure the level of attention and coordination required to implement the broad goals of this report. Networking. The third barrier is the need for network access with adequate bandwidth. For wide area networks, this is addressed in the NSF HPCC NREN strategy. In the future, NSF will focus its network subsidies on HPC applications and their supporting infrastructure, while support for basic Internet connectivity shifts to the research and education institutions./11 ---------- 11/NREN is the National Research and Education Network, envisioned in the High Performance Computing Act of 1991. NREN is not a network so much as it is a program of activities including the evolution of the Internet to serve the needs of HPC as well as other information activities. RECOMMENDATIONS We have four sets of interdependent recommendations for the National Science Board and the Foundation. The first implements a balanced pyramid of computing environments; each element supports the others, and as priorities are applied the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies of the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for the delivery of HPC capabilities, and consists itself of a pyramid. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. 
Continuing upward are the NSF HPC Centers, which must continue to play a very important role, both as providers of the major resources of high capability computing systems and as aggregations of specialized capability for all aspects of use and advance of high performance computing. At the apex is the national teraflop facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified, but are compatible with stated national plans, both in absolute amount and in their distribution.

A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: We strongly recommend that NSF build on its success in helping the U.S. achieve its preeminent world position in high performance computing by taking the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the rapidly evolving pyramid of high performance computing for all sectors of the nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies.

High performance computing is essential to the leading edge of U.S. research and development. It will provide the intelligence and power that justifies the breadth of connectivity and access promised by the NREN and the National Information Infrastructure.
The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also includes a rapid expansion of capability in business and industry to use HPC profitably and the many operational uses of HPC in commercial and military activities.

The panel is concerned that if the government fails to implement the planned HPCC investments to support the National Information Infrastructure, the momentum of the U.S. industry, which blossomed in the first phase of the national effort, will be lost. Supercomputers are only a $2 billion industry, but an industry that provides critical tools for innovation across all areas of U.S. competitiveness, including pharmaceuticals, oil, aerospace, automotive, and others. The administration's planned new investment of $250 million in HPCC is fully justified. Japanese competitors could easily close the gap in the HPC sectors in which the U.S. enjoys a lead; they are continuing to invest and could capture much of the market the U.S. government has been helping to create.

VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the HPC pyramid is a need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware.

A reasonable goal for the next 2-3 years would be the design, development, and realization of a national teraflop-class capability, subject to the effective implementation of Recommendation B-1 and the development of effective software and computational tools for such a large machine./12 Such a capability would provide a significant stimulus to commercial development of a prototype high-end commercial HPC system of the future. We believe the importance of NSF's mission in HPC justifies NSF initiating an interagency plan to make this investment, and further that NSF should propose to operate the facility in support of national goals in science and technology.
For budgetary and interagency collaboration reasons OSTP should invoke a FCCSET project to establish such a capability on a government-wide basis with multi-agency funding and usage.

If development begins in 1995 or 1996, a reasonable guess at the cost of a teraflop machine is $50/megaflop for delivery in 1997 to 1998. If so, $50 million a year might buy one such machine per year./13 Development cost would be substantial, perhaps in excess of the production cost of one machine; although it is not clear to what extent government support would be required, this is a further reason to suggest a multi-agency program./14 Support costs would also be additional, but one can assume that one or more of the NSF Supercomputer Centers could host such a facility with something like the current staff.

----------
12/Some panel members have reservations about the urgency of this recommendation, are pessimistic about the likelihood of realizing the effective performance in applications, or are concerned about the possible opportunity cost to NSF of such a large project. The majority notes that the recommendation is intended to drive solutions to those architectural and software problems. Intel's Paragon machine is on the market today with 0.3 Teraflops peak speed, but without the support to deliver that speed in most applications. The panel also recommends a multi-agency federal effort. NSF's share of cost and role in managing such a project are left to a proposed FCCSET review.

13/The cost estimates in this report cannot be much more than informed guesses. We have assumed a cost of $50/megaflop for purchase of a one teraflop machine in 1997 or 1998. We suspect that this cost might be reached earlier, say in 1995 or 1996 in a mid-range machine, because a tightly-coupled massively parallel machine may have costs rising more than linearly with the number of processors, overcoming the scale economies that might make the cost rise less than linearly.
The cost estimates in recommendations A2-4 are intended to indicate that the scale of investment we recommend is not incompatible with the published plans of the administration for investment in HPCC in the next 5 years, and further that roughly equal levels of incremental expenditures in the three levels of the HPC pyramid could produce the balance among these levels that we recommend.

14/The Departments of Energy and Defense and NASA might share a major portion of the development cost and might also acquire such machines in the future as well.

Such a nationally shared machine, or machines, must be open to competitive merit-evaluated proposals for science and engineering computation, although it could share this mission of responding to the community's research priorities with mission-directed work of the sponsoring agencies.

The investment is justified by (a) the existence of problems whose solution awaits a teraflop machine, (b) the importance of driving the HPC industry's innovation rate, and (c) the need for early and concrete experience with the scalability of software environments to higher speeds and larger arrays of processors, since software development time is the limiting factor to hardware acceptance in the market.

Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines.

This will bring a rapid expansion in access to very robust capability, reducing pressure on the Supercomputer Centers' largest facilities, and allowing the variety of vendor solutions to be exercised extensively. If the new MPP architectures prove robust, usable, and scalable, these institutions will be able to grow the capacity of such systems in proportion to need and with whatever incremental resources are available. This capability is also needed to provide testbeds for computer and computational science research and testing.
These mid-sized machines are the underfunded element today; less than 5% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed for both demanding science and engineering problems that do not require the very maximum in computing capacity, and importantly for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with MPP architectures./15

----------
15/The development of prototypes of architectures and operating systems for parallel computation requires access to a machine whose hardware and software can be experimentally modified. This research often cannot be done on machines dedicated to full time production.

Engineering is also a key candidate for their use. There are 1050 University-Industry Research Centers in the U.S. Those UIRCs that are properly equipped with computational facilities can increase the coupling with industrial computation, adding greatly to what the NSF HPC Supercomputer Centers are doing. Many engineering applications, such as robotics research, require "real time" interactive computation which is incompatible with the batch environment on the highest performance machines.

If we assume a cost in three or four years of $50/megaflop for mid-sized MPP machines, an annual expenditure of $10 million would fund the annual acquisition of one hundred 2 Gigaflop (peak) computers. Support costs for users would be additional.

Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) in scientific and engineering workstations for its 20,000 principal investigators.

Many researchers strongly prefer the new high performance workstations that are under their control and find them adequate to meet many of their initial needs.
Those without access to the new workstations may apply to use remote access to a supercomputer in a Center, but often they do not need all the I/O and other capabilities of the large shared facilities. NSF needs a strategy to off-load work not requiring the highest level machines in the Centers. The justification is not economy of scale, but economy of talent and time.

When the Lax report was written a 160 Mflop peak Cray 1 was a high performance supercomputer. Within 4 or 5 years workstations delivering up to 400 megaflops costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate. However, once visualization of computational output becomes routinely required, workstations will be ubiquitously needed.

With the rapid pace of improvement, the useful lifetimes of workstations are decreasing rapidly; they often cannot cope with the latest software. Researchers face escalating costs to upgrade their computers. NSF supports perhaps 20,000 principal investigators. Equipping an additional 10 percent of this number each year (2,000 machines) at $20,000 each requires an incremental $20 million. Recommendation B-3 addresses how this investment might be managed.

Recommendation A-5: We recommend that NSF expand its New Technologies program to support expanded testing of the practicality of new parallel configurations for HPC applications./16 For example, networks of workstations may meet a significant part of midrange HPC science and engineering applications. As progress is made in the development of this and other technologies, experimental use of the new configurations should be encouraged. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient.
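The cost estimates attached to Recommendations A-2 through A-4 rest on simple multiplication, which the Python sketch below reproduces from the figures quoted in the text. It is a check on the arithmetic under the report's own stated inputs ($50/megaflop, 2 Gigaflop mid-range machines, $20,000 workstations), not a budget model, and the variable names are illustrative.

```python
COST_PER_MEGAFLOP = 50  # dollars per megaflop, the price assumed in the text

# A-2: one teraflop machine (1 Teraflop = 1,000,000 megaflops).
teraflop_machine = 1_000_000 * COST_PER_MEGAFLOP

# A-3: one hundred mid-range machines of 2 Gigaflops (2,000 megaflops) each.
midrange_program = 100 * 2_000 * COST_PER_MEGAFLOP

# A-4: 10 percent of 20,000 principal investigators at $20,000 per machine.
machines = 20_000 // 10
workstation_program = machines * 20_000

for label, dollars in [("teraflop machine", teraflop_machine),
                       ("mid-range machines", midrange_program),
                       ("workstations", workstation_program)]:
    print(f"{label}: ${dollars / 1e6:.0f} million per year")
```

With these inputs the first two products match the $50 million and $10 million quoted in the text. The workstation product comes to $40 million; the $20 million quoted under Recommendation A-4 would correspond to an average price of $10,000 per machine.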
There have already been sufficient experiments with use of distributed file systems and loosely coupled workstations to encourage the belief that many compute-intensive problems are amenable to this approach. For those problems that do not suffer from the latency inherent in this approach the incremental costs can be very low indeed, for the problems run in background and at times the workstations are otherwise unengaged. There are those who strongly believe that in combination with object-oriented programming this approach can create a revolution in software and algorithm sharing as well as more economical machine cycles./17

----------
16/Today NSF CISE has a "new technologies" program that co-funds with disciplinary program offices perhaps 50 projects/yr. This program is in the division that funds the Centers, but is focused on projects which can ultimately benefit all users of parallel systems. This program funds perhaps 15 methods and tools projects annually, in addition to those co-funded with science programs.

17/MITRE Corporation, among others, is pursuing this vision.

B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS

REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE

Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in CISE, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC.

The Supercomputer Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work. Recommendation A-3 provides the hardware support required for initial development of prototypes.

There is consensus that the absence of sufficient funding for systems and algorithms work which is not mission-oriented is the primary barrier to lower cost, more widely accessible, and more usable massively parallel systems.
This work, including bringing the most promising ideas to the prototype stage for effective transfer to the HPC industry, would address the most significant barriers to the ultimate penetration of parallel architectures in workstations. Advances on the horizon that could be accelerated include more advanced network interface architectures and operating systems technologies to provide low overhead communications in collections of workstations, and advances in algorithms and software for distributed databases of massive size.

Computer science has made, and continues to make, important contributions to both hard and soft parallel machine technology, and has effectively transferred these ideas to the industry. Two problems impede the full contribution of computer science to rapid advance in MPP development: grant sizes in the discipline are typically too small to allow enough concentrated effort to build and test prototypes, and too few computer science departments have access to a mid-sized machine on which systems development can be done.

The Board should ask for a proposal from CISE to mobilize the best computer science and computational mathematics talent to solve these problems, both through improved operating systems, architectures, compilers, and algorithms for existing systems and through research in next-generation systems. We recommend establishing a number of major projects, with higher levels of annual funding than is typical in Computer Science, and assured duration of up to five years, for a total annual incremental investment of $10 million. We recommend that this challenge fund be managed by CISE, and be accessible to all disciplinary program offices that wish to forward team proposals for add-on funding in response to specific proposals from the community.

Recommendation B-2: A significant barrier to rapid progress in the application of HPC lies in formulating a computational strategy to solve a problem.
In response to Challenge 1 above, NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines. Without such work in both theoretical and applied areas of numerical analysis, applied mathematics, and computational algorithms, the full benefit of advances in architecture and systems software will not be realized. In particular, significantly increased funding of collaborative and individual state-of-the-art methodology is warranted, and is crucial to the success of high performance computing. Some of this can be done through the individual directorates with funds supplemented by HPCC funds; the Grand Challenge Applications Group awards are a good first step. Recommendation B-3: We recommend NSF set up an agency-wide task force to develop a way to ameliorate the imbalance in the HPC pyramid - the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources. Some directorates have "infrastructure" programs; others do not. Still others fund workstations until they reach the "target" set by the HPCC coordination office. We believe that individual disciplinary program managers should consider it their responsibility to fund purchase of workstations out of their equipment funds. But we recognize that these funds need to be supplemented by HPCC funds. CISE has an office which co-funds interdisciplinary applications of HPC workstations. We believe this office may require more budgetary authority than it now enjoys, to ensure the proper balance of program and CISE budgets for workstations. 
Scientific value must be a primary criterion for resource allocation. It would be unwise to support mediocre projects just because they require supercomputers. The strategy of application approval will depend very heavily on funding scenarios. If sufficient HPCC funds are made available to individual programs for computer usage, then the Supercomputer Centers should be reserved for applications that cannot be carried out elsewhere, with particular priority to novel applications. If individual science programs continue to be underfunded relative to large centers, the Supercomputer Centers may be forced into a role of supporting less novel or demanding computing applications. Under these circumstances, less stringent funding criteria should be applied. C. THE NSF SUPERCOMPUTER CENTERS Recommendation C-1: The Supercomputer Centers should be retained and their missions, as they have evolved since the Lax Report, should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs - HPC Centers, Engineering Research Centers (ERC) and Science and Technology Centers (STC) devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics - all of which are essential elements of the HPC effort needed for the next decade. NSF plays a primary but not necessarily dominant role in each of them (see Figure 4 of Appendix D). Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Supercomputer Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than in the early years of the HPC initiative. 
We anticipate an evolution, which has already begun, in which the NSF Supercomputer Centers increasingly broaden their base of support, and NSF expands its support in collaboration with other institutional settings for HPC. Center-like groups, especially NSF S&T Centers, are an important instrument for focusing on solving barriers to HPC, although they do not provide HPC resources to users. An excellent example is the multi-institutional Center for Research in Parallel Computation at Rice University, which is supported at about $4M/yr, with additional support from ARPA. Another example is the Center for Computer Graphics and Scientific Visualization, an S&T Center award to University of Utah with participation of University of N. Carolina, Brown, Caltech, and Cornell. Still another example is the Discrete Mathematics and Computational Science Center (DIMACS) at Rutgers and Princeton. These centers fill important roles today, and the ERC and S&T Center structures provide a necessary addition to the Supercomputer Centers for institutionalizing the programmatic work required for HPC. The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the Supercomputer Centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among Supercomputer Centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA is a welcome measure as well. Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" a successful, changing venture, of which Supercomputer Centers remain an essential part. 
Furthermore, we also recommend against an open recompetition of the four Supercomputer Centers at this time, favoring instead periodic performance evaluation and competition for some elements of their activities, both among the Centers themselves and when appropriate with other HPC Centers such as those operated by states (see Recommendation D-1).

Continuing evaluation of each Center's performance, as well as the performance of the overall program, is, of course, an essential part of good management of the Supercomputer Centers program. Such evaluations must take place on a regular basis in order to develop a sound basis for adjustments in support levels, to provide incentives for quality performance and to recognize the need to encourage other institutions such as S&T Centers that are attacking HPC barriers and state-based centers with attractive programs in education and training. While recompetition of existing Supercomputer Centers does not appear to be appropriate at this time, if regular review of the Centers and the Centers program identifies shortcomings in a Center or the total program, a recompetition of that element of the program should be initiated.

Supercomputer Centers are highly leveraged through investments by industry, vendors, and states. This diversification of support impedes unilateral action by NSF, since the Centers' other sponsors must be consulted before decisions important to the Center are made./18 It also suggests that the issue of recompetition may, in the future, become moot as the formal designation "NSF HPC Center" erodes in significance. There is a form of recompetition already in place: the Centers compete for support for new machine acquisition and for roles in multi-center projects.

----------
18/Each year each center gets a cooperative agreement level which is negotiated. Each center gets about $14M; about 15% is flexible. NSF centers have also received help from ARPA to buy new MPP machines.
Most of the Centers have important outside sources of support, which imply obligations NSF must respect, such as the Cornell Center relationship with IBM and the San Diego Center's activities with the State of California.

Recommendation C-3: The NSF should continue to provide funding to support the Supercomputer Centers' HPC capacity. Any distortions in the uses of the computing pyramid that result from this dedicated funding are best offset by the recommendations we make for other elements in the pyramid.

Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers, but it is a means to a broader mission: to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in usability and economy of HPC and to diffuse HPC capability throughout the technical community, including industry. The following additional components of the Center missions should be affirmed:
* Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.
* Fostering interdisciplinary collaboration - across sciences and between sciences and computational science and computer science - as in the Grand Challenge programs.
* Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with the following groups: computer and computational scientists, disciplinary scientists exploiting HPC resources, the HPC industry, and business firms exploring expanded use of HPC.
* Training and education, from post-docs and faculty specialists to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools, community colleges, colleges, and universities.
The role of a Supercomputer Center should, therefore, continue to be primarily one of a facilitator, pursuing the goals just listed by making the hardware and human resources available to computational scientists, who themselves are intellectual leaders. In this way the Centers will participate in leadership but will not necessarily be its primary source. With certain notable exceptions, intellectual leadership in computational science has come from scientists around the country who have at times used the resources available at the Centers. This situation is unlikely to change nor should it change. It would be unrealistic to place this type of demand on the Supercomputer Centers and it would certainly not be in the successful tradition of American science. The Supercomputer Centers facilitate interdisciplinary collaborations because they support users from a variety of disciplines, and are aware of their particular strengths. The Centers have been deeply involved in nucleating Grand Challenge teams, and particularly in reaching out to bring computer scientists together with computational scientists. Visualization, for example, is no longer just in the realm of the computational scientist; experimentalists use the same tools for designing and simulating experiments in advance of actual data generation. This common ground should not be separated from the enabling technologies which have made this work possible. Rather high performance computing and the new science it has enabled have seeded advances that would not have happened any other way. 
ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS Recommendation C-4: The NSF should review the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized, when that research also requires access to the facilities of the Centers, or perhaps access to other elements of the HPC pyramid that will be established pursuant to our recommendations. However we believe the NSF should continue to provide HPC resources to the research community through allocation committees that evaluate competitively proposals for use of Center resources./19 ---------- 19/For NSF funded investigators, allocation committees at Supercomputer Centers should evaluate requests for HPC resources only on the appropriateness of the computational plans, choice of machine, and amount of resource requested. Centers should rely on disciplinary program office determinations of scientific merit, based on their peer review. In this way a two level review of the merits of the science is avoided. A further simplification might be for the application for computer time at the Centers to be included in the original disciplinary proposal, and forwarded to the Centers when the proposal is approved. For non-NSF funded investigators an alternative form of peer review of the research is required. At the present time, the allocation of resources in the Supercomputer Centers for all users is handled by requiring principal investigators to submit annual proposals to a specified Center for access to specific equipment. The NSF should not require a duplicate peer review of the substantive scientific merit of the proposed scientific investigation, first by disciplinary program offices, and then again by the Center Allocation Committees. 
For this reason, it is proposed that the allocation of supercomputer time be combined with the allocation of research funds to the investigator. Although this panel is not in a position to give administrative details of such a procedure, it is suggested that requests for computer time be attached to the original regular NSF proposal, with (a) experts in computational science included among peer reviewers, or, (b) that portion of the proposal be reviewed in parallel by a peer review established by the Centers. In either case only one set of peer reviewers should evaluate scientific merits, and only one set of reviewers should determine that the research task is being formulated properly for use of HPC resources. Second, we recommend that the Centers collectively establish the review and allocation mechanism, so that while investigators might express a preference for a particular computer or Center for their work, all Centers facilities would be in the pool from which each investigator receives allocations. We recognize, of course, that the specific allocation of machine time often cannot be made at the time of the original proposal for NSF research support, since in some cases the work has not progressed to the point that the mathematical approach, algorithms, etc., are available for Center experts to evaluate and translate into estimates of machine time. Nor is the demand function for facilities known at that time. EDUCATION AND TRAINING Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily on NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role. 
The serious difficulties associated with the use of parallel computers pose a new training burden. In the past it was expected that individual investigators would port their code to new computers and this could usually be done with limited effort. This is no longer the case. The Supercomputer Centers should see their future mission as providing direct aid to the rewriting of code for parallel processors. Computational science is proving to be an effective way to generate new knowledge. As part of its basic mission, NSF needs to teach scientists, engineers, mathematicians, and even computer scientists how high performance computing can be used to produce new scientific results. The role of the Supercomputer Centers is critical to such a mission since the Centers have expertise on existing hardware and software systems, modelling, and algorithms, as well as knowledge of useful high performance computing application packages, awareness of trends in high performance computing and requisite staff. D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH STATES Recommendation D-1: We recommend that the National Science Board urge OSTP to establish an advisory committee representing the states, HPC users, NSF Supercomputer Centers, computer manufacturers, computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing. Congress required advisory committee reporting to the PMES, but the committee has not yet been implemented. The committee we propose would provide policy level advice and coordination with the states. The main components of HPCC are networking and HPC, although the networks seem to be receiving priority attention. 
The Panel believes it is important to continue to emphasize the importance of ensuring adequate compute power in the network to support the National Information Infrastructure applications. We also believe that as participation in HPC continues to broaden through initiatives by the states and by industry, the NSF (and other federal agencies) should encourage their collaboration in the national effort. The Coalition of Academic Supercomputer Centers (CASC) was founded in 1989 to provide a forum to encourage support for high performance computing and networking. Unlike the FCCSET task force, CASC is dependent on others to bring the money to support high performance computing - usually their own State government or university. The result is a valuable discussion group for exchanging information and developing a common agenda and CASC should be encouraged. However, CASC is not a substitute for a more formal federal advisory body. This recommendation is consistent with a recent Carnegie Commission Report entitled "Science, Technology and the States in America's Third Century," which recommends the creation of a system of joint advisory and consultative bodies to foster federal-state exchanges and to create a partnership in policy development, especially for construction of national information infrastructure and provision of services based on it. Because of the importance of high performance computing to future economic development, we need a new balance of cooperation between federal and state government in this area, as in a number of others. Appendix A MEMBERSHIP OF THE BLUE RIBBON PANEL ON HIGH PERFORMANCE COMPUTING Lewis Branscomb, John F. Kennedy School of Government, Harvard University (Chairman) Lewis Branscomb is a physicist, formerly chairman of the National Science Board (1980-1984) and Chief Scientist of IBM Corp. (1972-1986). 
Theodore Belytschko, Department of Civil Engineering, Northwestern University
Ted Belytschko's research interests are in computational mechanics, particularly in the modeling of nonlinear problems, such as failure, crashworthiness, and manufacturing processes.

Peter R. Bridenbaugh, Executive Vice President - Science, Engineering, Environment, Safety & Health, Aluminum Company of America
Peter Bridenbaugh serves on a number of university advisory boards, and is a member of the National Academy of Engineering's Industrial Ecology Committee. He also serves on the NSF Task Force 1994 Budget Committee and is a Fellow of ASM International.

Teresa Chay, Professor, Department of Biological Sciences, University of Pittsburgh
Teresa Chay's research interests are in modelling biological phenomena such as nonlinear dynamics and chaos theory in excitable cells, cardiac arrhythmias by bifurcation analysis, mathematical modeling for electrical activity of insulin-secreting pancreatic B-cells and agonist-induced cytosolic calcium oscillations, and elucidation of the kinetic properties of ion channels.

Jeff Dozier, Center for Remote Sensing, University of California, Santa Barbara
Jeff Dozier is a hydrologist and remote sensing specialist. From 1990 to 1992 he was Senior Project Scientist on NASA's Earth Observing System.

Gary Grest, Exxon Corporate Research Science Laboratory
Gary Grest's research interests are in the areas of computational physics and material science, recently emphasizing the modeling of the properties of polymers and complex fluids.

Edward Hayes, Vice President for Research, Ohio State University
Edward F. Hayes is a computational chemist, formerly Controller and Division Director for Chemistry at NSF.

Barry Honig, Department of Biochemistry and Molecular Biology, Columbia University
Barry Honig's research interests are in theoretical and computational studies of biological macromolecules.
He is an associate editor of the Journal of Molecular Biology and is a former president of the Biophysical Society (1990-1991).

Neal Lane, Provost, Rice University (resigned from the Panel July 1993)

William A. Lester, Jr., Professor and Associate Dean, Department of Chemistry, University of California, Berkeley
William A. Lester, Jr., is a theoretical chemist, formerly Director of the National Resource for Computation in Chemistry (1978-81) and Chairman of the NSF Joint Advisory Committees for Advanced Scientific Computing and Networking and Communications Research and Infrastructure (1987).

Gregory McRae, Professor, Department of Chemical Engineering, MIT

James Sethian, Professor, University of California at Berkeley
James Sethian is an applied mathematician in the Mathematics Department at the University of California at Berkeley and in the Physics Division of the Lawrence Berkeley Laboratory.

Burton Smith, Tera Computer Company
Burton Smith is Chairman and Chief Scientist of Tera Computer Company, a manufacturer of high performance computer systems.

Mary Vernon, Department of Computer Science, University of Wisconsin
Mary Vernon is a computer scientist who has received the NSF Presidential Young Investigator Award and the NSF Faculty Award for Women Scientists and Engineers in recognition of her research in parallel computer architectures and their performance.

Appendix B

NSF AND HIGH PERFORMANCE COMPUTING: HISTORY AND ORIGIN OF THIS STUDY

Introduction

This report of the Blue Ribbon Panel on High Performance Computing follows a number of separate, but related, activities in this area by the NSF, the computational science community, and the Federal Government in general acting in concert through the Federal Coordinating Committee on Science, Engineering, and Technology. The Panel's findings and recommendations must be viewed within this broad context of HPC.
This section provides a description of the way in which the panel has conducted its work and a brief overview of the preceding accomplishments which were used as the starting point for the Panel's deliberations.

The Origin of the Present Panel and Charter

Following the renewal of four of the five NSF Supercomputer Centers in 1990, the National Science Board (NSB) maintained an interest in the Centers' operations and activities. Given the national scope of the Centers, and the possible implications for them contained in the HPCC Act of 1991, the NSB commissioned the formation of a blue ribbon panel to investigate the future changes in the overall scientific environment due to the rapid advances occurring in the field of computers and scientific computing. The panel was instructed to investigate the way science will be practiced in the next decade, and recommend an appropriate role for NSF to enable research in the overall computing environment of the future. The panel consists of representatives from a wide spectrum of the computer and computational science communities in industry and academia. The role expected of the Panel is reflected by its Charter:

A. Assess the contributions of high performance computing to scientific and engineering research and education, including ancillary benefits, such as the stimulus to the pace of innovation in U.S. industries and the public sector.
B. Project what hardware, software and communication resources may be available in the next five to ten years to further these advances and identify elements that may be particularly important to the development of HPC.
C. Assess the variety of institutional forms through which access to high performance computing may be gained including funding of equipment acquisition, shared access through local centers, and shared access through broad band telecommunications.
D.
Project sources, other than NSF, for support of such capabilities, and potential cooperative relationships with: states, private sector, other federal agencies, and international programs.
E. Identify barriers to the development of more efficient, usable, and powerful means for applying high performance computing, and means for overcoming them.
F. Provide recommendations to help guide the development of NSF's participation in supercomputing and its relation to the federal interagency High Performance Computing and Communications Program.
G. Recommend policies and managerial structures needed to achieve NSF program goals, including clarification of the peer review procedures and suggesting appropriate processes and mechanisms to assess program effectiveness necessary for ensuring the highest quality science and engineering research.

At its first meeting in January 1993, the panel approved its Charter, and established a scope of work which would allow a final report to be presented to the NSB in Summer 1993. A large number of questions were raised amplifying the Charter's directions. Prior to its second meeting in March 1993 the Panel solicited input from the national research community; a response to the following four questions was requested.
* How would you project the emerging high performance computing environment and market forces over the next five years and the implications for change in the way scientists and engineers will conduct R&D, design and production modeling?
* What do you see as the largest barriers to the effective use of these emergent technologies by scientists and engineers and what efforts will be needed to remove these barriers? What is the proper role of government, and, in particular, the NSF to foster progress?
* To what extent do you believe there is a future role for government-supported supercomputer centers? What role should NSF play in this spectrum of capabilities?
* To what extent should NSF use its resources to encourage use of high performance computing in commercial industrial applications through collaboration between high performance computing centers, academic users and industrial groups? Over fifty responses were received and were considered and discussed by the Panel at its March meeting. The Panel also received presentations, based on these questions, from vendors of high performance computing equipment and representatives from non-NSF supercomputer centers. NSF's Early Participation in High Performance Computing Although the National Science Foundation is now a major partner in the nation's high performance computing effort, this was not always the case. In the early 1970s the NSF ceased its support of campus computing centers, and by the mid-1970s there were no "supercomputers" on any campus available to the academic community. Certainly computers of this capability were available through other government agency (DoE and NASA) laboratories, but NSF did not play a role, and hence many of its academic researchers did not have the ability to perform computational research on anything other than a departmental minicomputer, thereby limiting the scope of their research. This lack of NSF participation in the high performance computing environment began to be noted in the early 1980s with the publication of a growing number of reports on the subject. A report to the NSF Division of Physics Advisory Committee in March 1981 entitled "Prospectus for Computational Physics", edited by W. Press, identified a "crisis" in computational physics, and recommended support for facilities. Subsequent to this report a joint agency study, "Large Scale Computing in Science and Engineering", edited by P. Lax, appeared in December 1982 and acted as the catalyst for NSF's reemergence in the support of high performance computing. 
The Lax Report presented four recommendations for a government-wide program:
* Increased access to regularly upgraded supercomputing facilities via high bandwidth networks
* Increased research in computational mathematics, software, and algorithms
* Training of personnel in scientific computing
* R&D of new supercomputer systems

The key suggestions contained in the Lax Report were studied by an internal NSF working group, and the findings were issued in July 1983 as "A National Computing Environment for Academic Research", a report edited by M. Bardon and K. Curtis. The report studied NSF supported scientists' needs for academic computing, and validated the conclusions of the Lax Report for the NSF supported research community. The findings of Bardon/Curtis reformulated the four recommendations of the Lax Report into a six point implementation plan for the NSF. Part of this action plan was a recommendation to establish ten academic supercomputer centers.

The immediate NSF response was to set up a means for academic researchers to have access, at existing sites, to the most powerful computers of the day. This was an interim step prior to a solicitation for the formation of academic supercomputer centers directly supported by the NSF. By 1987, five NSF Supercomputer Centers had been established, and all had completed at least one year of operation.

During this phase the Centers were essentially isolated "islands of supercomputing" whose role was to provide supercomputer access to the academic community. This aspect of the Centers' activities has changed considerably. The NSF concept of the Centers' activities was mandated to be much broader, as indicated by the Centers' original objectives:
* Access to state of the art supercomputers
* Training of computational scientists and engineers
* Stimulate the U.S.
supercomputer industry * Nurture computational science and engineering * Encourage collaboration among researchers in academia, industry and government In 1988-1989 NSF conducted a review to determine whether support was justified beyond 1990. In developing proposals, the Centers were advised to increase their scope of responsibilities. Quoting from the solicitation: "To insure the long term health and value of a supercomputer center, an intellectual environment, as well as first class service, is necessary. Centers should identify an intellectual component and research agenda". In 1989 NSF approved continuation through 1995 of the Cornell Theory Center, the National Center for Supercomputing Applications, the Pittsburgh Supercomputing Center, and the San Diego Supercomputer Center. Support for the John von Neumann Center was not continued. The Federal High Performance Computing and Communications Initiative At the same time the NSF Supercomputer Centers were beginning the early phases of their operations the Federal Coordinating Committee for Science, Engineering, and Technology began a study in 1987 on the status and direction of high performance computing, and its relationship to federal research and development. The results were "A Research and Development Strategy for High Performance Computing" issued by the Office of Science and Technology Policy (OSTP) in November 1987, followed in September 1989 by another OSTP document "The Federal High Performance Computing Program". These two reports set the framework for the inter-governmental agency cooperation on high performance computing which led to the High Performance Computing and Communications (HPCC) Act of 1991. 
HPCC focuses on four integrated components+ of computer research and applications which very closely echo the Lax Report conclusions:

* High Performance Computing Systems - technology development for scalable parallel systems to achieve teraflop speed
* Advanced Software Technology and Algorithms - generic software and algorithm development to support Grand Challenge projects, including early access to production scalable systems
* National Research and Education Network - to further develop the national network and networking tools, and to support the research and development of gigabit networks
* Basic Research and Human Resources - to support individual investigator research on fundamental and novel science and to initiate activities to significantly increase the pool of trained personnel

----------
+At the time of writing this Report, a fifth component, entitled Information Infrastructure, Technology, and Applications, is being defined for inclusion in the HPCC Program.

With this common structure across all the participating agencies, the Program outlines each agency's roles and responsibilities. NSF is the lead agency in the National Research and Education Network, and has major roles in Advanced Software Technology and Algorithms, and in Basic Research and Human Resources.

The Sugar Report

After the renewal of the four NSF Supercomputer Centers the NSF Division of Advanced Scientific Computing recognized that the computing environment within the nation had changed considerably from that which existed at the inception of the Centers Program. The Division's Advisory Committee was asked to survey the future possibilities for high performance computing, and report back to the Division. Two workshops were held in the Fall of 1991 and Spring of 1992. Thirty one participants with expertise in computational science, computer science and the operation of major supercomputer centers were involved. The final report, edited by R. Sugar of the U.
of California at Santa Barbara, recommended future directions for the Supercomputer Centers Program which would "enable it to take advantage of these (HPCC) opportunities and to meet its responsibilities to the national research community". The committee's recommendations can be summarized as:

* Decisions and planning by the Division need to be made in a programmatic way, rather than on an individual Center by Center basis - the meta-center concept provides a vehicle for this management capability which goes beyond the existing Centers.
* Access to stable computing platforms (currently vector supercomputers) needs to be augmented by access to state of the art technology (currently massively parallel computers) - but, the former cannot be sacrificed to provide the latter
* The Supercomputer Centers can be focal points for enabling collaborative efforts across many communities - computational and computer science, private sector and academia, vendors and academia.

Appendix C

TECHNOLOGY TRENDS and BARRIERS to FURTHER PROGRESS

BACKGROUND:

What is the state of the HPC industry here and abroad? What is its prognosis?

The high performance computer industry is in a state of turmoil, excitement and opportunity. On the one hand, the vector multiprocessors manufactured by many firms, large and small, have continued to improve in capability over the years. These systems are now quite mature, as measured by the fact that delivered performance is a significant fraction of the theoretical peak performance of the hardware, and are still the preferred platform for many computational scientists and engineers. They are the workhorses of high performance computing today and will continue in that role even as alternatives mature.
On the other hand, dramatic improvements in microprocessor performance and advances in microprocessor-based parallel architectures have resulted in "massively parallel" systems that offer the potential for high performance at lower cost./20 For example, $10 million in 1993 buys over 40 gigaflops peak processing power in a multicomputer but only 5 gigaflops in a vector multiprocessor. As a result, increasing numbers of computational scientists and engineers are turning to the highly parallel systems manufactured by companies such as Cray Research Inc., IBM, Intel, Kendall Square, Thinking Machines Inc., MasPar, and nCUBE.

----------
20/Note that the higher cost of vector machines is partly caused by their extensive use of static memory chips for main memory and the interconnection networks they use for high shared-memory bandwidth. These attributes contribute to increased programmability and the realization of a high fraction of peak performance on user applications. Realized performance on MPP machines is still uncertain. A comparison of today's vector machines versus MPP systems based on realized performance per dollar reveals much less difference in cost-performance than comparisons based on peak performance.

Microprocessor performance has increased by 4X every three years, matching the rate of integrated circuit logic density improvement as predicted by Moore's law. For example, the microprocessors of 1993 are around 200 times faster than those of 1981. By contrast, the clock rates of vector processors have improved much more slowly; today's fastest vector processors are only five or six times faster than 1976's Cray 1. Thus, the performance gap between these two technologies is quickly disappearing in spite of other performance improvements in vector processor architecture.
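The arithmetic behind the figures quoted above can be checked with a short sketch. It is illustrative only: the function and variable names are assumptions introduced here, the inputs are the round numbers from the text, and the comparison uses peak rather than realized performance, which footnote 20 cautions overstates the difference.

```python
# Illustrative check of the figures quoted in the text (1993 dollars).
# Function and variable names are hypothetical, chosen for this sketch only.

def compound_growth(factor_per_period, period_years, start_year, end_year):
    """Total improvement if performance multiplies by factor_per_period
    once every period_years between start_year and end_year."""
    periods = (end_year - start_year) / period_years
    return factor_per_period ** periods


def dollars_per_peak_gigaflop(budget_dollars, peak_gigaflops):
    """Cost of one gigaflop of PEAK (not realized) performance."""
    return budget_dollars / peak_gigaflops


# 4X every three years, 1981 to 1993: four periods.
micro_speedup = compound_growth(4, 3, 1981, 1993)

# $10 million buys "over 40" peak gigaflops in a multicomputer,
# so this figure is an upper bound on its cost per gigaflop.
multicomputer_cost = dollars_per_peak_gigaflop(10_000_000, 40)
vector_cost = dollars_per_peak_gigaflop(10_000_000, 5)
peak_cost_ratio = vector_cost / multicomputer_cost

print(f"Microprocessor speedup 1981-1993: {micro_speedup:.0f}x")
print(f"Multicomputer: ${multicomputer_cost:,.0f} per peak gigaflop")
print(f"Vector multiprocessor: ${vector_cost:,.0f} per peak gigaflop")
print(f"Peak cost ratio (vector / multicomputer): {peak_cost_ratio:.0f}x")
```

Strict compounding at 4X every three years gives a factor of 256 over the twelve years, which is of the same order as the "around 200 times" stated in the text. On peak figures alone the vector multiprocessor costs about eight times as much per gigaflop.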
Although microprocessor-based massively parallel systems hold considerable promise for the future, they have not yet reached maturity in terms of ease of programming and ability to deliver high performance routinely to large classes of applications. Unfortunately, the programming technology that has evolved for the vector multiprocessors does not directly transfer to highly parallel systems. New mechanisms must be devised for high performance communication and coordination among the processors. These mechanisms must be efficiently supported in the hardware and effectively embodied in programming models. Currently, vendors are providing a variety of systems based on different approaches, each of which has the potential to evolve into the method of choice. Vector multiprocessors support a simple shared memory model which demands no particular attention to data arrangement in memory. Many of the currently available highly parallel architectures are based on the "multicomputer" architecture which provides only a message-passing interface for inter-processor communication. Emerging architectures, including the Kendall Square KSR-1 and systems being developed by Convex, Cray Research, and Silicon Graphics, have shared address spaces with varying degrees of hardware support and different refinements of the shared memory programming model. These computers represent a compromise in that they offer much of the programming simplicity of shared memory yet still (at least so far) require careful data arrangement to achieve good performance. (The data parallel language on the CM-5 has similar properties.) A true shared memory parallel architecture, based on mechanisms that hide memory access latency, is under development at Tera Computer. The size of the high performance computer market worldwide is about $2 billion (excluding sales of the IBM add-on vector hardware), with Cray Research accounting for roughly $800 million of it. 
IBM and Fujitsu are also significant contributors to this total, but most companies engaged in this business have sales of $100 million or less. Some companies engaged in high speed computing have other, larger sources of revenue (IBM, Fujitsu, Intel, NEC, Hitachi); other companies both large (Cray Research) and small (Thinking Machines, Kendall Square, Meiko, Tera Computer) are high performance computer manufacturers exclusively. There are certainly more companies in the business than can possibly be successful, and no doubt there are new competitors that will appear. Helping to sustain this high level of competitive innovation should be an important objective for NSB policy in HPC.

FINDINGS

Where is the hardware going to be in 5 years? What will be the performance and cost of the most powerful machines, the best workstations, the mid-range computers?

The next five years will continue to see improvements in hardware price/performance ratios. Since microprocessor speeds now closely approach those of vector processors, it is unclear whether microprocessor performance improvement can maintain its current pace. Still, as long as integrated circuits continue to quadruple in density but only double in cost every three years we can probably expect a fourfold price/performance improvement in both processors and memory by 1998. Estimating in constant 1993 dollars, the most powerful machines ($50 million) will have peak performance of nearly a teraflop/21; mini-supercomputers ($1 million) will advertise 20 gigaflops peak performance; workstations ($50,000) will approach 1 gigaflops, and personal computers ($10,000) will approach 200 megaflops./22

----------
21/One teraflop is 1000 gigaflops or 10^12 floating point operations per second.
22/Spokesmen from Intel, Convex and Silicon Graphics, in addressing the panel, all made even higher estimates than this.

During this period, parallel architectures will continue to emerge and evolve.
Just as the CM-5 represented a convergence between SIMD and MIMD parallel architectures and brought about a generalization of the data-parallel programming model, it is likely that the architectures will continue to converge and better user-level programming models will continue to emerge. These developments will improve software portability and reduce the variety of architectures that are required for computational science and engineering research, although there will likely still be some diversity of approaches at the end of this 5-year horizon. Questions that may be resolved by 1998 include:

* Which varieties of shared memory architecture provide the most effective tradeoff between hardware simplicity, system performance, and programming convenience? and
* What special synchronization mechanisms for processor coordination should be supported in the hardware?

Most current systems are evolving in these directions, and answers to the issues will provide a more stable base for software efforts. Furthermore, much of the current computer science research in shared memory architectures is looking for cost-effective hardware support that can be implemented in multiprocessor workstations that are interconnected by general-purpose local area networks. Thus, technology from high performance parallel systems may be expected to migrate to workstation networks, further improving the capabilities of these systems to deliver high-performance computing to particular applications. It is possible that in the end the only substantial difference between the supercomputers of tomorrow and the workstation networks of tomorrow will be the installed network bandwidth./23

----------
23/While parallel architectures mature, vector multiprocessors will continue to evolve. Scaling to larger numbers of processors ultimately involves solving the same issues as for the microprocessor-based systems.

Where is the software/programmability going to be in 5 years?
What new programming models will emerge for the new technology? How transparent will parallel computers be to users? While the architectural issues are being resolved, parallel languages and their compilers will need to continue to improve the programmability of new high performance computer systems. Implementations of "data parallel" language dialects like High Performance Fortran, Fortran D, and High Performance C will steadily improve in quality over the next five years and will simplify programming of both multi-computers and shared address systems for many applications. For the applications that are not helped by these languages, new languages and programming models will emerge, although at a slower pace. Despite strong efforts addressing the problem from the language research community, the general purpose parallel programming language is an elusive and difficult quarry, especially if the existing Fortran software base must be accommodated, because of difficulties with the correct and efficient use of shared variables. Support tools for software development have also been making progress, with emphasis on visualization of a program's communication and synchronization behavior. Vendors are increasingly recognizing the need for sophisticated performance tuning tools, with most now developing or beginning to develop such tools for their machines. The increasing number of computer scientists who are also using these tools could lead to even more rapid improvement in the quality and usability of these support tools. Operating systems for high performance computers are increasingly ill suited to the demands placed on them. Virtualization of processors and memory often leads to poor performance, whereas relatively fixed resource partitioning produces inefficiency, especially when parallelism within the application varies. High performance I/O is another area of shortfall in many systems, especially the multi-computers. 
Research is needed in nearly every aspect of operating systems for highly parallel computers.

What market forces or technology investments drive HPC technologies and products?

Future high performance systems will continue to be built using technologies and components built for the rest of the computer industry. Since integrated circuit fabrication facilities now represent billion dollar capital investments, integrated circuits benefit from very large scale economies; accordingly it has been predicted that only mass-market microprocessors will prove to have acceptable costs in future high performance systems. Certainly current use of workstation microprocessors such as Sparc, Alpha and the RS-6000 chips suggests this trend. Even so the cost of memory chips is likely to be a major factor in the costs of massively parallel systems, which require massive amounts of fast memory. Thus the integrated circuit technology available for both custom designs and industry standard processors will increasingly be driven by the requirements of much larger markets, including consumer electronics.

The health of the HPC vendors and the structure of their products will be heavily influenced by demand from industrial customers. Business applications represent the most rapidly growing market for HPC products; they have much higher potential growth than government or academic uses. Quite apart from NSF's obligation to contribute to the nation's economic health through its research activities, this fact motivates the importance of cooperation with industry users in expanding HPC usage. This reality means that NSF should be attentive to the value of throughput as a figure of merit in HPC systems (in contrast with turnaround time which academic researchers usually favor), as well as the speed with which large volumes of data can be accessed. Industry won't put up with a stand-alone, idiosyncratic environment.
How practical will be the loose coupling of desk-top workstations to aggregate their unused compute power?

Networks of workstations will become an important resource for the many computations that perform well on them. The probable success of these loosely coupled systems will inevitably raise the standard for communication capabilities in the multicomputer arena. Many observers believe that competition from workstation networks on one side and shared address space systems on the other will drive multi-computers from the scene entirely; in any event, the network bandwidth and latency of multi-computers must improve to differentiate them from workstation networks.

Many large institutions have 1000 or more workstations already installed; the utilization rate of their processors on a 24 hour basis is probably only a few percent. An efficient way to use the power of such heterogeneous networks would be more financially attractive. It will, however, raise serious questions about security, control, virus-prevention, and accounting programs.

Are there some emerging HPC technologies of interest other than parallel processing? What is their significance?

Neural networks have recently become popular and have been successfully applied to many pattern recognition and classification problems. Fuzzy logic has enjoyed an analogous renaissance. Technologies of this sort are both interesting and important in a broad engineering context and also are having impact on computational science and engineering. Machine learning approaches, such as neural networks, are most appropriate in scientific disciplines where there is insufficient theory to support accurate computer modeling and simulation.

How important are simulation and visualization capabilities?

Simulation will play an ever increasing role in science and engineering.
Much of this work can be carried out on workstations or intermediate-scale systems, but it will continue to be appropriate to share the highest performance systems (and the expertise in using them) on a national scale, to accomplish large simulations within human time scales. Smaller configurations of these machines should be provided to individual research universities for application software development and research that involves modifying the operating system and/or hardware.

Personal computer capabilities will improve, and visualization on the desktop will become more routine. Scientists and engineers in increasing numbers will need to be equipped with visualization capabilities. The usefulness of high performance computing relies on these systems because printed lists of numbers (or printed sheaves of pictures, for that matter) are increasingly unsatisfactory as an output medium, even for moderately sized simulations.

BARRIERS TO CONTINUED RAPID PROGRESS

What software and/or hardware "inventions" are needed? Who will address meeting these needs?

The most important impediment to the use of new highly parallel systems has been the difficulty of programming these machines and the wide variation that exists in communication capabilities across generations of machines as well as among the machines in a given generation. Application software developers are understandably reluctant to re-implement their large scale production codes on multi-computers, when significant effort is required to port the codes across parallel systems as they evolve. In theory, any programming model can be implemented (by appropriate compilers) on any machine.
However, the inefficiency of certain models on certain architectures is so great as to render them impractical./24 What is needed in high performance computing is an architectural consensus and a simple model to summarize and abstract the machine interface to allow compilers to be ported more easily across systems, facilitating the portability of application programs. Ideally, the consensus interface should efficiently support existing programming models (even the multi-computers have created their own dusty decks), as well as more powerful models. Considerable research in the computer science community is currently devoted to these issues. It is unlikely that the diversity of programming models will decrease within the next five years, but it is likely that models will become more portable. ---------- 24/For example, it is not practical to implement data-parallel compilers on the Intel iPSC/860. How important will be access to data, data management? Besides needing high performance I/O, some fields of computational science need widely distributed access to data bases that are extremely large and constantly growing. The need is particularly felt in the earth and planetary sciences, although the requirements are also great in cellular biology, high energy physics, and other disciplines. Large scale storage hierarchies and the software to manage them must be developed, and means to distribute the data nationally and internationally are also required. Although this area of high performance computing has been relatively neglected in the past, these problems are now receiving significantly more attention. ROLES FOR GOVERNMENT AGENCIES What should government agencies (NSF, DoD, DoE) do to advance HPC beyond today's state of the art? What more might they be doing? The National Science Foundation plays several critical roles in advancing high performance computing. 
First, NSF's support of basic research and human resources in all areas of science and engineering (and particularly in mathematics, computer science and engineering) has been responsible for many of the advances in our ability to successfully tackle grand challenge problems. The Supercomputer Centers and the NSFnet have been essential to the growth of high performance computing as a basic paradigm in science and engineering. These efforts have been successful and should be continued. However, NSF has done too little in supporting computational engineering in the computer science community. For example, the NSF Supercomputer Centers were slow in providing experimental parallel computing facilities and are currently not responding adequately to integrating emerging technologies from the computer science community. Although this situation is gradually changing, the pace of the change should be accelerated. Many advances in high performance computer systems have been funded and encouraged by the Advanced Research Projects Agency (ARPA), the major supporter of large scale projects in computer science and engineering research and development in the US. ARPA has been charged by Congress to champion "dual use" technology; in so doing it is addressing many of the needs of computational science and engineering, even in the mathematical software arena, that are common to defense and commercial applications, and the science that underlies both. The Department of Energy has traditionally provided substantial support to computer science and engineering research within its national laboratories and at universities with strong impetus being provided by national defense requirements and resources. More recently, the focus has shifted to the high performance computing and communications needs of the unclassified Energy Research programs within DoE. 
The National Energy Research Supercomputer Center (NERSC) and the Energy Sciences Network (ESnet) provide production services similar to the NSF supercomputer centers and the NSFnet. Under the DoE HPCC component, "grand challenge" applications are supported at NERSC and also at two High Performance Computing Research Centers (HPCRCs) which offer selected access for grand challenge applications to leading edge parallel computing machines. DoE also sponsors a variety of graduate fellowships in the computational sciences. The computational science infrastructure and traditions of DoE remain sound; however, the ability of the Department to advance the state-of-the-art in high performance computing systems will be paced by its share of the funding available through the Federal High Performance Computing Initiative or through Defense conversion funds.

The Department of Commerce has not been a significant source of funds for computer system research and development since the very early days of the computer industry when the National Bureau of Standards built one of the first digital computers. NBS has been an important factor in supporting standards development, particularly for the Federal Information Processing Standards issued by GSA. The expanded role of the National Institute of Standards and Technology (as NBS is now called) under the Clinton administration may include this kind of activity, especially when industrial participation is a desired component.

NASA is embarked on a number of projects of potential importance, especially in the development of a shared data system for the global climate change program, which will generate massive amounts of data from the Earth Observing Satellite Program.

What is the role of the NSF computer science and applied mathematics research program? Is it relevant to the availability of HPC resources in a five year time span?
Investments in mathematics and computer science research provide the foundation for attacking today's problems in high performance computing and must continue. NSF continues to be the primary U.S. source of funds for mathematics and computer science research within the scope of what one or two investigators and several graduate assistants can do. Many fundamental advances in algorithms, programming languages, operating sy