This document has been archived.

Title: NSB 93-205 -- NSF Blue Ribbon Panel on High Performance Computing
Type: Report
NSF Org: CISE
Date: October 19, 1993
File: nsb93205

From Desktop To Teraflop: Exploiting the U.S. Lead in High Performance Computing

NSF Blue Ribbon Panel on High Performance Computing
August 1993

Lewis Branscomb (Chairman)
Theodore Belytschko
Peter Bridenbaugh
Teresa Chay
Jeff Dozier
Gary S. Grest
Edward F. Hayes
Barry Honig
Neal Lane (resigned from Panel July 1993)
William A. Lester, Jr.
Gregory J. McRae
James A. Sethian
Burton Smith
Mary Vernon

"It is easier to invent the future than to predict it" - Alan Kay

Dedication

This report is dedicated to one of the nation's most distinguished computer scientists, a builder of important academic institutions, and a devoted and effective public servant: Professor Nico Habermann. Dr. Habermann took responsibility for organizing this Panel's work and saw it through to completion, but passed away just a few days before it was presented to the National Science Board. The members of the Panel deeply feel the loss of his creativity, wisdom, and friendship.

EXECUTIVE SUMMARY

An Introductory Remark: Many reports are prepared for the National Science Board and the National Science Foundation that make an eloquent case for more resources for one discipline or another. This is not such a report. This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also giving a shot in the arm to the entire American economy as business firms learn to exploit these new capabilities. If our recommendations are implemented, the way much of science and engineering is practiced will be transformed.

The National Science Board can take pride in the Foundation's accomplishments in the decade since it implemented the recommendations of the Peter Lax Report on high performance computing (HPC). The Foundation's High Performance Computing Centers continue to play a central role in this successful strategy, creating an enthusiastic and demanding set of sophisticated users who have acquired the specialized computational skills required to use the fast-advancing but still immature high performance computing technology. Stimulated by this growing user community, the HPC industry finds itself in a state of excitement and transition. The very success of the NSF program, together with those of sister agencies, has given rise to a growing variety of new experimental computing environments, from massively parallel systems to networks of coupled workstations, that could, with the right research investments, produce entirely new levels of computing power, economy, and usability. The U.S. enjoys a substantial lead in computational science and in the emerging technology; it is urgent that the NSF capitalize on this lead, which offers not only scientific preeminence but also the industrial lead in a growing world market.

The vision of the rapid advances in both science and technology that the new generation of supercomputers could make possible has been shown to be realistic. This very success, measured in terms of new discoveries, the thousands of researchers and engineers who have gained experience in HPC, and the extraordinary technical progress in realizing new computing environments, creates its own challenges.
We invite the Board to consider four such challenges:

Challenge 1: How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists and engineers? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and implementation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and in computational science, will be required.

Challenge 2: How can NSF provide scalable access to a pyramid of computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems? What balance among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance should NSF anticipate and encourage?

Challenge 3: How can NSF encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines? This calls for expanded education and training, and for participation by state-based and other HPC institutions.

Challenge 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.? What role should NSF play within the scope of the nationally coordinated HPCC program? What relationships should NSF's activities in HPC have to the activities of other federal agencies?

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities. The cost estimates are only approximate, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified but are compatible with stated national plans, both in absolute amount and in their distribution.

RECOMMENDATIONS:

We have four sets of interdependent recommendations. The first implements a balanced pyramid of computing environments (see Figure A following this Summary). Each element in the pyramid supports the others; whatever resources are applied to the whole, the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies in the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for delivery of HPC capabilities, and itself consists of a pyramid (see Figure B following this Summary), of which the NSF Centers are an important part. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings, who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering.
At the next level are the NSF HPC Centers, which must continue to be providers of shared high-capability computing systems and to provide aggregations of specialized capability for all aspects of the use and advance of high performance computing. At the apex is the national teraflop-class facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: The National Science Board should take the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the dynamically evolving pyramid of high performance computing capability for all sectors of the nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also a rapid expansion of the capability of business and industry to use HPC profitably, and many operational uses of HPC in commercial and military activities.

VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the pyramid is the need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware. A reasonable goal would be the design, development, and realization of a national teraflop-class capability, subject to the successful development of software and computational tools for such a large machine (Recommendation B-1). NSF should initiate, through OSTP, an interagency plan to make this investment, anticipating multi-agency funding and usage.

Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines. These mid-sized machines are the underfunded element of the pyramid today -- less than 5% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed both for demanding science and engineering problems that do not require the very maximum in computing capacity, and for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with massively parallel processor architectures.

Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) in scientific and engineering workstations for its 20,000 principal investigators. Within 4 or 5 years, workstations delivering up to 400 megaflops and costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate.

Recommendation A-5: We recommend that the NSF expand its New Technologies program to support expanded testing of new parallel configurations for HPC applications. For example, the use of gigabit local area networks to link workstations may meet a significant segment of mid-range HPC science and engineering applications. A significant supplement to HPC applications research capacity can be had at minimal additional cost if such collections of workstations prove practical and efficient.
B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS

REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE

Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in the Directorate for Computer and Information Science and Engineering, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC capability. The Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work, and Recommendation A-3 provides the hardware support required for initial development of prototypes.

Recommendation B-2: A significant barrier to rapid progress in HPC application lies in the formulation of the computational strategy for solving a scientific or engineering problem. In response to Challenge 1, the NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models that make efficient use of the machines.

BALANCING THE PYRAMID OF HPC ACCESS

Recommendation B-3: We recommend that NSF set up a task force to develop a way to ameliorate the imbalance in the HPC "pyramid" -- the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations across the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources.

C. THE NSF HPC CENTERS

Recommendation C-1: The Centers should be retained and their missions should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs -- HPC Centers, Engineering Research Centers, and Science & Technology Centers devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics -- all of which are essential elements of the HPC effort needed for the next decade. Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than was possible in the early years of the HPC initiative. The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the Centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among Centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of very large benefits to the nation than at any time since the Lax Report; this is not the time to "sunset" a successful, changing venture, of which the Centers remain an important part.
Furthermore, we also recommend against re-competition of the four Centers at this time, favoring periodic performance evaluation and competition for some elements of their activities, both among Centers and, when appropriate, with other HPC centers such as those operated by states (see Recommendation D-1).

Recommendation C-3: The mission of the Centers is to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in the usability and economy of HPC, and to diffuse HPC capability throughout the technical community, including industry. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers. The following additional components of the Center missions should be affirmed:

* Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.

* Fostering interdisciplinary collaboration -- across sciences and between sciences and computational science and computer science -- as in the Grand Challenge programs.

* Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with: computer and computational scientists, disciplinary scientists exploiting HPC resources, the HPC industry, and business firms exploring expanded use of HPC.

* Training and education, from post-docs and faculty specialists, to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools and community colleges.

ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS

Recommendation C-4: The NSF should continue to monitor the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized. NSF should continue to provide HPC resources to the research community through allocation committees that competitively evaluate proposals for use of Center resources.

EDUCATION AND TRAINING

Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily by NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role.

D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH THE STATES

Recommendation D-1: We recommend that NSF urge OSTP to establish an advisory committee representing the states, HPC users, NSF Centers, computer manufacturers, and computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing.
Figure A: PYRAMID OF HIGH PERFORMANCE COMPUTING ENVIRONMENTS (apex to base): teraflop-class Center supercomputers; mid-range parallel processors and networked workstations; high performance workstations.

Figure B: PYRAMID OF HIGH PERFORMANCE COMPUTING INSTITUTIONS (apex to base): national teraflop facility; NSF HPC Centers and other agency and state Centers; departments, institutes, and laboratories; subject-specific computer science and computational science and engineering groups; individual investigators and small groups.

INTRODUCTION AND BACKGROUND

A revolution is underway in the practice of science and engineering, arising from advances in computational science and new models for scientific phenomena, and made possible by advances in computer science and technology. The importance of this revolution is not yet fully appreciated because only a limited fraction of the technical community has developed the skills required and has access to high performance computational resources. These skill and access barriers can be dramatically lowered, and if they are, a new level of creativity and progress in science and engineering may be realized that will be quite different from that known in the past. This report is about that opportunity for all of science and engineering; it is not about the needs of one or two specialized disciplines.

A little over a decade ago, the National Science Board convened a panel chaired by Prof. Peter Lax to explore what NSF should do to exploit the potential for science and industry of the rapid advances in high performance computing./1 The actions taken by the NSF, with the encouragement of the Board, to implement the "Large Scale Computing in Science and Engineering" Report of 1982 have helped foster a revolution in science and engineering research and practice, in academic institutions and to a lesser extent in industrial applications. At the time, centralized facilities were the only way to provide access to high performance computing, which compelled the Lax panel to recommend the establishment of NSF Supercomputer Centers interconnected by a high speed network. The new revolution is characterized both by advances in the power of supercomputers and by the diffusion throughout the nation of access to and experience with using high performance computing./2 This success has opened up a vast set of new research and applications problems amenable to solution through high levels of computational power and better computational tools.

----------

1/Report of the Panel on Large Scale Computing in Science and Engineering, Peter Lax, chairman, commissioned by the National Science Board in cooperation with the U.S. Department of Defense, Department of Energy, and the National Aeronautics and Space Administration, December 26, 1982.

2/With every new generation of computing machines, the capability associated with "high performance computing" changes. High performance computing (HPC) may be defined as "a computation and communications capability that allows individuals and groups to extend their ability to solve research, design, and modelling problems substantially beyond that available to them before." This definition recognizes that HPC is a relative and changing concept. For the PC user, a scientific workstation is high performance computing. For technical people with specialized skill in computational science and access to high performance facilities, a reasonable level for 1992-1993 might be 1 Gflop for a vector machine and 2 Gflops for an MPP system.
The key features of the new capabilities include:

* The power of the big, multiprocessing vector supercomputers, today's workhorse of supercomputing, has increased by a factor of 100 to 200 since the Lax Report./3

----------

3/As noted in Appendix C, the clock speed of a single vector processor has increased only by a factor of 5 to 6 since 1976, but a 16-way Cray C-90 with one additional vector pipe multiplies the effective speed by an estimated factor of a hundred or more.

* An exciting array of massively parallel processors (MPP) has appeared in the market, offering three possibilities: an acceleration in the rate of advance of peak processing power, an improvement in the ratio of performance to cost, and the option to grow the power of an installation incrementally as the need arises./4

----------

4/The promise (not yet realized) of massively parallel systems is a much higher degree of installed-capacity expandability with minimal disruption to the user's programming.

* Switched networks based on high speed digital communications are extending access to major computational facilities, permitting the dynamic redeployment of computing power to suit the users' needs, and improving connectivity among collaborating users.

* Technical progress in computer science and microelectronics has transformed yesterday's supercomputers into today's emerging desktop workstations. These workstations offer more flexible tradeoffs between ease of access and inherent computing power, and can be coupled to the largest supercomputers over a national network, used in locally networked clusters, or used as stand-alone processors.

* Advances in computational mathematics, algorithmic modeling, and software, along with new computer architectures, are solving some of the most intractable but important scientific, technical, and economic problems facing our society.

To address these changes, the National Science Board charged this panel with taking a fresh look at the current situation and the new directions that might be required. (See Appendix A for institutional identification of the panel membership and Appendix B for the historical background leading to the present study and the Charge to the Panel.)

To provide both direction and the potential to exploit these advances, a leadership role for the NSF continues to be required. The goal of this report is to suggest how NSF should evolve its role in high performance computing. Our belief that NSF can and should continue to exert influence in these fields is based in part on its past successes achieved through the NSF Program in High Performance Computing and Communications.

Achievements Since the Lax Report

In the past 10 years, the NSF Program in High Performance Computing and Communications has:

* Facilitated many new scientific discoveries and new industrial processes. In Appendix E of this report several panel members describe examples of those accomplishments and suggest their personal visions for what may be even more dramatic progress in the future.

* Supported fundamental work in computer science and engineering which has led to advances in architectures, tools, and algorithms for computational science.

* Initiated collaborations with many companies to help them realize the economic and technological benefits of high performance computing.
Caterpillar Inc. uses supercomputing to model diesel engines in an attempt to reduce emissions. Dow Chemical Company simulates and visualizes fluid flow in chemical processes to ensure complete mixing. USX has turned to supercomputing to improve the hot rolling process-control systems used in steel manufacturing. Solar Turbine, Inc. applies computational finite-element methods to the design of very complex mechanical systems.

* Opened up supercomputer access to a wide range of researchers and industrial scientists and engineers. This was one of the key recommendations of the Lax Report. The establishment of the four NSF Supercomputer Centers (in addition to NCAR) has been extraordinarily successful. By providing network access, through the NSFNET and Internet linkages, NSF has put these computing resources at the fingertips of scientists, engineers, mathematicians and other professionals all over the nation. Users seldom need to go personally to these Centers; in fact, the distribution of computational cycles by the four NSF Supercomputer Centers shows surprisingly little geographic bias (Figure 1 in Appendix D shows users widely distributed across the United States). This extension of computing power, away from dedicated, on-site facilities and towards a seamless national computing environment, has been instrumental in creating the conditions required for advances on a broad front in science, engineering, and the tools of computational science.

* Educated literally thousands of scientists, engineers and students, as well as a new generation of researchers who now use computational science on an equal footing with theory and experiment. At the time of the Lax Report, access to the most advanced facilities was restricted to a relatively small set of users. Furthermore, supercomputing was regarded by many scientists as either an inaccessible tool or an inelegantly brute-force approach to science. The NSF program successfully inoculated virtually all of the disciplines with the realization that HPC is both a powerful and a practical tool for many purposes. These NSF initiatives have not only pushed the technology and computational science ahead in sophistication and power, they have helped bring high performance computing to a large fraction of the technical community. There has been a 5-fold increase in the number of NSF-funded scientists using HPC and a 5-fold increase in the ratio of graduate students to faculty using HPC through the NSF Supercomputer Centers. (See Figure 2 of Appendix D.)

* Provided the HPC industry a committed, enthusiastic, and dedicated class of expert users who share their experience and ideas with vendors, accelerating the evolutionary improvement in the technology and its software. One of the problems in the migration of new technologies from experimental environments to production modes is the inherent risk in committing substantial resources to converting existing codes and developing software tools. The NSF Supercomputer Centers have provided a proving ground for these new technologies; various industrial players have entered into partnerships with the Centers aimed at accelerating this migration while maintaining solid and reliable underpinnings.

* Encouraged the Supercomputer Centers to leverage their relationship with HPC producers to reduce the cost of bringing innovation to the scientific and engineering communities.
In recognition of Center activity in improving early versions of hardware and software for high performance computing systems, the computer industry has provided equipment at favorable prices and important technical support. This has allowed researchers earlier and more useful access to HPC facilities than might have been the case under commercial terms.

* Joined into successful partnerships with other agencies to make coordinated contributions to the U.S. capability in HPC. A decade ago the United States enjoyed a world-wide commercial lead in vector systems. In part as the result of more recent development and procurement actions of the Advanced Research Projects Agency (ARPA), the Department of Energy, and the National Science Foundation, the U.S. now has the dominant lead in providing new Massively Parallel Processing (MPP) systems./5 As an example, the NSF has enabled NSF Supercomputer Center acquisitions of scalable parallel systems first developed under seed money provided by ARPA, and thus has been instrumental in leveraging ARPA projects into the mainstream./6 (Figure 3 of Appendix D shows data on the uptake of advanced computing by sector across the world.)

----------

5/Massively parallel computers are constructed from large numbers of separate processors linked by high speed communications providing access to each other and to shared I/O devices and/or computer memory. There are many different architectural forms of MPP machines, but they have in common economies of scale from the use of microprocessors produced in high volumes and the ability to combine them at many levels of aggregation. The challenge in using such machines is to formulate the problem so that it can be decomposed and run efficiently on most or all of the processors concurrently. Some scientific problems lend themselves to parallel computation much more easily than others, suggesting that the improved utility of MPP machines will not be realized in all fields of science at once.

6/Scalable parallel machines are those in which the number of processor nodes can be expanded over a wide range without substantial changes in either the shared hardware or the application interfaces of the operating system.

The Lax Report

All of these accomplishments have, in large part, arisen from the response by NSF to the recommendations of the 1982 Lax Report, "Large Scale Computing in Science and Engineering". These recommendations included:

* Increase access to regularly upgraded supercomputing facilities via high bandwidth networks.

* Increase research in computational mathematics, software, and algorithms.

* Train people in scientific computing.

* Invest in research on new supercomputer systems.

For several reasons, NSF's investment in computational research and training has been a startling success. First, there has been widespread acceptance of computational science as a vital component and tool in scientific and technological understanding. Second, there have been revolutionary advances in computing technology in the past decade. And third, the demonstrated ability to solve key critical problems has advanced the progress of mathematics, science and engineering in many important ways, and has created great demand for additional HPC resources.

The New Opportunities in Science and in Industry

As discussed in detail in the Appendix E essays, the prospects are for dramatic progress in science and engineering and for rapid adoption of computational science in industry.
The next major HPC revolution may well be in industry, which is still seriously under-utilizing HPC (with some exceptions such as aerospace, automotive, and microelectronics). The success of the chemical industry in designing and simulating pilot plants, of the aircraft industry in simulating wind tunnels and performing dynamic design evaluation, and of the electronics industry in designing integrated circuits and modelling the performance of computers and networks suggests the scale of the available opportunities. The most important requirements are (a) improving the usability and efficiency of high performance machines, and (b) training in HPC for people going into industry. The Supercomputer Centers have demonstrated that they can introduce the commercial sector to HPC at little cost, and with high potential benefits to the economy (productivity of industry and stimulation of markets for U.S. HPC vendors). Success in stimulating HPC usage in industry will also accelerate the need for HPC education and technology, thus exploiting the benefits of collaboration with universities and vendors. The Centers' role can be a catalytic one, but often rises to the level of a true collaborative partnership with industry, to the mutual advantage of the firm and the NSF Centers. As industrial uses of HPC grow, scientists, mathematicians, and engineers benefit from the falling costs and rising usability of the new equipment. In addition, the technological uses of HPC spur new and interesting problems in science. The following chart indicates the increasing importance of advanced computing in industry.

Cray Research Inc. supercomputer sales
_________________________________________________________________
Era             Percent to      Percent to      Percent to
                government      industry        universities
_________________________________________________________________
Early 1980s         70              25               5
Late 1980s          60              25              15
Today               40              40              20
_________________________________________________________________

The New Technology

Most HPC production work being done today uses big vector machines in single-processor (or loosely coupled multiprocessor) mode. Vectorizing Fortran compilers and other software tools are well tested, and many people have been trained in their use. These big shared-memory machines will continue to be the mainstay of high performance computing, at least for the next 5 years or so, and perhaps beyond if the promise of massively parallel supercomputing is delayed longer than many expect.

New desktop computers have made extraordinary gains in cost-performance (driven by competition in commodity microprocessor production). Justin Rattner of Intel estimated that in 1996 microprocessors with clock speeds of 200 MHz may power an 800 Mflops peak speed workstation./7 He and others from the industry predicted the convergence of the clock speeds of microprocessor chips and of the large vector machines such as the Cray C90, perhaps as soon as 1995. They held out the likelihood that in 1997 microprocessors may be available at 1 gigaflop; a desktop PC might be available with this speed for $10,000 or less. Mid-range workstations will also show great growth in capacity; today one can purchase a mid-range workstation with a clock speed of 200 MHz for an entry price of $40,000 to $50,000.

----------

7/The instruction execution speeds of scientific computers are generally reckoned in the number of floating point instructions that can be executed in one second. Thus a 1 Megaflop machine executes 1 million floating point instructions per second, a Gigaflop machine one billion per second, and a Teraflop machine 10^12 floating point instructions per second. Since different computer architectures may have quite different instruction sets, one "flop" may not be the same as another, either in application power or in the number of machine cycles required. To avoid such difficulties, those who want to compare machines of different architectures generally use a benchmark suite of test cases to measure overall performance on each machine.
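The performance figures quoted throughout this report are peak ratings; what a user actually obtains depends on the kernel being run, which is why the benchmark suites described in footnote 7 are used for comparisons. As a purely illustrative aside, not drawn from the report, the short Python sketch below shows the simplest form of such a measurement: time a standard kernel and divide a nominal floating point operation count by the elapsed time. NumPy, the function name estimate_mflops, and the 2n^3 operation count for an n x n matrix multiply are assumptions introduced here for illustration only.

    # Illustrative sketch only: estimate the sustained Mflops a machine delivers on
    # one kernel, in the spirit of the benchmark suites described in footnote 7.
    import time
    import numpy as np

    def estimate_mflops(n=1000, trials=3):
        a = np.random.rand(n, n)
        b = np.random.rand(n, n)
        best = float("inf")
        for _ in range(trials):
            start = time.perf_counter()
            c = a @ b                      # the timed kernel
            best = min(best, time.perf_counter() - start)
        flops = 2.0 * n ** 3               # nominal operation count for an n x n multiply
        return flops / best / 1.0e6        # sustained megaflops on this kernel

    if __name__ == "__main__":
        print(round(estimate_mflops()), "sustained Mflops on a dense matrix multiply")

Measured this way, the same machine can report quite different "flops" on different kernels, which is precisely the report's point that one flop may not equal another.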
Thus a technical transition is underway from a world in which uniprocessor supercomputers were distinguished from desktop machines by much faster cycle times, to a world in which cycle times converge and the highest levels of computing power will be delivered through parallelism, memory size and bandwidth, and I/O speed. The widespread availability of scientific workstations will accelerate the introduction of more scientists and engineers to high performance computing, resulting in a further acceleration of the need for higher performance machines. Early exploration of message-passing distributed operating systems gives promise of loosely coupled arrays of workstations being used to process large problems in the background and when the workstations are unused at night, as well as of coupling the workstations (on which problems are initially designed and tested) to the supercomputers located at remote facilities.

Of course, the faster microprocessors also make possible new MPP machines of ever increasing peak processing speed. MPP is catching on fast, as researchers with sufficient expertise (and diligence) in computational science are solving a growing number of applications that lend themselves to highly parallel architectures. In some cases those investigators are realizing a ratio of sustained to peak performance approaching that achieved by vector machines, with significant cost-performance advantages. Efficient use of MPP on the broad range of scientific and engineering problems is still beyond the reach of most investigators, however, because of the expertise and effort required. Thus the first speculative phase of MPP HPC is coming to an end, but its ultimate potential is still uncertain and largely unrealized.

Limiting progress in all three of these technologies is a set of architecture and software issues that are discussed below in Recommendations B. Principal among them is the evolution of a programming model that can allow portability of applications software across architectures. These technical issues are discussed at greater length in Appendix C.

FOUR CHALLENGES FOR NSF

High performance computing is changing very fast, and NSF policy must chase a moving target. For that reason, the strategy adopted must be agile and flexible in order to capitalize on past investments and adapt to the emerging opportunities. The Board and the Foundation face four central challenges, on which we will make specific recommendations for policy and action. These challenges are:

* Removing barriers to the rapid evolution of HPC capability

* Providing scalable access to all levels of HPC capability

* Finding the right incentives to promote access to all three levels of the computational power pyramid

* Creating NSF's intellectual and management leadership for the future of high performance computing in the U.S.
CHALLENGE NO. 1: Removing Barriers to the Rapid Evolution of HPC Capability

How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of High Performance Computing? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and exploitation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and computational science, will be required. Indeed, NSF's position as the leading provider of HPC capability to the nation's scientists and engineers will be strengthened if it commands a leadership role in technical advances in both areas, which will contribute to the nation's economic position as well as its position as a world leader in research.

Computer Science and Engineering. The first challenge is to accelerate the development of the technology underlying high performance computing. Among the largest barriers to effective use of the emerging HPC technologies are the current lack of parallel architectures from which it is easy to extract peak performance, of system software (operating systems, databases of massive size, compilers, and programming models) that takes advantage of these architectures and provides portability of end-user applications, of parallel algorithms, and of advances in visualization techniques to aid in the interpretation of results. The technical barriers to progress are discussed in Appendix C. What steps will most effectively reduce these barriers?

Computational Tools for Advancing Science and Engineering. Research in the development of computational models, the design of algorithmic techniques, and their accompanying mathematical and numerical analysis is required in order to ensure the continued evolution of efficient and accurate computational algorithms designed to make optimal use of these emerging technologies. In the past ten years, exciting developments in computer architectures, hardware and software have come in tandem with stunning breakthroughs in computational techniques, mathematical analysis, and scientific models. For example, the potential of parallel machines has been realized in part through new versions of numerical linear algebra routines and multi-grid techniques; rethinking and reformulating algorithms for computational physics within the domain of parallel machines has posed significant and challenging research questions (an illustrative sketch of one such kernel follows below). Advances in such areas as N-body solvers, fast special function techniques, wavelets, high resolution fluid solvers, adaptive mesh techniques, and approximation theory have generated highly sophisticated algorithms to handle complex problems. At the same time, important theoretical advances in the modelling of underlying physical and engineering problems have led to new, efficient and accurate discretization techniques. Indeed, in the evolution to scalable computing across a range of levels, designing appropriate numerical and computational techniques is of paramount importance. The challenge facing NSF is to weave together existing work in these areas, as well as fostering new bridges between pure, applied and computational techniques, engaging the talents of disciplinary scientists, engineers, and mathematicians.
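As a purely illustrative aside, not drawn from the report, the Python sketch below shows the kind of regular, data-parallel kernel referred to above: one Jacobi relaxation sweep for a discretized Laplace equation, in which every interior grid point is updated independently from its neighbors' old values. That independence is what lets such kernels vectorize on a C90-class machine or be partitioned across the processors of an MPP. NumPy and the function names are assumptions introduced for illustration only.

    # Illustrative sketch only: Jacobi relaxation for Laplace's equation on a square
    # grid. Each interior point becomes the average of its four neighbors; all points
    # can be updated at once, so the sweep parallelizes naturally.
    import numpy as np

    def jacobi_sweep(u):
        """One Jacobi update of the interior points; boundary values are held fixed."""
        v = u.copy()
        v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:])
        return v

    def relax(u, sweeps=500):
        for _ in range(sweeps):
            u = jacobi_sweep(u)
        return u

    if __name__ == "__main__":
        grid = np.zeros((64, 64))
        grid[0, :] = 1.0                   # fixed boundary condition on one edge
        print(relax(grid)[32, 32])         # an interior value after relaxation

The multi-grid methods mentioned above accelerate exactly this sort of sweep by combining it with coarser grids; the point of the sketch is only to show the structure that makes such kernels efficient on parallel hardware.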
CHALLENGE NO. 2: Providing Scalable Access to All Levels of HPC Capability

How can NSF provide scalable access to computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems?/8 What balance should NSF anticipate and encourage among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance?

----------

8/By scalable access we mean the ability to develop a problem on a workstation or intermediate-sized machine and migrate the problem with relative efficiency to larger machines as increased complexity requires it. Scalable access implies scalable architectures and software.

Flexible strategy. NSF must ensure that adequate additional computational capacity is available to a steadily growing user community to solve the next generation of more complex science and engineering problems. A flexible and responsive strategy is required, one that can support the large number of evolving options for HPC and can adapt to the outcomes of major current development efforts (for example, in MPP systems and in networked workstations).

A pyramid of computational capability. There will continue to be an available spectrum spanning almost five orders of magnitude of computer capabilities and prices./9 NSF, as a leader in the national effort in high performance computing, should support a "pyramid" of computing capability. At the apex of the pyramid are the highest-performance systems that affordable technology permits, established at national facilities. At the next level, every major research university should have access to one, or a few, intermediate-scale high-performance systems and/or aggregated workstation clusters./10 At the lowest level are workstations with visualization capabilities in sufficient numbers to support computational scientists and engineers.

----------

9/A Paragon machine of 300 Gigaflops peak performance would be five orders of magnitude faster than a 3 megaflop entry workstation. Effective performance in most science applications would, however, be perhaps a factor of ten lower.

10/As discussed in the recommendations, dedicated mid-range systems are required not only for science and engineering applications but also for research to improve HPC hardware and software, and for interactive usage. For science and engineering batch applications, networks of workstations will likely develop into an alternative.

Mid-range computational requirements. Over the next five years, the middle range of scientific computing and computational engineering will be handled by an amazing variety of moderately parallel systems. In some cases, these will be scaled-down versions of the highest performance systems available; in other cases, they will be systems targeted at the mid-range computing market. The architecture will vary from shared memory at one end of the spectrum to workstation networks at the other, depending on the types of parallelism in the local spectrum of applications. Loosely coupled networks of workstations will compete with mid-range systems for performance of production HPC work; an illustrative sketch of such task farming follows below. At the same time, autonomous mid-range systems are needed to support the development of next-generation architectures and software by computer science groups.
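As a purely illustrative aside, not drawn from the report, the Python sketch below shows the simplest form of the scalable, loosely coupled decomposition discussed in footnotes 5, 8, and 10: a study made of independent work units farmed out to however many processors are available, so that the same program runs on a single workstation or a larger parallel machine by changing only the worker count. The multiprocessing module and the function names are assumptions introduced for illustration; a true network of workstations would distribute the same work units by message passing over the network rather than through a local process pool.

    # Illustrative sketch only: farm independent work units over the available
    # processors. With workers=1 this runs on a lone workstation; raising the count
    # uses a larger shared-memory machine, and a workstation network would replace
    # the pool with message passing.
    from multiprocessing import Pool

    def simulate_case(x):
        """Stand-in for one independent piece of a larger parameter study."""
        return sum((x + i) ** 0.5 for i in range(100000))

    def run_study(cases, workers):
        with Pool(processes=workers) as pool:
            return pool.map(simulate_case, cases)

    if __name__ == "__main__":
        results = run_study(list(range(32)), workers=4)
        print(len(results), "cases completed")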
The panel perceives that there are imbalances in access to the pyramid of HPC resources (see table below). The disciplinary NSF program offices have not been uniformly effective in responding to the need for a desktop environment for their supported researchers, and there is serious under-investment in the mid-sized machines. The distribution of investment tends to be bimodal, to the disadvantage of mid-range systems. The incentive structures internal to the Foundation do not address this distortion. NSF's HPCC coordinating mechanism needs to address it in a more direct manner.

Computational Infrastructure at NSF (FY92 $, M)
________________________________________________________________
                       Other NSF        ASC
----------------------------------------------------------------
Workstations              20.1           3.2
Small Parallel             2.1           0.5
Large Parallel             9.4           3.2
Mainframe                  9.1          16.3
----------------------------------------------------------------
Total                     40.8          23.2
----------------------------------------------------------------

CHALLENGE NO. 3: The Right Incentives to Promote Access to All Three Levels of the Computational Institution Pyramid

The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines.

Lax Report incentives. At the time of the Lax Report, relatively few people were interested in HPC; even fewer had access to supercomputers. Some users were fortunate to have contacts with someone at one of a few select government laboratories where computer resources were available. Most, however, were less fortunate and were forced to carry out their research on small departmental machines. This severely limited the research that could be carried out to problems that would "fit" into available resources. NSF addressed this problem by concentrating supercomputer resources in Centers; by this means those in the academic community most prepared and motivated were provided with access to machine cycles.

Need for expanded scope of access. Now that these resources are available on a peer review basis to everyone no matter where they work, it is clear the research community cannot accept a return to the previous mode of operation. The high performance computing community has grown to depend on NSF to make the necessary resources available to continually upgrade the Supercomputer Centers in support of their computational science and engineering applications. NSF needs to broaden the base of participation in HPC through NSF program offices as well as through the Supercomputer Centers. There is no question that HPC has broken out of its original narrow group of privileged HPC specialists. The SuperQuest competition for high school students already demonstrates how quickly young people can master the effective use of HPC facilities. Other agency, state, and private HPC centers are springing up, making major contributions not only to science but to K-12 education and to regional economies. NSF's policies on expanding access and training must take advantage of the leverage these centers can provide.

Allocation of HPC resources. There remains the question of the best way to allocate HPC resources. Should Supercomputer Centers continue to be funded to allocate HPC cycles competitively, or should NSF depend on the "market" of funded investigators for allocation of HPC resources? This question gets at two other issues: (a) the future role of the Centers and (b) the best means for ensuring adequate funding of workstations and other means of HPC access throughout the NSF.
The Centers have peer review committees which allocate HPC resources on the basis of competitive project selection. The Panel believes these allocations are fairly made and reflect solid professional evaluation of computational merit. The only remaining issue is whether there continues to be a need for protected funding for HPC access in NSF, including access to shared Supercomputer Center facilities. We believe strongly that there is such a need. The panel does have suggestions for broadening the support for the remainder of the HPC pyramid; these are articulated in the recommendations below.

Education and training. A major requirement for education and training continues to exist. Even though most disciplines have been inoculated with successful uses of HPC (see the Appendix E essays), and even though graduate student and postdoctoral use of HPC resources is rising faster than faculty usage, only a minority of scientists have the training to allow them to overcome the initial barrier to proficiency, especially in the use of MPP machines, which require a high level of computational sophistication for most problems.

CHALLENGE NO. 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.? What relationships should NSF's activities in HPC have to the activities of other federal agencies?

NSF is a major player. What role should NSF play within the scope of the nationally coordinated HPCC program and budget, as indicated in the following chart?

HPCC Agency Budgets
_________________________________________________________________
Agency            FY92 Funding ($, M)
-----------------------------------------------------------------
ARPA                    232.2
NSF                     200.9
DOE                      92.3
NASA                     71.2
HHS/NIH                  41.3
DOC/NOAA                  9.8
EPA                       5.0
DOC/NIST                  2.1
_________________________________________________________________

NSF leadership in HPCC. The voice of HPCC users needs to be heard more effectively in the national program; NSF has the best contact with this community. NSF has played, and continues to play, a leadership role in the NREN program and the evolution of the Internet. Its initiative in creating the "meta-center" concept establishes an NSF role in the sharing and coordination of resources (not only in NSF but in other cooperating agencies as well), and the concept can be usefully extended to cooperating facilities at the state level and in private firms. The question is, does the current structure of CISE, the HPCC coordination office, the Supercomputer Centers, and the science and engineering directorates constitute the most favorable arrangement for that leadership? The panel does not attempt to suggest the best ways to manage the relationships among these important functions, but asks the NSF leadership to assure the level of attention and coordination required to implement the broad goals of this report.

Networking. A further barrier is the need for network access with adequate bandwidth. For wide area networks, this is addressed in the NSF HPCC NREN strategy. In the future, NSF will focus its network subsidies on HPC applications and their supporting infrastructure, while support for basic Internet connectivity shifts to the research and education institutions./11

----------

11/NREN is the National Research and Education Network, envisioned in the High Performance Computing Act of 1991. NREN is not a network so much as it is a program of activities, including the evolution of the Internet to serve the needs of HPC as well as other information activities.
RECOMMENDATIONS

We have four sets of interdependent recommendations for the National Science Board and the Foundation. The first implements a balanced pyramid of computing environments; each element supports the others, and as priorities are applied the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies of the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for the delivery of HPC capabilities, and itself consists of a pyramid. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. Continuing upward are the NSF HPC Centers, which must continue to play a very important role, both as providers of the major resources of high capability computing systems and as aggregations of specialized capability for all aspects of the use and advance of high performance computing. At the apex is the national teraflop facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities; in total, however, these investments do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified, but are compatible with stated national plans, both in absolute amount and in their distribution.

A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: We strongly recommend that NSF build on its success in helping the U.S. achieve its preeminent world position in high performance computing by taking the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the rapidly evolving pyramid of high performance computing for all sectors of the nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies.

High performance computing is essential to the leading edge of U.S. research and development. It will provide the intelligence and power that justifies the breadth of connectivity and access promised by the NREN and the National Information Infrastructure. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also a rapid expansion of the capability of business and industry to use HPC profitably and the many operational uses of HPC in commercial and military activities. The panel is concerned that if the government fails to implement the planned HPCC investments to support the National Information Infrastructure, the momentum of the U.S. industry, which blossomed in the first phase of the national effort, will be lost. Supercomputers are only a $2 billion industry, but an industry that provides critical tools for innovation across all areas of U.S. competitiveness, including pharmaceuticals, oil, aerospace, automotive, and others.
The administration's planned new investment of $250 million in HPCC is fully justified. Japanese competitors could easily close the gap in the HPC sectors in which the U.S. enjoys its lead; they are continuing to invest and could capture much of the market the U.S. government has been helping to create.

VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the HPC pyramid is a need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware. A reasonable goal for the next 2-3 years would be the design, development, and realization of a national teraflop-class capability, subject to the effective implementation of Recommendation B-1 and the development of effective software and computational tools for such a large machine./12 Such a capability would provide a significant stimulus to commercial development of a prototype high-end commercial HPC system of the future. We believe the importance of NSF's mission in HPC justifies NSF initiating an interagency plan to make this investment, and further that NSF should propose to operate the facility in support of national goals in science and technology. For budgetary and interagency collaboration reasons, OSTP should invoke a FCCSET project to establish such a capability on a government-wide basis with multi-agency funding and usage.

If development begins in 1995 or 1996, a reasonable guess at the cost of a teraflop machine is $50/megaflop for delivery in 1997 to 1998. If so, $50 million might buy one such machine per year./13 Development cost would be substantial, perhaps in excess of the production cost of one machine; although it is not clear to what extent government support would be required, this is a further reason to suggest a multi-agency program./14 Support costs would also be additional, but one can assume that one or more of the NSF Supercomputer Centers could host such a facility with something like the current staff.

----------

12/Some panel members have reservations about the urgency of this recommendation, are pessimistic about the likelihood of realizing the effective performance in applications, or are concerned about the possible opportunity cost to NSF of such a large project. The majority notes that the recommendation is intended to drive solutions to those architectural and software problems. Intel's Paragon machine is on the market today with 0.3 Teraflops peak speed, but without the support to deliver that speed in most applications. The panel also recommends a multi-agency federal effort. NSF's share of cost and role in managing such a project are left to a proposed FCCSET review.

13/The cost estimates in this report cannot be much more than informed guesses. We have assumed a cost of $50/megaflop for purchase of a one-teraflop machine in 1997 or 1998. We suspect that this cost might be reached earlier, say in 1995 or 1996, in a mid-range machine, because a tightly coupled massively parallel machine may have costs rising more than linearly with the number of processors, overcoming the scale economies that might otherwise make the cost rise less than linearly.
The cost estimates in Recommendations A-2 through A-4 are intended to indicate that the scale of investment we recommend is not incompatible with the published plans of the administration for investment in HPCC in the next 5 years, and further that roughly equal levels of incremental expenditures in the three levels of the HPC pyramid could produce the balance among these levels that we recommend. 14/The Departments of Energy and Defense and NASA might share a major portion of the development cost and might also acquire such machines in the future as well. Such a nationally shared machine, or machines, must be open to competitive merit-evaluated proposals for science and engineering computation, although it could share this mission of responding to the community's research priorities with mission-directed work of the sponsoring agencies. The investment is justified by (a) the existence of problems whose solution awaits a teraflop machine, (b) the importance of driving the HPC industry's innovation rate, and (c) the need for early and concrete experience with the scalability of software environments to higher speeds and larger arrays of processors, since software development time is the limiting factor in hardware acceptance in the market. Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines. This will bring a rapid expansion in access to very robust capability, reducing pressure on the Supercomputer Centers' largest facilities, and allowing the variety of vendor solutions to be exercised extensively. If the new MPP architectures prove robust, usable, and scalable, these institutions will be able to grow the capacity of such systems in proportion to need and with whatever incremental resources are available. This capability is also needed to provide testbeds for computer and computational science research and testing. These mid-sized machines are the underfunded element today -- less than 5% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed for both demanding science and engineering problems that do not require the very maximum in computing capacity, and importantly for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with MPP architectures./15 ---------- 15/The development of prototypes of architectures and operating systems for parallel computation requires access to a machine whose hardware and software can be experimentally modified. This research often cannot be done on machines dedicated to full-time production. Engineering is also a key candidate for their use. There are 1050 University-Industry Research Centers in the U.S. Those UIRCs that are properly equipped with computational facilities can increase the coupling with industrial computation, adding greatly to what the NSF HPC Supercomputer Centers are doing. Many engineering applications, such as robotics research, require "real time" interactive computation which is incompatible with the batch environment on the highest performance machines. If we assume a cost in three or four years of $50/megaflop for mid-sized MPP machines, an annual expenditure of $10 million would fund the annual acquisition of one hundred 2 Gigaflop (peak) computers. Support costs for users would be additional.
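For readers who wish to check these figures, the arithmetic behind the two acquisition estimates, written out under the report's working assumption of $50 per megaflop (development and support costs excluded), is simply:

\[ \text{Teraflop machine (Recommendation A-2): } 10^{6}\ \text{megaflops} \times \$50/\text{megaflop} = \$50\ \text{million per machine.} \]

\[ \text{Mid-range machines (Recommendation A-3): } 100 \times \bigl(2000\ \text{megaflops} \times \$50/\text{megaflop}\bigr) = 100 \times \$100{,}000 = \$10\ \text{million per year.} \]

Both results are, of course, only as reliable as the $50/megaflop assumption discussed in footnote 13.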
Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) in scientific and engineering workstations for its 20,000 principal investigators. Many researchers strongly prefer the new high performance workstations that are under their control and find them adequate to meet many of their initial needs. Those without access to the new workstations may apply for remote access to a supercomputer in a Center, but often they do not need all the I/O and other capabilities of the large shared facilities. NSF needs a strategy to off-load work not requiring the highest level machines in the Centers. The justification is not economy of scale, but economy of talent and time. When the Lax report was written, a 160 Mflop (peak) Cray 1 was a high performance supercomputer. Within 4 or 5 years, workstations delivering up to 400 megaflops and costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate. However, once visualization of computational output becomes routinely required, such workstations will be needed ubiquitously. With the rapid pace of improvement, the useful lifetimes of workstations are decreasing; they often cannot cope with the latest software. Researchers face escalating costs to upgrade their computers. NSF supports some 20,000 principal investigators. Equipping an additional 10 percent of this number each year (2,000 machines) at $20,000 each requires an incremental $20 million. Recommendation B-3 addresses how this investment might be managed. Recommendation A-5: We recommend that NSF expand its New Technologies program to support expanded testing of the practicality of new parallel configurations for HPC applications./16 For example, networks of workstations may meet a significant part of midrange HPC science and engineering applications. As progress is made in the development of this and other technologies, experimental use of the new configurations should be encouraged. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient. There have already been sufficient experiments with the use of distributed file systems and loosely coupled workstations to encourage the belief that many compute-intensive problems are amenable to this approach. For those problems that do not suffer from the latency inherent in this approach, the incremental costs can be very low indeed, for the problems run in the background at times when the workstations are otherwise unengaged (a schematic sketch of this mode of operation appears after the footnotes below). There are those who strongly believe that in combination with object-oriented programming this approach can create a revolution in software and algorithm sharing as well as more economical machine cycles./17 ---------- 16/Today NSF CISE has a "new technologies" program that co-funds with disciplinary program offices perhaps 50 projects/yr. This program is in the division that funds the Centers, but is focused on projects which can ultimately benefit all users of parallel systems. This program funds perhaps 15 methods and tools projects annually, in addition to those co-funded with science programs. 17/MITRE Corporation, among others, is pursuing this vision.
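To make Recommendation A-5's notion of scavenging otherwise-idle cycles concrete, the fragment below is a minimal sketch of a coordinating workstation farming independent runs of a compute-intensive code out to departmental machines overnight. It is not drawn from any particular system: the host names and the program name "simulate" are hypothetical placeholders, and remote execution via rsh merely stands in for whatever mechanism a site would actually use; the security, control, and accounting questions such a scheme raises are ignored entirely here.

    /* Illustrative sketch only: dispatch independent runs of a compute-intensive
     * job to idle workstations via remote shell execution.  Host names and the
     * program name "simulate" are hypothetical placeholders. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Departmental workstations assumed to be idle overnight (hypothetical). */
        const char *hosts[] = { "ws01", "ws02", "ws03", "ws04" };
        const int nhosts = sizeof(hosts) / sizeof(hosts[0]);
        const int nruns = 16;          /* independent runs, e.g. a parameter sweep */
        char command[256];
        int run;

        for (run = 0; run < nruns; run++) {
            /* Round-robin assignment; "nice" keeps the job at low priority so an
             * owner returning to the workstation is not inconvenienced. */
            sprintf(command, "rsh %s 'nice simulate -run %d > run%d.out' &",
                    hosts[run % nhosts], run, run);
            if (system(command) != 0)
                fprintf(stderr, "warning: could not dispatch run %d\n", run);
        }
        return 0;
    }

Even so crude a scheme recovers machine cycles that would otherwise go unused, which is the economic argument behind the recommendation; the open questions are the ones noted above and in Appendix C: latency, security, control, and accounting across heterogeneous machines.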
B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in CISE, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC. The Supercomputer Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work. Recommendation A-3 provides the hardware support required for initial development of prototypes. There is consensus that the absence of sufficient funding for systems and algorithms work which is not mission-oriented is the primary barrier to lower cost, more widely accessible, and more usable massively parallel systems. This work, including bringing the most promising ideas to prototype stage for effective transfer to the HPC industry, would address the most significant barriers to the ultimate penetration of parallel architectures in workstations. Advances on the horizon that could be accelerated include more advanced network interface architectures and operating systems technologies to provide low overhead communications in collections of workstations, and advances in algorithms and software for distributed databases of massive size. Computer science has made, and continues to make, important contributions to both the hardware and the software technology of parallel machines, and has effectively transferred these ideas to the industry. Two problems impede the full contribution of computer science to rapid advance in MPP development: grant sizes in the discipline are typically too small to allow enough concentrated effort to build and test prototypes, and too few computer science departments have access to a mid-sized machine on which systems development can be done. The Board should ask for a proposal from CISE to mobilize the best computer science and computational mathematics talent to solve these problems, in the areas of improved operating systems, architectures, compilers, and algorithms for existing systems as well as research in next-generation systems. We recommend establishing a number of major projects, with higher levels of annual funding than are typical in Computer Science, and assured duration of up to five years, for a total annual incremental investment of $10 million. We recommend that this challenge fund be managed by CISE, and be accessible to all disciplinary program offices that wish to forward team proposals for add-on funding in response to specific proposals from the community. Recommendation B-2: A significant barrier to rapid progress in the application of HPC lies in formulating a computational strategy to solve a problem. In response to Challenge 1 above, NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines. Without such work in both theoretical and applied areas of numerical analysis, applied mathematics, and computational algorithms, the full benefit of advances in architecture and systems software will not be realized. In particular, significantly increased funding of collaborative and individual work on state-of-the-art methodology is warranted, and is crucial to the success of high performance computing.
Some of this can be done through the individual directorates with funds supplemented by HPCC funds; the Grand Challenge Applications Group awards are a good first step. Recommendation B-3: We recommend NSF set up an agency-wide task force to develop a way to ameliorate the imbalance in the HPC pyramid - the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources. Some directorates have "infrastructure" programs; others do not. Still others fund workstations until they reach the "target" set by the HPCC coordination office. We believe that individual disciplinary program managers should consider it their responsibility to fund purchase of workstations out of their equipment funds. But we recognize that these funds need to be supplemented by HPCC funds. CISE has an office which co-funds interdisciplinary applications of HPC workstations. We believe this office may require more budgetary authority than it now enjoys, to ensure the proper balance of program and CISE budgets for workstations. Scientific value must be a primary criterion for resource allocation. It would be unwise to support mediocre projects just because they require supercomputers. The strategy of application approval will depend very heavily on funding scenarios. If sufficient HPCC funds are made available to individual programs for computer usage, then the Supercomputer Centers should be reserved for applications that cannot be carried out elsewhere, with particular priority to novel applications. If individual science programs continue to be underfunded relative to large centers, the Supercomputer Centers may be forced into a role of supporting less novel or demanding computing applications. Under these circumstances, less stringent funding criteria should be applied. C. THE NSF SUPERCOMPUTER CENTERS Recommendation C-1: The Supercomputer Centers should be retained and their missions, as they have evolved since the Lax Report, should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs - HPC Centers, Engineering Research Centers (ERC) and Science and Technology Centers (STC) devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics - all of which are essential elements of the HPC effort needed for the next decade. NSF plays a primary but not necessarily dominant role in each of them (see Figure 4 of Appendix D). Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Supercomputer Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than in the early years of the HPC initiative. We anticipate an evolution, which has already begun, in which the NSF Supercomputer Centers increasingly broaden their base of support, and NSF expands its support in collaboration with other institutional settings for HPC. Center-like groups, especially NSF S&T Centers, are an important instrument for focusing on solving barriers to HPC, although they do not provide HPC resources to users. 
An excellent example is the multi-institutional Center for Research in Parallel Computation at Rice University, which is supported at about $4M/yr, with additional support from ARPA. Another example is the Center for Computer Graphics and Scientific Visualization, an S&T Center award to University of Utah with participation of University of N. Carolina, Brown, Caltech, and Cornell. Still another example is the Discrete Mathematics and Computational Science Center (DIMACS) at Rutgers and Princeton. These centers fill important roles today, and the ERC and S&T Center structures provide a necessary addition to the Supercomputer Centers for institutionalizing the programmatic work required for HPC. The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the Supercomputer Centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among Supercomputer Centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA is a welcome measure as well. Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" a successful, changing venture, of which Supercomputer Centers remain an essential part. Furthermore, we also recommend against an open recompetition of the four Supercomputer Centers at this time, favoring instead periodic performance evaluation and competition for some elements of their activities, both among the Centers themselves and when appropriate with other HPC Centers such as those operated by states (see Recommendation D-1). Continuing evaluation of each Center's performance, as well as the performance of the overall program, is, of course, an essential part of good management of the Supercomputer Centers program. Such evaluations must take place on a regular basis in order to develop a sound basis for adjustments in support levels, to provide incentives for quality performance and to recognize the need to encourage other institutions such as S&T Centers that are attacking HPC barriers and state-based centers with attractive programs in education and training. While recompetition of existing Supercomputer Centers does not appear to be appropriate at this time, if regular review of the Centers and the Centers program identifies shortcomings in a Center or the total program, a recompetition of that element of the program should be initiated. Supercomputer Centers are highly leveraged through investments by industry, vendors, and states. This diversification of support impedes unilateral action by NSF, since the Centers' other sponsors must be consulted before decisions important to the Center are made./18 It also suggests that the issue of recompetition may, in future, become moot as the formal designation "NSF HPC Center" erodes in significance. There is a form of recompetition already in place; the Centers compete for support for new machine acquisition and for roles in multi-center projects. ---------- 18/Each year each center gets a cooperative agreement level which is negotiated. Each center gets about $ 14M; about 15% is flexible. NSF centers have also received help from ARPA to buy new MPP machines. 
Most of the Centers have important outside sources of support, which imply obligations NSF must respect, such as the Cornell Center relationship with IBM and the San Diego Center's activities with the State of California. Recommendation C-3: The NSF should continue to provide funding to support the Supercomputer Centers' HPC capacity. Any distortion in the uses of the computing pyramid that results from this dedicated funding is best offset by the recommendations we make for other elements in the pyramid. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers, but it is a means to a broader mission: to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in the usability and economy of HPC, and to diffuse HPC capability throughout the technical community, including industry. The following additional components of the Center missions should be affirmed: * Supporting computational science, by research and demonstration in the solution of significant science and engineering problems. * Fostering interdisciplinary collaboration - across sciences and between sciences and computational science and computer science - as in the Grand Challenge programs. * Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with four groups: computer and computational scientists, disciplinary scientists exploiting HPC resources, the HPC industry, and business firms exploring expanded use of HPC. * Training and education, from post-docs and faculty specialists to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools, community colleges, colleges, and universities. The role of a Supercomputer Center should, therefore, continue to be primarily one of a facilitator, pursuing the goals just listed by making the hardware and human resources available to computational scientists, who themselves are intellectual leaders. In this way the Centers will participate in leadership but will not necessarily be its primary source. With certain notable exceptions, intellectual leadership in computational science has come from scientists around the country who have at times used the resources available at the Centers. This situation is unlikely to change, nor should it change. It would be unrealistic to place this type of demand on the Supercomputer Centers, and it would certainly not be in the successful tradition of American science. The Supercomputer Centers facilitate interdisciplinary collaborations because they support users from a variety of disciplines, and are aware of their particular strengths. The Centers have been deeply involved in nucleating Grand Challenge teams, and particularly in reaching out to bring computer scientists together with computational scientists. Visualization, for example, is no longer just in the realm of the computational scientist; experimentalists use the same tools for designing and simulating experiments in advance of actual data generation. This common ground should not be separated from the enabling technologies which have made this work possible. Rather, high performance computing and the new science it has enabled have seeded advances that would not have happened any other way.
ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS Recommendation C-4: The NSF should review the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized, when that research also requires access to the facilities of the Centers, or perhaps access to other elements of the HPC pyramid that will be established pursuant to our recommendations. However we believe the NSF should continue to provide HPC resources to the research community through allocation committees that evaluate competitively proposals for use of Center resources./19 ---------- 19/For NSF funded investigators, allocation committees at Supercomputer Centers should evaluate requests for HPC resources only on the appropriateness of the computational plans, choice of machine, and amount of resource requested. Centers should rely on disciplinary program office determinations of scientific merit, based on their peer review. In this way a two level review of the merits of the science is avoided. A further simplification might be for the application for computer time at the Centers to be included in the original disciplinary proposal, and forwarded to the Centers when the proposal is approved. For non-NSF funded investigators an alternative form of peer review of the research is required. At the present time, the allocation of resources in the Supercomputer Centers for all users is handled by requiring principal investigators to submit annual proposals to a specified Center for access to specific equipment. The NSF should not require a duplicate peer review of the substantive scientific merit of the proposed scientific investigation, first by disciplinary program offices, and then again by the Center Allocation Committees. For this reason, it is proposed that the allocation of supercomputer time be combined with the allocation of research funds to the investigator. Although this panel is not in a position to give administrative details of such a procedure, it is suggested that requests for computer time be attached to the original regular NSF proposal, with (a) experts in computational science included among peer reviewers, or, (b) that portion of the proposal be reviewed in parallel by a peer review established by the Centers. In either case only one set of peer reviewers should evaluate scientific merits, and only one set of reviewers should determine that the research task is being formulated properly for use of HPC resources. Second, we recommend that the Centers collectively establish the review and allocation mechanism, so that while investigators might express a preference for a particular computer or Center for their work, all Centers facilities would be in the pool from which each investigator receives allocations. We recognize, of course, that the specific allocation of machine time often cannot be made at the time of the original proposal for NSF research support, since in some cases the work has not progressed to the point that the mathematical approach, algorithms, etc., are available for Center experts to evaluate and translate into estimates of machine time. Nor is the demand function for facilities known at that time. 
EDUCATION AND TRAINING Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily on NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role. The serious difficulties associated with the use of parallel computers pose a new training burden. In the past it was expected that individual investigators would port their code to new computers and this could usually be done with limited effort. This is no longer the case. The Supercomputer Centers should see their future mission as providing direct aid to the rewriting of code for parallel processors. Computational science is proving to be an effective way to generate new knowledge. As part of its basic mission, NSF needs to teach scientists, engineers, mathematicians, and even computer scientists how high performance computing can be used to produce new scientific results. The role of the Supercomputer Centers is critical to such a mission since the Centers have expertise on existing hardware and software systems, modelling, and algorithms, as well as knowledge of useful high performance computing application packages, awareness of trends in high performance computing and requisite staff. D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH STATES Recommendation D-1: We recommend that the National Science Board urge OSTP to establish an advisory committee representing the states, HPC users, NSF Supercomputer Centers, computer manufacturers, computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing. Congress required advisory committee reporting to the PMES, but the committee has not yet been implemented. The committee we propose would provide policy level advice and coordination with the states. The main components of HPCC are networking and HPC, although the networks seem to be receiving priority attention. The Panel believes it is important to continue to emphasize the importance of ensuring adequate compute power in the network to support the National Information Infrastructure applications. We also believe that as participation in HPC continues to broaden through initiatives by the states and by industry, the NSF (and other federal agencies) should encourage their collaboration in the national effort. The Coalition of Academic Supercomputer Centers (CASC) was founded in 1989 to provide a forum to encourage support for high performance computing and networking. Unlike the FCCSET task force, CASC is dependent on others to bring the money to support high performance computing - usually their own State government or university. The result is a valuable discussion group for exchanging information and developing a common agenda and CASC should be encouraged. However, CASC is not a substitute for a more formal federal advisory body. 
This recommendation is consistent with a recent Carnegie Commission Report entitled "Science, Technology and the States in America's Third Century," which recommends the creation of a system of joint advisory and consultative bodies to foster federal-state exchanges and to create a partnership in policy development, especially for construction of national information infrastructure and provision of services based on it. Because of the importance of high performance computing to future economic development, we need a new balance of cooperation between federal and state government in this area, as in a number of others. Appendix A MEMBERSHIP OF THE BLUE RIBBON PANEL ON HIGH PERFORMANCE COMPUTING Lewis Branscomb, John F. Kennedy School of Government, Harvard University (Chairman) Lewis Branscomb is a physicist, formerly chairman of the National Science Board (1980-1984) and Chief Scientist of IBM Corp. (1972-1986). Theodore Belytschko, Department of Civil Engineering, Northwestern University Ted Belytschko's research interests are in computational mechanics, particularly in the modeling of nonlinear problems, such as failure, crashworthiness, and manufacturing processes. Peter R. Bridenbaugh, Executive Vice President - Science, Engineering, Environment, Safety & Health, Aluminum Company of America Peter Bridenbaugh serves on a number of university advisory boards, and is a member of the National Academy of Engineering's Industrial Ecology Committee. He also serves on the NSF Task Force 1994 Budget Committee and is a Fellow of ASM International. Teresa Chay, Professor, Department of Biological Sciences, University of Pittsburgh Teresa Chay's research interests are in modelling biological phenomena such as nonlinear dynamics and chaos theory in excitable cells, cardiac arrhythmias by bifurcation analysis, mathematical modeling for electrical activity of insulin secreting pancreatic B-cells and agonist-induced cytosolic calcium oscillations, and elucidation of the kinetic properties of ion channels. Jeff Dozier, Center for Remote Sensing, University of California, Santa Barbara Jeff Dozier is a hydrologist and remote sensing specialist. From 1990 to 1992 he was Senior Project Scientist on NASA's Earth Observing System. Gary Grest, Exxon Corporate Research Science Laboratory Gary Grest's research interests are in the areas of computational physics and materials science, recently emphasizing the modeling of the properties of polymers and complex fluids. Edward Hayes, Vice President for Research, Ohio State University Edward F. Hayes is a computational chemist, formerly Controller and Division Director for Chemistry at NSF. Barry Honig, Department of Biochemistry and Molecular Biology, Columbia University Barry Honig's research interests are in theoretical and computational studies of biological macromolecules. He is an associate editor of the Journal of Molecular Biology and is a former president of the Biophysical Society (1990-1991). Neal Lane, Provost, Rice University (resigned from the Panel July 1993) William A. Lester, Jr., Professor and Associate Dean, Department of Chemistry, University of California, Berkeley William A. Lester, Jr., is a theoretical chemist, formerly Director of the National Resource for Computation in Chemistry (1978-81) and Chairman of the NSF Joint Advisory Committees for Advanced Scientific Computing and Networking and Communications Research and Infrastructure (1987).
Gregory McRae, Professor, Department of Chemical Engineering, MIT James Sethian, Professor, University of California at Berkeley James Sethian is an applied mathematician in the Mathematics Department at the University of California at Berkeley and in the Physics Division of the Lawrence Berkeley Laboratory. Burton Smith, Tera Computer Company Burton Smith is Chairman and Chief Scientist of Tera Computer Company, a manufacturer of high performance computer systems. Mary Vernon, Department of Computer Science, University of Wisconsin Mary Vernon is a computer scientist who has received the NSF Presidential Young Investigator Award and the NSF Faculty Award for Women Scientists and Engineers in recognition of her research in parallel computer architectures and their performance. Appendix B NSF AND HIGH PERFORMANCE COMPUTING: HISTORY AND ORIGIN OF THIS STUDY Introduction This report of the Blue Ribbon Panel on High Performance Computing follows a number of separate, but related, activities in this area by the NSF, the computational science community, and the Federal Government in general acting in concert through the Federal Coordinating Committee on Science, Engineering, and Technology. The Panel's findings and recommendations must be viewed within this broad context of HPC. This section provides a description of the way in which the panel has conducted its work and a brief overview of the preceding accomplishments which were used as the starting point for the Panel's deliberations. The Origin of the Present Panel and Charter Following the renewal of four of the five NSF Supercomputer Centers in 1990, the National Science Board (NSB) maintained an interest in the Centers' operations and activities. Given the national scope of the Centers, and the possible implications for them contained in the HPCC Act of 1991, the NSB commissioned the formation of a blue ribbon panel to investigate the future changes in the overall scientific environment due to the rapid advances occurring in the field of computers and scientific computing. The panel was instructed to investigate the way science will be practiced in the next decade, and recommend an appropriate role for NSF to enable research in the overall computing environment of the future. The panel consists of representatives from a wide spectrum of the computer and computational science communities in industry and academia. The role expected of the Panel is reflected by its Charter: A. Assess the contributions of high performance computing to scientific and engineering research and education, including ancillary benefits, such as the stimulus to the pace of innovation in U.S. industries and the public sector. B. Project what hardware, software and communication resources may be available in the next five to ten years to further these advances and identify elements that may be particularly important to the development of HPC. C. Assess the variety of institutional forms through which access to high performance computing may be gained including funding of equipment acquisition, shared access through local centers, and shared access through broad band telecommunications. D. Project sources, other than NSF, for support of such capabilities, and potential cooperative relationships with states, the private sector, other federal agencies, and international programs. E. Identify barriers to the development of more efficient, usable, and powerful means for applying high performance computing, and means for overcoming them.
F. Provide recommendations to help guide the development of NSF's participation in supercomputing and its relation to the federal interagency High Performance Computing and Communications Program. G. Recommend policies and managerial structures needed to achieve NSF program goals, including clarification of the peer review procedures and suggestions for appropriate processes and mechanisms to assess program effectiveness, necessary for ensuring the highest quality science and engineering research. At its first meeting in January 1993, the panel approved its Charter, and established a scope of work which would allow a final report to be presented to the NSB in Summer 1993. A large number of questions were raised amplifying the Charter's directions. Prior to its second meeting in March 1993, the Panel solicited input from the national research community; a response to the following four questions was requested. * How would you project the emerging high performance computing environment and market forces over the next five years and the implications for change in the way scientists and engineers will conduct R&D, design and production modeling? * What do you see as the largest barriers to the effective use of these emergent technologies by scientists and engineers and what efforts will be needed to remove these barriers? What is the proper role of government, and, in particular, the NSF to foster progress? * To what extent do you believe there is a future role for government-supported supercomputer centers? What role should NSF play in this spectrum of capabilities? * To what extent should NSF use its resources to encourage use of high performance computing in commercial industrial applications through collaboration between high performance computing centers, academic users and industrial groups? Over fifty responses were received and were considered and discussed by the Panel at its March meeting. The Panel also received presentations, based on these questions, from vendors of high performance computing equipment and representatives from non-NSF supercomputer centers. NSF's Early Participation in High Performance Computing Although the National Science Foundation is now a major partner in the nation's high performance computing effort, this was not always the case. In the early 1970s the NSF ceased its support of campus computing centers, and by the mid-1970s there were no "supercomputers" on any campus available to the academic community. Certainly computers of this capability were available through the laboratories of other government agencies (DoE and NASA), but NSF did not play a role, and hence many of its academic researchers did not have the ability to perform computational research on anything other than a departmental minicomputer, thereby limiting the scope of their research. This lack of NSF participation in the high performance computing environment began to be noted in the early 1980s with the publication of a growing number of reports on the subject. A report to the NSF Division of Physics Advisory Committee in March 1981 entitled "Prospectus for Computational Physics", edited by W. Press, identified a "crisis" in computational physics, and recommended support for facilities. Subsequent to this report a joint agency study, "Large Scale Computing in Science and Engineering", edited by P. Lax, appeared in December 1982 and acted as the catalyst for NSF's reemergence in the support of high performance computing.
The Lax Report presented four recommendations for a government-wide program: * Increased access to regularly upgraded supercomputing facilities via high bandwidth networks * Increased research in computational mathematics, software, and algorithms * Training of personnel in scientific computing * R&D of new supercomputer systems The key suggestions contained in the Lax Report were studied by an internal NSF working group, and the findings were issued in July 1983 as "A National Computing Environment for Academic Research", a report edited by M. Bardon and K. Curtis. The report studied NSF supported scientists' needs for academic computing, and validated the conclusions of the Lax Report for the NSF supported research community. The findings of Bardon/Curtis reformulated the four recommendations of the Lax Report into a six point implementation plan for the NSF. Part of this action plan was a recommendation to establish ten academic supercomputer centers. The immediate NSF response was to set up a means for academic researchers to have access, at existing sites, to the most powerful computers of the day. This was an interim step prior to a solicitation for the formation of academic supercomputer centers directly supported by the NSF. By 1987, five NSF Supercomputer Centers had been established, and all had completed at least one year of operation. During this phase the Centers were essentially isolated "islands of supercomputing" whose role was to provide supercomputer access to the academic community. This aspect of the Centers' activities has changed considerably. The NSF concept of the Centers' activities was mandated to be much broader, as indicated by the Center's original objectives: * Access to state of the art supercomputers * Training of computational scientists and engineers * Stimulate the U.S. supercomputer industry * Nurture computational science and engineering * Encourage collaboration among researchers in academia, industry and government In 1988-1989 NSF conducted a review to determine whether support was justified beyond 1990. In developing proposals, the Centers were advised to increase their scope of responsibilities. Quoting from the solicitation: "To insure the long term health and value of a supercomputer center, an intellectual environment, as well as first class service, is necessary. Centers should identify an intellectual component and research agenda". In 1989 NSF approved continuation through 1995 of the Cornell Theory Center, the National Center for Supercomputing Applications, the Pittsburgh Supercomputing Center, and the San Diego Supercomputer Center. Support for the John von Neumann Center was not continued. The Federal High Performance Computing and Communications Initiative At the same time the NSF Supercomputer Centers were beginning the early phases of their operations the Federal Coordinating Committee for Science, Engineering, and Technology began a study in 1987 on the status and direction of high performance computing, and its relationship to federal research and development. The results were "A Research and Development Strategy for High Performance Computing" issued by the Office of Science and Technology Policy (OSTP) in November 1987, followed in September 1989 by another OSTP document "The Federal High Performance Computing Program". These two reports set the framework for the inter-governmental agency cooperation on high performance computing which led to the High Performance Computing and Communications (HPCC) Act of 1991. 
HPCC focuses on four integrated components+ of computer research and applications which very closely echo the Lax Report conclusions: ---------- +At the time of writing this Report, a fifth component, entitled, Information Infrastructure, Technology, and Applications is being defined for inclusion in the HPCC Program. * High Performance Computing Systems - technology development for scalable parallel systems to achieve teraflop speed * Advanced Software Technology and Algorithms - generic software and algorithm development to support Grand Challenge projects, including early access to production scalable systems * National Research and Education Network - to further develop the national network and networking tools, and to support the research and development of gigabit networks * Basic Research and Human Resources - to support individual investigator research on fundamental and novel science and to initiate activities to significantly increase the pool of trained personnel With this common structure across all the participating agencies, the Program outlines each agency's roles and responsibilities. NSF is the lead agency in the National Research and Education Network, and has major roles in Advanced Software Technology and Algorithms, and in Basic Research and Human Resources. The Sugar Report After the renewal of the four NSF Supercomputer Centers the NSF Division of Advanced Scientific Computing recognized that the computing environment within the nation had changed considerably from that which existed at the inception of the Centers Program. The Division's Advisory Committee was asked to survey the future possibilities for high performance computing, and report back to the Division. Two workshops were held in the Fall of 1991 and Spring of 1992. Thirty one participants with expertise in computational science, computer science and the operation of major supercomputer centers were involved. The final report, edited by R. Sugar of the U. of California at Santa Barbara, recommended future directions for the Supercomputer Centers Program which would "enable it to take advantage of these (HPCC) opportunities and to meet its responsibilities to the national research community". The committee's recommendations can be summarized as: * Decisions and planning by the Division need to be made in a programmatic way, rather than on an individual Center by Center basis - the meta-center concept provides a vehicle for this management capability which goes beyond the existing Centers. * Access to stable computing platforms (currently vector supercomputers) needs to be augmented by access to state of the art technology (currently massively parallel computers) - but, the former cannot be sacrificed to provide the latter * The Supercomputer Centers can be focal points for enabling collaborative efforts across many communities - computational and computer science, private sector and academia, vendors and academia. Appendix C TECHNOLOGY TRENDS and BARRIERS to FURTHER PROGRESS BACKGROUND: What is the state of the HPC industry here and abroad? What is its prognosis? The high performance computer industry is in a state of turmoil, excitement and opportunity. On the one hand, the vector multiprocessors manufactured by many firms, large and small, have continued to improve in capability over the years. 
These systems are now quite mature, as measured by the fact that delivered performance is a significant fraction of the theoretical peak performance of the hardware, and are still the preferred platform for many computational scientists and engineers. They are the workhorses of high performance computing today and will continue in that role even as alternatives mature. On the other hand, dramatic improvements in microprocessor performance and advances in microprocessor-based parallel architectures have resulted in "massively parallel" systems that offer the potential for high performance at lower cost./20 For example, $10 million in 1993 buys over 40 gigaflops peak processing power in a multicomputer but only 5 gigaflops in a vector multiprocessor. As a result, increasing numbers of computational scientists and engineers are turning to the highly parallel systems manufactured by companies such as Cray Research Inc., IBM, Intel, Kendall Square, Thinking Machines Inc., MasPar, and nCUBE. ---------- 20/Note that the higher cost of vector machines is partly caused by their extensive use of static memory chips for main memory and the interconnection networks they use for high shared-memory bandwidth. These attributes contribute to increased programmability and the realization of a high fraction of peak performance on user applications. Realized performance on MPP machines is still uncertain. A comparison of today's vector machines versus MPP systems based on realized performance per dollar reveals much less difference in cost-performance than comparisons based on peak performance. Microprocessor performance has increased by 4X every three years, matching the rate of integrated circuit logic density improvement as predicted by Moore's law. For example, the microprocessors of 1993 are around 200 times faster than those of 1981. By contrast, the clock rates of vector processors have improved much more slowly; today's fastest vector processors are only five or six times faster than 1976's Cray 1. Thus, the performance gap between these two technologies is quickly disappearing in spite of other performance improvements in vector processor architecture. Although microprocessor-based massively parallel systems hold considerable promise for the future, they have not yet reached maturity in terms of ease of programming and ability to deliver high performance routinely to large classes of applications. Unfortunately, the programming technology that has evolved for the vector multiprocessors does not directly transfer to highly parallel systems. New mechanisms must be devised for high performance communication and coordination among the processors. These mechanisms must be efficiently supported in the hardware and effectively embodied in programming models. Currently, vendors are providing a variety of systems based on different approaches, each of which has the potential to evolve into the method of choice. Vector multiprocessors support a simple shared memory model which demands no particular attention to data arrangement in memory. Many of the currently available highly parallel architectures are based on the "multicomputer" architecture which provides only a message-passing interface for inter-processor communication. Emerging architectures, including the Kendall Square KSR-1 and systems being developed by Convex, Cray Research, and Silicon Graphics, have shared address spaces with varying degrees of hardware support and different refinements of the shared memory programming model. 
These computers represent a compromise in that they offer much of the programming simplicity of shared memory yet still (at least so far) require careful data arrangement to achieve good performance. (The data parallel language on the CM-5 has similar properties.) A true shared memory parallel architecture, based on mechanisms that hide memory access latency, is under development at Tera Computer. The size of the high performance computer market worldwide is about $2 billion (excluding sales of the IBM add-on vector hardware), with Cray Research accounting for roughly $800 million of it. IBM and Fujitsu are also significant contributors to this total, but most companies engaged in this business have sales of $100 million or less. Some companies engaged in high speed computing have other, larger sources of revenue (IBM, Fujitsu, Intel, NEC, Hitachi); other companies both large (Cray Research) and small (Thinking Machines, Kendall Square, Meiko, Tera Computer) are high performance computer manufacturers exclusively. There are certainly more companies in the business than can possibly be successful, and no doubt new competitors will appear. Helping to sustain this high level of competitive innovation should be an important objective for NSB policy in HPC. FINDINGS Where is the hardware going to be in 5 years? What will be the performance and cost of the most powerful machines, the best workstations, the mid-range computers? The next five years will continue to see improvements in hardware price/performance ratios. Since microprocessor speeds now closely approach those of vector processors, it is unclear whether microprocessor performance improvement can maintain its current pace. Still, as long as integrated circuits continue to quadruple in density but only double in cost every three years, we can probably expect a fourfold price/performance improvement in both processors and memory by 1998. Estimating in constant 1993 dollars, the most powerful machines ($50 million) will have peak performance of nearly a teraflop/21; mini-supercomputers ($1 million) will advertise 20 gigaflops peak performance; workstations ($50,000) will approach 1 gigaflop, and personal computers ($10,000) will approach 200 megaflops./22 ---------- 21/One teraflop is 1000 gigaflops, or 10^12 floating point operations per second. 22/Spokesmen from Intel, Convex, and Silicon Graphics, in addressing the panel, all made even higher estimates than this. During this period, parallel architectures will continue to emerge and evolve. Just as the CM-5 represented a convergence between SIMD and MIMD parallel architectures and brought about a generalization of the data-parallel programming model, it is likely that the architectures will continue to converge and better user-level programming models will continue to emerge. These developments will improve software portability and reduce the variety of architectures that are required for computational science and engineering research, although there will likely still be some diversity of approaches at the end of this 5-year horizon. Questions that may be resolved by 1998 include: * Which varieties of shared memory architecture provide the most effective tradeoff between hardware simplicity, system performance, and programming convenience? * What special synchronization mechanisms for processor coordination should be supported in the hardware?
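To make concrete why the programming model matters so much in this discussion, the two fragments below compute the same global sum of locally held values in the two dominant styles. They are schematic sketches only, not any vendor's interface: acquire_lock, release_lock, msg_send, and msg_recv are hypothetical primitives standing in for whatever lock and message library a particular machine provides.

    /* Schematic contrast of programming models; the primitives below are
     * hypothetical stand-ins for a machine's actual lock and message library. */
    extern void acquire_lock(void *lock);
    extern void release_lock(void *lock);
    extern void msg_send(int dest, void *buf, unsigned long len);
    extern void msg_recv(int src, void *buf, unsigned long len);

    double global_sum;                        /* visible to all processors */

    /* (1) Shared memory style: each processor adds its local result into a
     *     shared accumulator.  Data placement is implicit, but the update
     *     must be protected against simultaneous access. */
    void shared_memory_sum(double local, void *lock)
    {
        acquire_lock(lock);
        global_sum += local;
        release_lock(lock);
    }

    /* (2) Message-passing (multicomputer) style: each node owns its own data
     *     and explicitly ships its partial result to node 0, which accumulates. */
    void message_passing_sum(double local, int my_node, int num_nodes)
    {
        int node;
        double partial, total = local;

        if (my_node != 0) {
            msg_send(0, &local, sizeof(local));
        } else {
            for (node = 1; node < num_nodes; node++) {
                msg_recv(node, &partial, sizeof(partial));
                total += partial;
            }
            global_sum = total;
        }
    }

The first version reads almost like serial code, which is the appeal of shared memory; the second makes every transfer explicit, which is why data arrangement and communication cost dominate programming on the multicomputers described above.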
Most current systems are evolving in these directions, and answers to the issues will provide a more stable base for software efforts. Furthermore, much of the current computer science research in shared memory architectures is looking for cost-effective hardware support that can be implemented in multiprocessor workstations that are interconnected by general- purpose local area networks. Thus, technology from high performance parallel systems may be expected to migrate to workstation networks, further improving the capabilities of these systems to deliver high-performance computing to particular applications. It is possible that in the end the only substantial difference between the supercomputers of tomorrow and the workstation networks of tomorrow will be the installed network bandwidth./23 ---------- 23/While parallel architectures mature, vector multiprocessors will continue to evolve. Scaling to larger numbers of processors ultimately involves solving the same issues as for the microprocessor-based systems. Where is the software/programmability going to be in 5 years? What new programming models will emerge for the new technology? How transparent will parallel computers be to users? While the architectural issues are being resolved, parallel languages and their compilers will need to continue to improve the programmability of new high performance computer systems. Implementations of "data parallel" language dialects like High Performance Fortran, Fortran D, and High Performance C will steadily improve in quality over the next five years and will simplify programming of both multi-computers and shared address systems for many applications. For the applications that are not helped by these languages, new languages and programming models will emerge, although at a slower pace. Despite strong efforts addressing the problem from the language research community, the general purpose parallel programming language is an elusive and difficult quarry, especially if the existing Fortran software base must be accommodated, because of difficulties with the correct and efficient use of shared variables. Support tools for software development have also been making progress, with emphasis on visualization of a program's communication and synchronization behavior. Vendors are increasingly recognizing the need for sophisticated performance tuning tools, with most now developing or beginning to develop such tools for their machines. The increasing number of computer scientists who are also using these tools could lead to even more rapid improvement in the quality and usability of these support tools. Operating systems for high performance computers are increasingly ill suited to the demands placed on them. Virtualization of processors and memory often leads to poor performance, whereas relatively fixed resource partitioning produces inefficiency, especially when parallelism within the application varies. High performance I/O is another area of shortfall in many systems, especially the multi-computers. Research is needed in nearly every aspect of operating systems for highly parallel computers. What market forces or technology investments drive HPC technologies and products? Future high performance systems will continue to be built using technologies and components built for the rest of the computer industry. 
Since integrated circuit fabrication facilities now represent billion dollar capital investments, integrated circuits benefit from very large scale economies; accordingly it has been predicted that only mass-market microprocessors will prove to have acceptable costs in future high performance systems. Certainly the current use of workstation microprocessors such as the Sparc, Alpha, and RS-6000 chips suggests this trend. Even so, the cost of memory chips is likely to be a major factor in the costs of massively parallel systems, which require massive amounts of fast memory. Thus the integrated circuit technology available for both custom designs and industry standard processors will increasingly be driven by the requirements of much larger markets, including consumer electronics. The health of the HPC vendors and the structure of their products will be heavily influenced by demand from industrial customers. Business applications represent the most rapidly growing market for HPC products; they have much higher potential for growth than government or academic uses. Quite apart from NSF's obligation to contribute to the nation's economic health through its research activities, this fact motivates the importance of cooperation with industry users in expanding HPC usage. This reality means that NSF should be attentive to the value of throughput as a figure of merit in HPC systems (in contrast with turnaround time, which academic researchers usually favor), as well as the speed with which large volumes of data can be accessed. Industry won't put up with a stand-alone, idiosyncratic environment. How practical will be the loose coupling of desk-top workstations to aggregate their unused compute power? Networks of workstations will become an important resource for the many computations that perform well on them. The probable success of these loosely coupled systems will inevitably raise the standard for communication capabilities in the multicomputer arena. Many observers believe that competition from workstation networks on one side and shared address space systems on the other will drive multi-computers from the scene entirely; in any event, the network bandwidth and latency of multi-computers must improve to differentiate them from workstation networks. Many large institutions have 1000 or more workstations already installed; the utilization rate of their processors on a 24 hour basis is probably only a few percent. An efficient way to use the power of such heterogeneous networks would be financially attractive. It will, however, raise serious questions about security, control, virus-prevention, and accounting programs. Are there some emerging HPC technologies of interest other than parallel processing? What is their significance? Neural networks have recently become popular and have been successfully applied to many pattern recognition and classification problems. Fuzzy logic has enjoyed an analogous renaissance. Technologies of this sort are both interesting and important in a broad engineering context and also are having impact on computational science and engineering. Machine learning approaches, such as neural networks, are most appropriate in scientific disciplines where there is insufficient theory to support accurate computer modeling and simulation. How important are simulation and visualization capabilities? Simulation will play an ever increasing role in science and engineering.
Much of this work can be carried out on workstations or intermediate-scale systems, but it will continue to be appropriate to share the highest performance systems (and the expertise in using them) on a national scale, to accomplish large simulations within human time scales. Smaller configurations of these machines should be provided to individual research universities for application software development and research that involves modifying the operating system and/or hardware. Personal computer capabilities will improve, and visualization on the desktop will become more routine. Scientists and engineers in increasing numbers will need to be equipped with visualization capabilities. The usefulness of high performance computing depends on such visualization systems because printed lists of numbers (or printed sheaves of pictures, for that matter) are increasingly unsatisfactory as an output medium, even for moderately sized simulations.

BARRIERS TO CONTINUED RAPID PROGRESS

What software and/or hardware "inventions" are needed? Who will address these needs?

The most important impediment to the use of new highly parallel systems has been the difficulty of programming these machines and the wide variation that exists in communication capabilities across generations of machines as well as among the machines in a given generation. Application software developers are understandably reluctant to re-implement their large scale production codes on multi-computers, when significant effort is required to port the codes across parallel systems as they evolve. In theory, any programming model can be implemented (by appropriate compilers) on any machine. However, the inefficiency of certain models on certain architectures is so great as to render them impractical./24 What is needed in high performance computing is an architectural consensus and a simple model to summarize and abstract the machine interface to allow compilers to be ported more easily across systems, facilitating the portability of application programs. Ideally, the consensus interface should efficiently support existing programming models (even the multi-computers have created their own dusty decks), as well as more powerful models. Considerable research in the computer science community is currently devoted to these issues. It is unlikely that the diversity of programming models will decrease within the next five years, but it is likely that models will become more portable.

----------
24/For example, it is not practical to implement data-parallel compilers on the Intel iPSC/860.

How important will be access to data, data management?

Besides needing high performance I/O, some fields of computational science need widely distributed access to data bases that are extremely large and constantly growing. The need is particularly felt in the earth and planetary sciences, although the requirements are also great in cellular biology, high energy physics, and other disciplines. Large scale storage hierarchies and the software to manage them must be developed, and means to distribute the data nationally and internationally are also required. Although this area of high performance computing has been relatively neglected in the past, these problems are now receiving significantly more attention.

ROLES FOR GOVERNMENT AGENCIES

What should government agencies (NSF, DoD, DoE) do to advance HPC beyond today's state of the art? What more might they be doing?
The National Science Foundation plays several critical roles in advancing high performance computing. First, NSF's support of basic research and human resources in all areas of science and engineering (and particularly in mathematics, computer science and engineering) has been responsible for many of the advances in our ability to successfully tackle grand challenge problems. The Supercomputer Centers and the NSFnet have been essential to the growth of high performance computing as a basic paradigm in science and engineering. These efforts have been successful and should be continued. However, NSF has done too little in supporting computational engineering in the computer science community. For example, the NSF Supercomputer Centers were slow in providing experimental parallel computing facilities and are currently not responding adequately to integrating emerging technologies from the computer science community. Although this situation is gradually changing, the pace of the change should be accelerated.

Many advances in high performance computer systems have been funded and encouraged by the Advanced Research Projects Agency (ARPA), the major supporter of large scale projects in computer science and engineering research and development in the US. ARPA has been charged by Congress to champion "dual use" technology; in so doing it is addressing many of the needs of computational science and engineering, even in the mathematical software arena, that are common to defense and commercial applications, and the science that underlies both.

The Department of Energy has traditionally provided substantial support to computer science and engineering research within its national laboratories and at universities, with strong impetus being provided by national defense requirements and resources. More recently, the focus has shifted to the high performance computing and communications needs of the unclassified Energy Research programs within DoE. The National Energy Research Supercomputer Center (NERSC) and the Energy Sciences Network (ESnet) provide production services similar to the NSF supercomputer centers and the NSFnet. Under the DoE HPCC component, "grand challenge" applications are supported at NERSC and also at two High Performance Computing Research Centers (HPCRCs) which offer selected access for grand challenge applications to leading edge parallel computing machines. DoE also sponsors a variety of graduate fellowships in the computational sciences. The computational science infrastructure and traditions of DoE remain sound; however, the ability of the Department to advance the state-of-the-art in high performance computing systems will be paced by its share of the funding available through the Federal High Performance Computing Initiative or through Defense conversion funds.

The Department of Commerce has not been a significant source of funds for computer system research and development since the very early days of the computer industry, when the National Bureau of Standards built one of the first digital computers. NBS has been an important factor in supporting standards development, particularly for the Federal Information Processing Standards issued by GSA. The expanded role of the National Institute of Standards and Technology (as NBS is now called) under the Clinton administration may include this kind of activity, especially when industrial participation is a desired component.
NASA has embarked on a number of projects of potential importance, especially in the development of a shared data system for the global climate change program, which will generate massive amounts of data from the Earth Observing Satellite Program.

What is the role of NSF's computer science and applied mathematics research program? Is it relevant to the availability of HPC resources in a five year time span?

Investments in mathematics and computer science research provide the foundation for attacking today's problems in high performance computing and must continue. NSF continues to be the primary U.S. source of funds for mathematics and computer science research within the scope of what one or two investigators and several graduate assistants can do. Many fundamental advances in algorithms, programming languages, operating systems, and computer architecture have been NSF funded. This mission has been just as vital as ARPA's and is complementary to it.

Among the largest barriers to the effective use of emergent computing technologies is the lack of parallel architectures from which it is relatively easy to extract peak performance, of system software (operating systems, databases, compilers, programming models) that takes advantage of these architectures, and of parallel algorithms, mathematical modeling, and efficient, high order numerical techniques. These are core computational mathematics and computer science/engineering research issues, many of which are best tackled through NSF's traditional peer-reviewed model. NSF should increase its support of this work. Increased investment in basic research and human resources in mathematics and computer science/engineering could significantly accelerate the pace of HPC technology development.

In addition, technology transfer can be increased by supporting a new scale of research not currently being funded by any agency: small teams with annual budgets in the $250K - $1M range. These projects were once supported by DoE, and also indirectly by ARPA at the former "block grant" universities./25 These enterprises are now generally too small for ARPA and seem to be too big for current NSF budget levels in computer science. A project of this scale could develop and release an innovative piece of software to the high performance computing community at large, or build a modest hardware prototype as a stepping stone to more significant funding. A project of this scale would also allow multi-disciplinary collaboration, either within mathematics and computer science/engineering (architecture, operating systems, compilers, algorithms) or with related disciplines (astronomers or chemists working with computer scientists and mathematicians interested in innovative programming or architectural support for that problem domain).

----------
25/ARPA made an enormous contribution to the maturing of computer science as a discipline in U.S. universities by consistently funding research at about $1 million per year at MIT, Carnegie Mellon, Stanford University and Berkeley. At this level of consistent support these universities could build up a critical mass of faculty and train the next generation of faculty leadership for departments being set up at every substantial research university. This targeted investment played a role in computer science not unlike what NSF has done in computational science at the four Centers.
[NOTE: APPENDIX D WITH FIGURES 1, 2, 3 & 4 IS NOT INCLUDED IN THIS ELECTRONIC VERSION]

Appendix E

REVIEW AND PROSPECTUS OF COMPUTATIONAL and COMPUTER SCIENCE AND ENGINEERING

Personal Statements by Panel Members

Computational Mechanics and Structural Analysis by Theodore Belytschko

High performance computing has had a dramatic impact on structural analysis and computational mechanics, with significant benefits for various industries. The finite element method, which was developed at aerospace companies such as Boeing in the late 1950's and subsequently at the universities, has become one of the key tools for the mechanical design of almost all industrial products, including aircraft, automobiles, power plants, packaging, etc. The original applications of finite element methods were primarily in linear analysis, which is useful for determining the behavior of engineering products in normal operating conditions. Most linear finite element analyses are today performed on workstations, except for problems on the order of 1 million unknowns. Supercomputers are used primarily for nonlinear analysis, where they replace prototype testing.

One rapidly developing area has been automobile crashworthiness analysis, where models of automobiles are used to design for occupant safety and for features such as accelerometer placement for air bag deployment. The models which are currently used are generally on the order of 100,000 to 250,000 unknowns, and even on the latest supercomputers such as the CRAY C90 require on the order of 10 to 20 hours of computer time. Nevertheless these models are still often too coarse to permit true prediction and hence they must be tuned by tests. Such models have had a tremendous impact on reducing the design time for automobiles, since they eliminate the need for building numerous prototypes. Almost all major automobile manufacturers have undertaken extensive programs in crashworthiness simulations by computer on high performance machines, and many manufacturers have bought supercomputers almost expressly for crashworthiness simulation.

Because of the increasing concern with safety among manufacturers of many other products, nonlinear analyses are also emerging in many other industries: the manufacture of trucks and construction equipment, where the product must be certified for safety in various accidents such as overturning or impact from falling construction equipment; railroad car safety; and the safety of aircraft, where the FAA has recently undertaken programs to simulate the response of aircraft to small weapons so that damage from such explosives can be minimized. Techniques of this type are also being used to analyze the safety of jet engines under bird impact, the containment of fragments in case of jet engine failure, and bird impact on aircraft canopies. In several cases, NSF Supercomputer Centers have introduced industry to the potentials of this type of simulation. In all of these, highly nonlinear analyses which require on the order of 10x floating point operations per simulation must be made; such simulations, even on today's supercomputers, are still often so time consuming that decisions cannot be reached fast enough. Therefore an urgent need exists for increasing the speed with which such simulations can be made. Nonlinear finite element analysis is also becoming increasingly important in the simulation of manufacturing processes.
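To make concrete the kind of computation that underlies a finite element analysis, the following sketch (not taken from any code discussed in this report) assembles and solves a toy one-dimensional elastic bar model in Python; the element count, the material constant EA, and the load are arbitrary illustrative values.

    # Illustrative toy: a 1D elastic bar discretized into linear finite elements.
    # The number of unknowns equals the number of free nodes; production crash
    # models mentioned above involve 10^5 or more unknowns and nonlinear behavior.
    import numpy as np

    def solve_bar(n_elements=100, length=1.0, EA=1.0, tip_load=1.0):
        """Assemble and solve K u = f for an end-loaded bar fixed at x = 0."""
        n_nodes = n_elements + 1
        h = length / n_elements
        K = np.zeros((n_nodes, n_nodes))
        k_e = (EA / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # element stiffness
        for e in range(n_elements):                # assemble global stiffness
            K[e:e + 2, e:e + 2] += k_e
        f = np.zeros(n_nodes)
        f[-1] = tip_load                           # point load at the free end
        # Impose the fixed boundary condition at node 0 by eliminating it.
        u = np.zeros(n_nodes)
        u[1:] = np.linalg.solve(K[1:, 1:], f[1:])
        return u

    if __name__ == "__main__":
        u = solve_bar()
        print("tip displacement:", u[-1])          # analytic value: P*L/(EA) = 1.0

The crash models described above differ in being three dimensional, nonlinear, and transient, with roughly three orders of magnitude more unknowns, which is why they consume tens of hours on the largest machines.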
For example, tremendous improvements can be made in processes such as sheet metal forming, extrusion, and machining if these are carefully designed through nonlinear finite element simulation. These simulations offer large cost reductions and reduce design time. Also, the design of materials can be improved if computers are first used to examine how these materials fail and then to design the material so that failure is either less likely or less catastrophic. Such simulations require great resolution, and at the tips of cracks phenomena at the atomic scale must be considered. Most of the calculations mentioned above are not made with sufficient resolution because of limitations in computational power and speed. Also, important physical phenomena are omitted for reasons of expediency, and their computational modeling is not well understood. Therefore, the availability of more computational power will increase our understanding of modeling nonlinear structural response and provide industry with more effective tools for design.

Cellular and Systemic Biology by Teresa Chay

HPC has made a great impact on a variety of biological disciplines, such as physiology, the study of biological macromolecules, and genetics. I will discuss below three vital organs in our body whose understanding has greatly benefited from high-performance computing and will continue to do so in the future.

Computer Models For Vital Organs In Our Body

Although the heart, brain, and pancreas function differently in our body (i.e., the heart circulates the blood, the brain stores and transfers information, and the pancreas secretes vital hormones such as insulin), the mechanisms underlying their functioning are quite similar - "excitable" cells that are coupled electrically and chemically, forming a network. Ion channels in the cell membranes are involved in information transfer. The ion channels receive stimuli from neighboring cells and from cells in other organs. Upon receiving stimuli some ion channels open while others close. When these channels are open, they pass ions into or out of the cells, creating an electrical difference (membrane potential) between the outside and the inside of the cell. Some of these ion channels are sensitive to the voltage (i.e., membrane potential) and others are responsive to chemical substances (e.g., neurotransmitters/hormones). Opening of ion channels creates the "action potential" which spreads from cell to cell, either directly or via chemical mediators. Electrical transmission and chemical transmission are interdependent in that chemical substances can influence the ionic currents and vice versa. For example, the arrival of the action potential at a presynaptic terminal may cause a release of chemical substances; in turn these chemicals can open/close the ion channels in the postsynaptic cell.

Why is high-performance computing needed?

How the signals are passed from cell to cell is a nonlinear dynamical problem and can be treated mathematically by solving simultaneous differential equations. These equations involve voltage, conductance of the ionic current, and concentration of those chemical substances that influence conductances. Depending on the model, each network can be represented by a set of several million differential equations. The need for parallel processors is obvious - the organs process information in much the same way as the most powerful parallel supercomputers do.
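As a purely schematic illustration of what such a set of simultaneous differential equations can look like, the sketch below integrates a short chain of electrically coupled excitable cells using FitzHugh-Nagumo-type equations, a textbook two-variable caricature of membrane dynamics. It is not the model used by any group described here; the parameter values are standard textbook choices rather than physiological ones, and realistic organ models involve detailed ionic currents and millions of coupled units.

    # Schematic only: a chain of coupled excitable cells (FitzHugh-Nagumo type),
    # advanced with a simple forward Euler step.  Realistic cardiac, neural, or
    # islet models use detailed ionic currents and vastly more cells.
    import numpy as np

    def simulate_chain(n_cells=50, steps=20000, dt=0.01, coupling=0.5):
        v = np.full(n_cells, -1.2)     # membrane potential, started near rest
        w = np.full(n_cells, -0.6)     # slow recovery variable
        a, b, eps = 0.7, 0.8, 0.08     # standard FitzHugh-Nagumo parameters
        history = []
        for step in range(steps):
            stim = np.zeros(n_cells)
            if step * dt < 3.0:
                stim[0] = 1.0          # transient stimulus applied to the first cell
            # Nearest-neighbour electrical coupling (discrete diffusion of v).
            lap = np.roll(v, 1) + np.roll(v, -1) - 2 * v
            lap[0] = v[1] - v[0]       # no-flux boundary at both ends
            lap[-1] = v[-2] - v[-1]
            dv = v - v**3 / 3 - w + stim + coupling * lap
            dw = eps * (v + a - b * w)
            v += dt * dv
            w += dt * dw
            if step % 200 == 0:
                history.append(v.copy())
        return np.array(history)

    if __name__ == "__main__":
        trace = simulate_chain()
        print("maximum potential recorded in the last cell:", trace[:, -1].max())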
Since the mechanisms involved in these three organs are essentially the same, algorithms developed for one can be easily modified to solve for another. Three specific areas in which high-performance computing is central are cardiac research, neural networks, and insulin secretion. These are detailed below.

Cardiac Research

It would be a great benefit to cardiac research if a realistic computer model of the heart, its valves and the nearby major vessels were to become available. With such a model the general public would be able to see how the heart generates its rhythm, how this rhythm leads to contraction, and how the contraction leads to blood circulation. Scientists, on the other hand, could study normal and diseased heart function without the limitations of using human and animal subjects. With future HPC and parallel processing, it may be possible to build a model heart without consuming too many hours of computer time. As a step toward achieving this goal, scientists in computational cardiology have thus far accomplished the following three objectives.

1. A computer model of blood flow in the heart: Researchers have used supercomputers to design an artificial mitral valve (the valve that controls blood flow between the left atrium and left ventricle) which is less likely to induce clots. The computer-simulated mitral valve has been patented and licensed to a corporation developing it for clinical use. With parallel processors, this technique is now being extended to construct a realistic three-dimensional heart.

2. Constructing an accurate map of the electrical potential of the heart surface (epicardium): Arrhythmia in the heart is caused by a breakdown in the normal pattern of cardiac electrical activity. Many arrhythmias occur because of abnormal tissue inside the heart. Bioengineers have been developing a technique for obtaining the epicardial potential map from the coarse information that can be recorded on the surface of the body via electrocardiogram. With such a map, clinicians can accurately locate the problem tissue and remove it with a relatively simple surgical procedure instead of with drastic open-heart surgery.

3. Controlling sudden cardiac death: Sudden cardiac death is triggered by an extra heart beat. Such a beat is believed to initiate spiral waves (i.e., reentrant arrhythmias) in the main pumping muscle of the heart known as the ventricular myocardium. With HPC it is possible to simulate how this part of the heart can generate reentrant arrhythmia upon receiving a premature pulse. Computer modeling of reentrant arrhythmia is very important clinically, since it can be used as a tool to predict the onset of this type of deadly arrhythmia and to find a means to cure it by properly administering antiarrhythmic drugs (instead of actually carrying out experiments on animals). Parallel computing and the development of better software will soon enable researchers to extend their simulations to a more realistic three-dimensional system which includes the detailed geometry of ventricular muscle.

Neural Networks

Learning how the brain works is a grand challenge in science and engineering. Artificial neural nets are based largely on their connection patterns, but they have very simple processing elements or nodes (either ON or OFF).
That is, a simple network consists of layers: an input layer (sensory receptors), one or more "hidden" layers (representing the interneurons which allow animals to have complex behaviors), and an output layer (motor neurons). Each unit in a neural net receives inputs, both excitatory and inhibitory, from a number of other units and, if the strength of the signal exceeds a given threshold, the "on-unit" sends signals to other units. The real nervous system, however, is a complex organ that cannot be viewed simply as an artificial neural net. Real neural nets are not hard-wired but are made of neurons which are connected by synapses. There are at least 10 billion active neurons in the brain. There are thousands of synapses per neuron, and hundreds of active chemicals which can modify the properties of ion channels in the membrane. With HPC and massively parallel computing, neuroscientists are moving into a new phase of investigation which focuses on biological neural nets, incorporating features of real neurons and the connectivity of real neural nets. Some of these models are capable of simulating patterns of electrical activity, which can be compared to actual neuronal activity observed in experiments. With biological neural nets, we begin to understand the operation of the nervous system in terms of the structure, function and synaptic connectivity of the individual neurons.

Insulin Secretion

Insulin is secreted from the beta cells in the pancreas. To cure diabetes it is essential to understand how beta cells release insulin. The beta cells are located in a part of the pancreas known as the islet of Langerhans. In islets, beta cells are coupled by a special channel (the gap junctional channel) which connects one cell to the next. Gap junctional channels allow small ions such as calcium ions to pass from cell to cell. In the plasma membrane of beta cells, there are ion channels whose properties change when the content of calcium ions changes. There are other types of cells in the islet which secrete hormones. These hormones in turn influence insulin secretion by altering the properties of the receptors bound in the membrane of a beta cell. Thus the study of how beta cells release insulin involves very complex nonlinear dynamics. With a supercomputer it is possible to construct a model of the islet of Langerhans. With this model, researchers would learn how beta cells release insulin in response to external signals such as glucose, neurotransmitters and hormones. They would also learn the roles of other cell types in the islet of Langerhans and how they influence the functional properties of beta cells. A model in which beta cells function as a cluster has already been constructed.

Material Science and Condensed Matter Physics by Gary S. Grest

The impact of high performance computing on material science and condensed matter physics has been enormous. Major developments in the sixties and seventies set the stage for the establishment of computational material science as a third discipline, equal to, yet distinct from, analytic theory and experiment. These developments include the introduction of molecular dynamics and Monte Carlo methods to simulate the properties of liquids and solids under a variety of conditions. Density functional theory was developed to model the electron-electron interactions and pseudopotential methods to model the electron-ion interactions. These methods were crucial in computing the electronic structure for a wide variety of solids.
Later, the development of path integral and Green's function Monte Carlo methods allowed one to begin to simulate quantum many-body problems. Quantum molecular dynamics methods, which combine well-established electronic structure methods based on local density theory with molecular dynamics for the atoms, have recently been introduced. On a more macroscopic scale, computational mechanics, which was discussed above by T. Belytschko, was developed to study structure-property relations. Current usage of high performance computing in material science and condensed matter physics can be broadly classified as Classical Many-Body and Statistical Mechanics, Electronic Structure and Quantum Molecular Dynamics, and Quantum Many-Body, which are discussed below.

Classical Many-Body and Statistical Mechanics

Classical statistical mechanics, where one treats a huge number of atoms collectively, dates back to Boltzmann and Gibbs. In these systems, quantum mechanics plays only a subsidiary role. While it is needed to determine the interaction between atoms, in practice these interactions are often replaced by phenomenologically determined pairwise forces between the atoms. This allows one to treat large ensembles of atoms by molecular dynamics and Monte Carlo methods. Successes of this approach include insight into the properties of liquids, phase transitions and critical phenomena, crystallization of solids and compositional ordering in alloys. For systems where one needs a quantitative comparison to experiment, embedded atom methods have been developed in which empirically determined functions are employed to evaluate the energy and forces. Although the details of the electronic structure are lost, these empirical methods have been successful in giving reasonable descriptions of the physical processes in many systems in which directional bonding is not important.

Theoretical work in the mid-70's on renormalization group methods showed that a wide variety of different kinds of phase transitions could be classified according to the symmetry of the order parameter and the range of the interaction and did not depend on the details of the interaction potential. This allowed one to use relatively simple models, usually on a lattice, to study critical phenomena and phase behavior.

While the basic computational techniques used in classical many-body theory are now well established, there remain a large number of important problems in material science which can only be addressed with these techniques. At present, with Cray YMP class computers, one can typically handle thousands of atoms for hundreds of picoseconds. With the next generation of massively parallel machines, this can be extended to millions of particles for microseconds. While not all problems require this large number of particles or long times, many do. Problems which will benefit from the faster computational speed typically involve large length scales and/or long time scales. Examples include polymers and macromolecular liquids, where typical length scales of each molecule can be hundreds of angstroms and relaxation time scales extend to microseconds and longer; liquids near their glass transition, where relaxation times diverge exponentially; nucleation and phase separation, which require both large systems and long times; and the effects of shear. Macromolecular liquids typically contain objects of very different sizes. For example, in most colloidal suspensions, the colloid particles are hundreds of angstroms in size while the solvent molecules are only a few angstroms.
At present, the solvent molecules must be treated as a continuum background. While this allows one to study the static properties of the system, the dynamics are incorrect. Faster computers will allow us to study flocculation, sedimentation and the effects of shear on order. Non-equilibrium molecular dynamics methods have been developed to simulate particles under shear. However, due to the lack of adequate computational power, simulations at present can only be carried out at unphysically high shear rates. Access to HPC will enable one to understand the origins of shear thinning and thickening in a variety of technologically important systems. While molecular dynamics simulations are inherently difficult to vectorize, recent efforts to run them on parallel computers have been very encouraging, with increases in speed of nearly a factor of 30 in comparison to the Cray YMP.

Monte Carlo simulations on a lattice remain a very powerful computational technique. Simulations of this type have been very successful in understanding critical phenomena, phase separation, growth kinetics and disordered magnetic systems. Successes include accurate determination of universal critical exponents, both static and dynamic, and evidence for the existence of a phase transition in spin glasses. Future work using massively parallel computers will be essential to understand wetting and surface critical exponents as well as systems with complex order parameters. A set of Langevin equations describing nonlinear fluctuating hydrodynamics can be integrated numerically in two dimensions on a Cray YMP class supercomputer, but the extension to three dimensions requires HPC. Finally, cellular automata solutions of the Navier-Stokes and Boltzmann equations are a powerful method for studying hydrodynamics. All of these methods, because of their inherent locality, run very efficiently on parallel computers.

Electronic Structure and Quantum Molecular Dynamics

The ability of quantum mechanics to predict the total energy of a system of electrons and nuclei enables one to reap tremendous benefits from quantum-mechanical calculations. Since many physical properties can be related to the total energy of a system or to differences in total energy, tremendous theoretical effort has gone into developing accurate local density functional total energy techniques. These methods have been very successful in predicting with accuracy equilibrium constants, bulk moduli, phonons, piezoelectric constants and phase-transition pressures and temperatures for a variety of materials. These methods have recently been applied to study the structural, vibrational, mechanical and other ground state properties of systems containing up to several hundred atoms. Some recent successes include the unraveling of the normal state properties of high T_c superconducting oxides, predictions of new phases of materials under high pressure, predictions of superhard materials, determination of the structure and properties of surfaces, interfaces and clusters, and calculations of properties of fullerenes and fullerites. Particularly important are the developments of the past few years which make it possible to carry out "first principles" computations of complex atomic arrangements in materials starting from nothing more than the identities of the atoms and the rules of quantum mechanics.
Recent developments in new iterative diagonalization algorithms, coupled with increases in the computational efficiency of modern high performance computers, have made it possible to carry out quantum mechanical calculations of the dynamics of systems in the solid, liquid and gaseous states. The basic idea of these methods, which are known as ab initio methods, is to minimize the total energy of the system by allowing both the electronic and the ionic degrees of freedom to relax towards equilibrium simultaneously. While ab initio methods have been around for more than a decade, only recently have they been applied to systems of more than a few atoms. Now, however, this method can be used to model systems of a few hundred atoms, and this number will increase by at least a factor of 10 within the next five years. The method has already led to new insights into the structure of amorphous materials, finite temperature simulations of the new C60 solid, computation of the atomic and electronic structures of the 7x7 reconstruction of the Si(111) surface, melting of carbon, and studies of step geometries on semiconductor surfaces. In the future, it will be possible to address many important materials phenomena including phase transformations, grain boundaries, dislocations, disorder and melting. The problem of understanding and improving the methods of growth of complicated materials, such as multi-component heterostructures which are produced by epitaxial growth using molecular beam or chemical vapor deposition techniques, stands out as one very important technological application of this method. Although a "brute-force" simulation of atomic deposition on experimental time scales will not be possible for some time, one can learn a great deal from studying the mechanisms of reactive film growth. Combining atomic calculations for the structure of an interface with continuum theories of elasticity and plastic deformation is also an important area for the future.

One of the most obvious areas for future application is biological systems, where key reaction sequences would be simulated in ab initio fashion. These calculations would not replace existing molecular mechanics approaches, but rather supplement them in those areas where they are not sufficiently reliable. This includes enzymatic reactions involving transition metal centers and other multi-center bond-reforming processes. A related area is catalysis, where the various proposed reaction mechanisms could be explicitly evaluated. Short-time finite temperature simulations can also be explored to search for unforeseen reaction patterns. The potential for new discoveries in these areas is high.

Important progress has also been made in understanding the excitation properties of solids, in particular the predictions of band offsets and optical properties. This requires the evaluation of the electron self-energy and is computationally much more demanding than the local density approaches discussed above. This first principles quasiparticle approach has allowed for the first time the ab initio calculation of electron excitation energies in solids, valid for quantitative interpretation of spectroscopic measurements. The excitations of systems as complex as C_60 fullerites have been computed. Although the quasiparticle calculations have yet to be implemented on massively parallel machines, this is doable and the gain in efficiency and power is expected to be similar to that for the ab initio molecular dynamics types of calculations.
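The central idea mentioned above, relaxing electronic and ionic degrees of freedom toward equilibrium at the same time, can be caricatured with a toy energy function. The sketch below is schematic only: the quadratic "energy", its coupling term, and the step sizes are invented for illustration and bear no quantitative relation to a real ab initio code.

    # Schematic caricature of simultaneous relaxation: both the "electronic"
    # coefficients c and the "ionic" positions R descend the same toy energy.
    # The energy function here is invented purely for illustration.
    import numpy as np

    def toy_energy(c, R):
        """A made-up smooth energy coupling electronic and ionic variables."""
        electronic = np.sum((c - 1.0) ** 2)
        ionic = np.sum((R[1:] - R[:-1] - 1.5) ** 2)    # springs between neighbours
        coupling = 0.1 * np.sum(c[:-1] * (R[1:] - R[:-1]))
        return electronic + ionic + coupling

    def relax(n=8, steps=2000, step_c=0.05, step_r=0.05, h=1e-5):
        c = np.random.rand(n)
        R = np.arange(n, dtype=float)
        for _ in range(steps):
            # Numerical gradients; a real method uses analytic forces.
            grad_c = np.array([(toy_energy(c + h * e, R) - toy_energy(c - h * e, R)) / (2 * h)
                               for e in np.eye(n)])
            grad_R = np.array([(toy_energy(c, R + h * e) - toy_energy(c, R - h * e)) / (2 * h)
                               for e in np.eye(n)])
            c -= step_c * grad_c        # electronic and ionic variables
            R -= step_r * grad_R        # move downhill together
        return c, R, toy_energy(c, R)

    if __name__ == "__main__":
        c, R, E = relax()
        print("relaxed energy:", E)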
Much effort has been devoted in the past several years to algorithm development to extend the applicability of these new methods to ever larger systems. Ab initio molecular dynamics has been successfully implemented on massively parallel machines for systems as large as 700 atoms. Tight binding molecular dynamics methods are an accurate, empirical way to include the electronic degrees of freedom which are important for covalently bonded materials, at speeds roughly 300 times faster than ab initio methods. This method has already been used to simulate 2000 atoms, and with the new massively parallel machines this number will easily increase to 10,000 within a year. Another very exciting recent development in this area is work on the so-called order N methods for electronic structure calculations. At present, quantum mechanical calculations scale at least as N^3 in the large N limit, where N is the number of atoms in a unit cell. Significant progress has been made recently by several groups in developing methods which would scale as N. The success of these approaches would further enhance our ability to study very large molecular and materials systems, including systems with perhaps thousands of atoms, in the near future.

Quantum Many-Body

The quantum many body problem lies at the core of understanding many properties of materials. Over the last decade much of the classical methodology discussed above has been extended into the quantum regime, in particular with the development of the path integral and Green's function Monte Carlo methods. Early calculations of the correlation energy of the electron gas are extensively used in local density theory to estimate the correlation energy in solids. The low temperature properties of liquid and solid helium-3 and helium-4, the simplest strongly correlated many-body quantum systems, are now well understood, thanks in large part to computer simulations. These quantum simulations have required thousands of hours on Cray-YMP class computers. While very difficult algorithmic issues remain (exact fermion methods and quantum dynamical methods, to name two), the progress in the next decade should parallel the previous developments in classical statistical mechanics. Computer simulations of quantum many-body systems will become a ubiquitous tool, integrated into theory and experiment. The software and hardware have reached a state where much larger, more complex and more realistic systems can be studied. Some particular examples are electrons in transition metals, in restricted geometries, at high temperatures and pressures or in strong magnetic fields. Mean field theory is unreliable in many of these situations. However, these applications, if they are to become routine and widely distributed in the materials science community, will require high performance hardware. Quantum simulations are naturally parallel and are likely to be among the first applications using massively parallel computers.

Thanks to J. Bernholc, D. Ceperley, J. Joannopoulos, B. Harmon and S. Louie for their help in preparing this subsection.

Computational Molecular Biology/Chemistry/Biochemistry by Barry Honig

Background

There have been a number of revolutionary developments in molecular biology that have greatly expanded the need for high performance computing within the biological community. First, there has been an exponential growth in the number of gene sequences that have been determined, and no end is in sight.
Second, there has been a parallel (although slower) growth in the number of proteins whose structures have been determined from x-ray crystallography and, increasingly, from multidimensional NMR. This explosion of new information has led to developments in areas such as statistical analysis of gene sequences and of three dimensional structural data, new databases for sequence and structural information, molecular modeling of proteins and nucleic acids, and three-dimensional pattern recognition. The recognition of grand challenge problems such as protein folding or drug design has resulted in large part from these developments. Moreover, the continuing interest both in sequencing the human genome and in the field of structural biology guarantees that computational requirements will continue to grow rapidly in the coming decade.

To illustrate the type of problems that can arise, consider the case where a new gene has been isolated and its sequence is known. In order to fully exploit this information it is necessary to first obtain maximum information about the protein this gene encodes. This can be accomplished by searching a nucleic acid sequence data base for structurally or functionally related proteins, and/or by detecting sequence patterns characteristic of the three dimensional fold of a particular class of proteins. There are numerous complexities that arise in such searches, and the computational demands imposed by the increasingly sophisticated statistical techniques that are being used can be substantial.

A variety of methods, all of them requiring vast computational resources, are currently being applied to the protein folding problem (predicting three dimensional structure from amino acid sequence). Methods include statistical analyses (including neural nets) of homologies to known structures, approaches based on physical chemical principles, and simplified lattice models of the type used in polymer physics. A major problem in understanding the physical principles of protein and nucleic acid conformation is the treatment of the surrounding solvent. Molecular dynamics techniques are widely used to model the solvent but their accuracy depends on the potential functions that are used as well as the number of solvent molecules that can be included in a simulation. Thus, the technique is limited by the available computational power. Continuum solvent models offer an alternative approach but these too are highly computer intensive. Even assuming a reliable method to evaluate free energies, the problem of conformational search is daunting. There are a large number of possible conformations available to a macromolecule and it is necessary to develop methods, such as Monte Carlo techniques with simulated annealing, to ensure that the correct one has been included in the generated set of possibilities.

A similar set of problems arises, for example, in the problem of structure-based drug design. In this case one may know the three-dimensional structure of a protein and it is necessary to design a molecule that binds tightly to a particular site on the surface. Efficient conformational search, energy evaluation and pattern recognition are requirements of this problem, all requiring significant computational power. Significant progress has been made in these and many other related areas. Ten years ago most calculations were made without including the effects of solvent.
This situation has changed dramatically due to scientific progress that has been potentiated by the availability of significant computational power. Some of this has been provided by the Supercomputer Centers while some has been made available by increasingly powerful workstations. Fast computers have also been crucial in the very process of three dimensional structure determination. Both x-ray and NMR data analysis have exploited methods such as molecular dynamics and simulated annealing to yield atomic coordinates of macromolecules. More generally, the new discipline of structural biology, which involves the structure determination and analysis of biological macromolecules, has been able to evolve due to the increased availability of high performance computing.

Future

Despite the enormous progress that has been made, the field is just beginning to take off. Gene sequence analysis will continue to become more effective as the available data continue to grow and as increasingly sophisticated data analysis techniques are applied. It will be necessary to make state-of-the-art sequence analysis available to individual investigators, presumably through distributed workstations and through access to centralized resources. This will require a significant training effort as well as the development of user-friendly programs for the biological community.

There is enormous potential in the area of three dimensional structure analysis. There is certain to be major progress in understanding the physical chemical basis of biological structure and function. Improved energy functionals resulting from progress in quantum mechanics will become available. Indeed, a combination of quantum mechanics and reaction field methods will make it possible to obtain accurate descriptions of molecules in the condensed phase. The impact of such work will be felt in chemistry as well as in biology. Improved descriptions of the solvent, through a combination of continuum treatments and detailed molecular dynamics simulations at the atomic level, will lead to much more reliable descriptions of the conformational free energies and binding free energies of biological macromolecules. When these are combined with sophisticated conformational search techniques, simplified lattice models, and sophisticated statistical techniques that identify sequence and structural homologies, there is every reason to expect major progress on the protein folding problem. There will be parallel improvements in structure-based design of biologically active compounds such as pharmaceuticals. Moreover, the development of new compounds based on biomimetic chemistry and new materials based on polymer design principles deduced from biomolecules should become a reality.

All of this progress will require increased access to high performance computing for the reasons given above. The various simulation and conformational search techniques will continue to benefit dramatically from increased computational power. This will be true at the level of individual workstations, which a bench chemist, for example, might use to design a new drug. Work of this type often requires sophisticated three dimensional graphics and will benefit from progress in this area. Massively parallel machines will certainly be required for the most ambitious projects. Indeed, it is likely that for some applications the need for raw computing power will exceed what is available in the foreseeable future. New developments in the areas covered in this section will have major economic impact.
The biotechnology, pharmaceutical and general health industries are obvious beneficiaries, but there will be considerable spin-off in materials science as well.

Recommendations

* Support the development of software in computational biology and chemistry. This should take the form of improved software and algorithms for workstations as well as the porting of existing programs and the development of new ones on massively parallel machines.

* Make funds available for training that will exploit new technologies and for familiarizing biologists with existing technologies.

* Funding should be divided among large centers, smaller centers involving a group of investigators at a few sites developing new technologies, and individual investigators.

Molecular Modeling and Quantum Chemistry by William Lester

The need for high performance computing has been met historically by large vector supercomputers. It is generally agreed in the computer and computational science communities that significant improvements in computational efficiency will arise from parallelism. The advent of conventional parallel computer systems has generally required major restructuring of computer codes to move applications from serial vector computers to parallel architectures. The move to parallelism has occurred in two forms: distributed MIMD machines and clusters of workstations, with the former receiving the focus of attention in large multi-user center facilities and the latter in local research installations.

The tremendous interest in the simulation of biological processes at the molecular level using molecular mechanics and molecular dynamics methods has led to a continuing increase in the demand for computational power. Applications have potentially high practical value and include, for example, the design of inhibitors for enzymes that are suspected to play a role in disease states and the effect of various carcinogens on the structure of DNA. In the first case, one expects that a molecule designed to conform within the three-dimensional arrangement of the enzyme structure should be bound tightly to the enzyme in solution. This requirement, and others, make it desirable to know the tertiary structure of the enzyme. The use of computation for this purpose is contributing significantly to the understanding of those structures, which are then used to guide organic synthesis. In the second case, serial vector supercomputers typically can carry out molecular dynamics simulations of DNA for time frames of only picoseconds to nanoseconds. A recent 200-ps calculation involving 3542 water molecules and 16 sodium ions took 140 hours of Cray Y-MP time. Extending such calculations to the millisecond or even the second range where important motions can occur remains a major computational challenge that will require the use of massively parallel computer systems.

In molecular mechanics or force field methods, computational effort is dominated by the evaluation of the force field that gives the potential energy as a function of the internal coordinates of the molecules and the non-bonded interactions between atoms. For small organic molecules, a popular alternative is the ab initio Hartree-Fock (HF) method, which has come into routine use by organic and medicinal chemists to study compounds and drugs. The HF method is ab initio because the calculation depends only on basic information about the molecule, including the number and types of atoms and the total charge.
The computational effort of HF computations scales as N^4, where N is the number of basis functions used to describe the atoms of the molecule. Because the HF method describes only the "average" behavior of electrons, it typically provides a better description of relative geometries than of energetics. The accurate treatment of the latter requires proper account of the instantaneous correlated motions of electrons, which is inherently not described by the HF method.

For systems larger than those accessible with the HF method, one has, in addition to molecular mechanics methods, semiempirical approaches. Their name arises out of the use of experimental data to parameterize integrals and other simplifications of the HF method, leading to a reduction of computational effort to order N^3. Results of these methods can be informative for systems where parametrization has been performed. Recently, the density functional (DF) method has become popular, overcoming deficiencies in accuracy for chemical applications that limited earlier use. Improvements have come in the form of better basis sets, advances in computational algorithms for solving the DF equations, and the development of analytical geometry optimization methods. The DF method is an ab initio approach that takes into account electron correlation. In view of the latter capability, it can be used to study a wide variety of systems, including metals and inorganic species.

The move to parallel systems has turned out to be a major undertaking in software development for the approaches described. Serious impediments have been encountered in algorithm modification for methods that go beyond the ab initio HF method, and in steps to maximize efficiency with increased numbers of processors. These circumstances have increased interest in quantum Monte Carlo (QMC) methods for electronic structure. In addition, QMC methods have been used with considerable success for the calculation of vibrational eigenvalues, and in statistical mechanics studies. QMC, as used here in the context of electronic structure, is an ab initio method for solving the Schroedinger equation stochastically, based on the formal similarity between the Schroedinger equation and the classical diffusion equation. The power of the method is that it is inherently an N-body method that can capture all of the instantaneous correlation of the electrons. The QMC method is readily ported to parallel computer systems with orders of magnitude savings in computational effort over serial vector supercomputers.

In statistical mechanical studies of complex systems, one is often interested in the spontaneous formation and energetics of structure over large length scales. The mesoscopic structures, such as vesicles and lamellae, formed from self assembly in oil/water-surfactant mixtures are important examples. The systematic analysis of these phenomena has only recently begun, and computer simulation is one of the important tools in this analysis. Due to the large length scales involved, simulation is necessarily confined to very simple classes of models. Even so, the work presses the capabilities of current computational equipment to its limits. While the stability of various nontrivial structures has been documented, we are still far from understanding the rich phase diagram of such systems. The work of Smit and coworkers using transputers demonstrates the utility of parallelization in these simulations. Future equipment should carry us much further towards understanding.
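To indicate what "very simple classes of models" means in practice, the sketch below runs a bare-bones Metropolis Monte Carlo simulation of a two-dimensional Ising-type lattice model; it is illustrative only and is not the model used in the studies cited above. The same accept/reject sampling loop, applied to far larger lattices and richer interaction rules, is what consumes the computer time in such work.

    # Minimal Metropolis Monte Carlo for a 2D Ising-type lattice model.
    # Illustrative only: real studies of self-assembly use much larger systems
    # and richer interaction rules, but the sampling loop has the same shape.
    import numpy as np

    def metropolis(L=32, beta=0.5, sweeps=200, rng=None):
        rng = rng or np.random.default_rng(0)
        spins = rng.choice([-1, 1], size=(L, L))
        for _ in range(sweeps * L * L):
            i, j = rng.integers(0, L, size=2)
            # Energy change for flipping one spin (nearest-neighbour coupling J = 1).
            nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j] +
                  spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
            dE = 2.0 * spins[i, j] * nn
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                spins[i, j] *= -1                  # accept the trial flip
        return spins

    if __name__ == "__main__":
        s = metropolis()
        print("magnetization per site:", s.mean())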
Competing interactions and concomitant "frustration" characterize complex fluids and the resulting mesoscopic structures. Such competition is also a central feature of "random polymers" - a model for proteins and also for manufactured polymers. Computer simulation studies of random polymers can be extremely useful, though here too the computations of even the simplest models press the limits of current technology. To treat this class of systems, Binder and coworkers and Frenkel and coworkers have developed new algorithms, some of which are manifestly parallelizable. Thus, this area is one where the new computer technology should be very helpful.

Along with large length scale fluctuations, as in self assembly and polymers, simulations press current computational equipment when relaxation occurs over many orders of magnitude in time. This is the phenomenon of glasses. Here, the work of Fredrickson on the spin-facilitated Ising system demonstrates the feasibility of parallelization in studying long time relaxation and the glass transition by simulation.

Polymers and glasses are examples pertinent to the understanding and design of advanced materials. In addition, one needs to understand the electronic and magnetic behavior of these and other condensed matter systems. In recent years, a few methods have appeared, especially the Car-Parrinello approach, which now make feasible the calculation of electronic properties of complex materials. The calculations are intensive. For example, studying the dynamics and electronic structure of a system with only 64 atoms, periodically replicated, for only a picosecond is at the limits of current capabilities. With parallelization, and simplified models, one can imagine, however, significant progress in our understanding of metal-insulator transitions and localization in correlated disordered systems.

Mathematics and High Performance Computing by James Sethian

A. Introduction

Mathematics underlies much, if not all, of high performance computing. At first glance, it might seem that mathematics, with its emphasis on theorems and proofs, might have little to contribute to solving large problems in the physical sciences and engineering. On the contrary, in the same way that mathematics contributes the underlying language for problems in the sciences, engineering, discrete systems, etc., mathematical theory underlies such factors as the design and understanding of algorithms, error analysis, approximation accuracy, and optimal execution. Mathematics plays a key role in the drive to produce faster and more accurate algorithms which, in tandem with hardware advances, produce state-of-the-art simulations across the wide spectrum of the sciences. At the same time, high performance computing provides a valuable laboratory tool for many areas of theoretical mathematics such as number theory and differential geometry. At the heart of most simulations lie a mathematical model and an algorithmic technique for approximating the solution to this model. Aspects of such areas as approximation theory, functional analysis, numerical analysis, probability theory, and the theory of differential equations provide valuable tools for designing effective algorithms, assessing their accuracy and stability, and suggesting new techniques. What is so fascinating about the intertwining of computing and mathematics is that each invigorates the other.
For example, understanding of the entropy properties of differential equations has led to new methods for high resolution shock dynamics, approximation theory using multipoles has led to fast methods for N-body problems, methods from hyperbolic conservation laws and differential geometry have produced exciting schemes for image processing, parallel computing has spawned new schemes for numerical linear algebra and multi-grid techniques, and methods designed for tracking physical interfaces have launched new theoretical investigations in differential geometry, to name just a few. Along the way, this interrelation between mathematics and computing has brought breakthroughs in such areas as material science (such as new schemes for solidification and fracture problems), computational fluid dynamics (e.g., high order projection methods and sophisticated particle schemes), computational physics (such as new schemes for Ising models and percolation problems), environmental modeling (such as new schemes for groundwater transport and pollutant modeling) and combustion (e.g., new approximation models and algorithmic techniques for flame chemistry/fluid mechanical interactions).

B. Current State

Mathematical research which contributes to high performance computing exists across a wide range. On one end are individual investigators or small, joint collaborations. In these settings, the work takes a myriad of forms; brand-new algorithms are invented which can save an order of magnitude in computer resources, existing techniques are analyzed for convergence properties and accuracy, and model problems are posed which can isolate particular phenomena. For example, such work includes analysts working on fundamental aspects of the Navier-Stokes equations and turbulence theory (cf. the following section on Computational Fluid Dynamics), applied mathematicians designing new algorithms for model equations, discrete mathematicians focused on combinatorics problems, and numerical linear algebraists working in optimization theory. At the other end are mathematicians working in focused teams on particular problems, for example, in combustion, oil recovery, aerodynamics, material science, computational fluid dynamics, operations research, cryptography, and computational biology.

Institutionally, mathematical work in high performance computing is undertaken at universities, the National Laboratories, the NSF High Performance Computing Centers, NSF Mathematics Centers (such as the Institute for Mathematical Analysis and the Mathematical Sciences Research Institute, and the Institute for Advanced Study), and across a spectrum of industries. In recent years, high performance computing has become a valuable tool for understanding subtle aspects of theoretical mathematics. For example, computing has revolutionized the ability to visualize and evolve complex geometric surfaces, provided techniques to untie knots, and helped compute algebraic structures.

C. Recommendations

Mathematical research is critical to ensure state-of-the-art computational and algorithmic techniques which foster the efficient use of national computing resources. In order to promote this work, it is important that:

1. Mathematicians be supported in their need to have access to the most advanced computing systems available, through networks to supercomputer facilities, on-site experimental machines (such as parallel processors), and individual high-speed workstations.
2. State-of-the-art research in modeling, new algorithms, applied mathematics, numerical analysis, and associated theoretical analysis be amply supported; it is this work that continually rejuvenates computational techniques. Without it, yesterday's algorithms will be running on tomorrow's machines.

3. Such research be supported at all levels: the individual investigator, small joint collaborations, interdisciplinary teams, and large projects.

4. Funding be significantly increased in the above areas, both to foster frontier research in computational techniques and to use computation as a bridge to bring mathematics and the sciences closer together.

Computational Fluid Dynamics by James Sethian

Introduction

The central goal of computational fluid dynamics (CFD) is to follow the evolution of a fluid by solving the appropriate equations of motion on a numerical computer. The fundamental equations require that the mass, momentum, and energy of a liquid or gas be conserved as the fluid moves. In all but the simplest cases, these equations are too difficult to solve mathematically, and instead one resorts to computer algorithms which approximate the equations of motion. The yardstick of success is how well the results of numerical simulation agree with experiment in cases where careful laboratory experiments can be established, and how well the simulations can predict highly complex phenomena that cannot be isolated in the laboratory.

The effectiveness and versatility of a computational fluid dynamics simulation rests on several factors. First, the underlying model must adequately describe the essential physics. Second, the algorithm must accurately approximate the equations of motion. Third, the computer program must be constructed to execute efficiently. And fourth, the computer equipment must be fast enough and large enough to calculate the answers sufficiently rapidly to be of use. Weaving these factors together, so that answers are accurate, reliable, and obtained at acceptable cost in an acceptable amount of time, is both an art and a science. Current uses of computational fluid dynamics range from basic research into fundamental physics to commercial applications. While the boundaries are not sharp, CFD work may be roughly categorized in three ways: Fundamental Research, Applied Science, and Industrial Design and Manufacturing.

CFD and Fundamental Research

At many of the nation's research universities and national laboratories, much of the focus of CFD work is on fundamental research into fluid flow phenomena. The goal is to understand the role that fluid motion plays in such areas as the evolution of turbulence in the atmosphere and in the oceans, the birth and evolution of galaxies, the atmospheres of other planets, the formation of polymers, physiological fluid flow in the body, and the interplay of fluid mechanics and material science, such as in the physics of superconductors. In these simulations, often employing the most advanced and sophisticated algorithms, the emphasis is on accurate solutions and basic insight. These calculations are often among the most expensive of all CFD simulations, requiring many hundreds of hours of computer time on the most advanced machines available for a single simulation. The modeling and algorithmic techniques for such problems are constantly under revision and refinement.
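For orientation, the conservation laws referred to in the introduction above can be stated concretely. In the incompressible, constant-density case (where the energy equation decouples), a standard form of the governing equations is the following, written here in LaTeX notation purely for reference; the simulations described in this section solve many variants and generalizations of these equations:

    \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}
      = -\frac{1}{\rho}\nabla p + \nu\,\nabla^{2}\mathbf{u} + \mathbf{f},
    \qquad
    \nabla\cdot\mathbf{u} = 0,

where u is the fluid velocity, p the pressure, rho the (constant) density, nu the kinematic viscosity, and f any body force; the first equation expresses conservation of momentum and the second conservation of mass.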
For the most part, the major advances in new algorithmic tools, from schemes to handle the associated numerical linear algebra to high order methods to approximate difference equations, have their roots in basic research into CFD applied to fundamental physics.

CFD and Applied Science

Here, the main emphasis is on applying the tools of computational fluid dynamics to specific problems arising in natural phenomena or physical processes. Such work might include detailed studies of the propagation of flames in engines or of fires in closed rooms, the fluid mechanics involved in the dispersal of pollutants or in toxic groundwater transport, the hydraulic response of a proposed heart valve, the development of severe storms in the atmosphere, and the aerodynamic properties of a proposed space shuttle design. For the most part, this work is also carried out at the national laboratories and universities with government support. Cost is not a major issue in these investigations; the focus is on obtaining answers to directed questions. Less concerned with the algorithm for its own sake, this work links basic research with commercial CFD applications, and provides a stepping stone for advances in algorithms to propagate into industrial sectors.

CFD and Industrial Design and Manufacturing

The focus in this stage of the process is on applying the tools of computational fluid dynamics to solve problems that directly relate to technology. A vast array of examples exists, such as the development of a high-speed inkjet plotter, the action of slurry beds for processing minerals, analysis of the aerodynamic characteristics of an automobile or airplane, efficiency analysis of an internal combustion engine, performance of high-speed computer disk drives, and optimal pouring and packaging techniques in manufacturing. For the most part, such work is carried out in private industry, often with only informal ties to academic and government scientists. Communication of new ideas rests loosely on the influx of new employees trained in the latest techniques, journal articles, and professional conferences. A distinguishing characteristic of this work is its emphasis on turn-around time and cost. Here, cost is not only the cost of the equipment to perform the calculation, but the person-years involved in developing the computer code and the time involved in performing many hundreds of simulations as part of a detailed parameter study. This motivation is quite different from that in the other two areas. The need to perform a large number of simulations under extremely general circumstances may mean that a simple and fast technique that attacks only a highly simplified version of the problem is preferable to a highly sophisticated and accurate technique that requires many orders of magnitude more computational effort. This orientation lies at the heart of the applicability and suitability of computational techniques in a competitive industry.

Future of CFD

* Modeling and Algorithmic Issues. In many ways, the research, applied science, and industrial agendas in CFD have changed, mostly in response to increased computational power coupled with significant algorithmic and theoretical/numerical advances. On the research side, up through the early 1980's, the emphasis was on basic discretization methods.
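As a concrete illustration of the flavor of that era's work, the sketch below advances the one-dimensional linear advection equation u_t + a u_x = 0 with a first-order upwind scheme (written in Python purely for illustration; the grid size, wave speed, and time step are arbitrary choices satisfying the usual CFL stability condition):

    # First-order upwind scheme for u_t + a u_x = 0 with a > 0 and
    # periodic boundaries.  Illustrative only; parameters are arbitrary.
    import math

    n, a = 200, 1.0                  # number of cells, advection speed
    dx = 1.0 / n
    dt = 0.5 * dx / a                # CFL number 0.5
    u = [math.exp(-100.0 * (i * dx - 0.5) ** 2) for i in range(n)]  # pulse

    for step in range(400):
        # u[i-1] with i = 0 wraps to u[n-1], giving periodic boundaries
        u = [u[i] - a * dt / dx * (u[i] - u[i - 1]) for i in range(n)]

    # After 400 steps the pulse has traveled a distance a * 400 * dt = 1.0,
    # i.e., once around the periodic domain, but visibly smeared by the
    # scheme's numerical diffusion -- precisely the kind of behavior that
    # error analysis of simple model problems was designed to expose.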
In that setting, it was possible to develop methods based on looking at simple problems in two, or even one, space dimensions, in simple geometries, and in a fair degree of isolation from the fluid dynamics applications. Over the last five years, however, there has been a transition to the next generation of problems. These problems are more difficult in part because they attempt more refined and detailed simulations, necessitating finer grids and more computational elements. However, a more fundamental issue is that these problems are qualitatively different from those previously considered. To begin, they often involve complex and less well-understood physical models - chemically reacting fluid flows, flows involving multiphase or multicomponent mixtures of fluids, or other complex constitutive behavior. They are often set in three dimensions, in which both the solution geometry and the boundary geometry are more complicated than in two dimensions. Finally, they often involve resolving multiple length and time scales, such as boundary and interior layers coming from small diffusive terms, and intermittent large variations that arise in fluid turbulence. Algorithmically, these problems require work in several areas. Complex physical behavior makes it necessary to develop a deeper understanding of the physics and modeling than had previously been required. Additional physics requires the theory and design of complex boundary conditions to couple the fluid mechanics to the rest of the problem. Multiple length scales and complex geometries lead to dynamically adaptive methods, since memory and compute power are still insufficient to brute-force through most problems. For example, the wide variation in length and time scales in turbulent combustion requires a host of iterative techniques, stiff ODE solvers, and adaptive techniques. Further algorithmic advances are required in areas such as domain decomposition, grid partitioning, error estimation, and preconditioners and iterative solvers for non-symmetric, non-diagonally dominant matrices. Mesh generation, while critical, has not progressed far. All in all, to accomplish in three-dimensional complex flow what is now routine in two-dimensional basic flow will require theory, numerics, and considerable cleverness. The net effect of these developments is to make the buy-in to perform CFD research much higher. Complex physical behavior makes it necessary to become more involved with the physics modeling than had previously been the case. The problems are sufficiently difficult that one cannot blindly throw them at the computer and overwhelm them with raw computing power. A considerable degree of mathematical, numerical, and physical understanding must be obtained about the problems in order to obtain efficient and accurate solution techniques. On the industrial side, truly complex problems are still out of reach. For example, in the aircraft industry, we are still a long way from a full Navier-Stokes simulation of high Reynolds number unsteady flow around a commercial aircraft. Off in the distance are problems of takeoff and landing, multiple wings in close proximity to each other, and flight recovery from sudden changes in conditions. In the automotive industry, a solid numerical simulation of the complete combustion cycle (as opposed to a time-averaged transport model) is still many years away. Other automotive CFD problems include analysis of coolant flows, heat transfer, plastic mold problems, and sheet metal forming.

* High Performance Computing Issues.
The computing needs for the next five years of CFD work are substantial. As an example, an unforced Navier-Stokes simulation might require 1000 cells in each of three space dimensions. Presuming 100 flops per cell per time step and 25,000 time steps, this yields 10^9 x 10^2 x 2.5 x 10^4 = 2.5 x 10^15 flops; completing such a run in a typical compute time of 2 hours would then require a machine sustaining on the order of 300 gigaflops. Adding forcing, combustion, or other physics greatly enlarges this calculation. As a related issue, memory requirements pose an additional problem. Ultimately, more compute power and more memory are needed. The promise of a single, much faster, larger vector machine is no longer being made convincingly, and CFD is attempting to adapt accordingly. Here is an area where the dream of parallelism is both tantalizing and frustrating. To begin, parallel computing has naturally placed emphasis on issues related to processor allocation and load balancing. To this end, accounting for communication costs (as opposed to simply accounting for floating point costs) has become important in program design. For example, parallel machines can often have poor cache memory management and a limited number of paths to and from main memory; this can imply a long memory fetch/store time, which can result in actual computational speeds for real CFD problems far below the optimal peak-speed performance. To compensate, parallel computers tend to be less memory efficient than vector machines, as space is exchanged for communication time (duplicating data where possible rather than sending it between processors). The move to parallel machines is further complicated by the fact that millions of lines of CFD code have been written in the serial/vector style. The instability of the hardware platforms, the lack of standard, portable high performance Fortran and C, the lack of complete libraries, and the insecurity associated with a volatile industry all contribute to the caution and reluctance of all but the most advanced research practitioners of CFD.

Recommendations

In order to tackle the next generation of CFD problems, the field will require:

* Significant access to the fastest current coarse-grained parallel machines.

* Massively parallel machines with large memory, programmable under stable programming environments, including high performance Fortran and C, mathematical libraries, functioning I/O systems, and advanced visualization systems.

* Algorithmic advances in adaptive meshing, grid generation, load balancing, and high order difference, element, and particle schemes.

* Modeling and theoretical advances coupling fluid mechanics to other related physics.

High Performance Computing In Physics by James Sethian and Neal Lane

INTRODUCTION AND BACKGROUND

High Energy Physics

Two areas in which high performance computing plays a crucial role are lattice gauge theory and the analysis of experimental data. Lattice gauge theory addresses some of the fundamental theoretical problems in high energy physics, and is relevant to experimental programs in high energy and nuclear physics. In the standard model of high energy physics, the strong interactions are described by quantum chromodynamics (QCD). In this theory the forces are so strong that the fundamental entities, the quarks and gluons, are not observed directly under ordinary laboratory conditions.
Instead one observes their bound states, protons and neutrons, the basic constituents of the atomic nucleus, and a host of short-lived particles produced in high energy accelerator collisions. One of the major objectives of lattice gauge theory is to calculate the masses and other basic properties of these strongly interacting particles from first principles, providing a test of QCD; the same tools can then be used to calculate additional physical quantities that may not be so well determined experimentally. In addition, lattice gauge theory provides an avenue for making first-principles calculations of the effects of strong interactions on weak interaction processes, and thus holds the promise of providing crucial tests of the standard model at its most vulnerable points. And, although quarks and gluons are not directly observed in the laboratory, it is expected that at extremely high temperatures one would find a new state of matter consisting of a plasma of these particles. The questions being addressed by lattice gauge theorists include the nature of the transition between the lower temperature state of ordinary matter and the high temperature quark-gluon plasma, the temperature at which this transition occurs, and the properties of the plasma. In the general area of experimental high energy physics, there are three primary areas of computing: 1) the processing of the raw data, which is usually accumulated at a central accelerator laboratory, such as the Wilson Laboratory at Cornell; 2) the simulation of physical processes of interest, and the simulation of the behavior of the final states in the detector; and 3) the analysis of the compressed data that results from processing of the raw data and the simulations.

Atomic and Molecular Physics

In contrast to some areas of theoretical physics, the AM theorist has the advantage of knowing the basic equations governing the evolution of the system of particles under consideration. However, the wealth of phenomena that derive from the many-body interactions of the constituents and from their interactions with external probes such as electric and magnetic fields is truly astounding. Computation now provides a practical and useful alternative method to study these problems. Most importantly, it is now possible to perform calculations sophisticated enough to have a real impact on AM science. These include high precision computations of the ground and excited states of small molecules, scattering of electrons from atoms, atomic ions, and small polyatomic molecules, simple chemical reactions involving atom-diatom collisions, photoionization and photodissociation, and various time-dependent processes such as multiphoton ionization and the interaction of atoms with ultra-strong (or ultra-short) electromagnetic fields.

Gravitational Physics

The computational goal of classical and astrophysical relativity is the solution of the associated nonlinear partial differential equations. For example, simulations have been performed of the critical behavior in black hole formation, using high-accuracy adaptive grid methods that follow the collapse of spherical scalar wave pulses over at least fourteen orders of magnitude; of the structure we currently see in the universe (galaxies, clusters of galaxies arranged in sheets, voids, etc.) and how it may have arisen through fluctuations generated during an inflationary epoch; of head-on collisions of black holes; and of horizon behavior in a number of black hole configurations.
CURRENT STATE

Definitive calculations in all of the areas mentioned above would require significantly greater computing resources than have been available up to now. Nevertheless, steady progress has been made during the last decade due to important improvements in algorithms and calculational techniques, and to very rapid increases in available computing power. For example, among the major achievements in lattice gauge theory have been: a demonstration that quarks and gluons are confined at low temperatures; steady improvements in spectrum calculations, which have accelerated markedly in the last year; an estimate of the transition temperature between the ordinary state of matter and the quark-gluon plasma; a bound on the mass of the Higgs boson; calculations of weak decay parameters, including a determination of the mixing parameter; and a determination of the strong coupling constant at the energy scale of 5 GeV from a study of the charmonium spectrum. Much of this work has been carried out at the NSF Supercomputer Centers.

In the area of atomic and molecular physics, until quite recently most AM theorists were required to compute using simplified models or, if the research was computationally intensive, to use vector supercomputers. This has changed with the widespread availability of cheap, fast, Unix-based RISC workstations. These "boxes" are now capable of performing at the 40-50 megaflop level and can have as much as 256 megabytes of memory. In addition, it is possible to cluster these workstations and distribute the computational task among the CPUs. There are a few researchers in the U.S. who have as many as twenty or thirty of these workstations for their own group. This has enabled computational experiments using a loosely coupled, parallel model on selected problems in AM physics. However, this is not the norm in AM theory. More typically, the most computationally intensive calculations are still performed on the large mainframe vector supercomputers available only to a limited number of users. The majority of AM researchers in the country are still computing on single workstations or (outdated) mainframes of one kind or another.

FUTURE COMPUTING NEEDS

In the field of high energy physics, the processing of raw data and the simulation of generic mixing processes are activities that are well suited to a centralized computer center. The processing of the raw data should be done in a consistent, organized, and reliable manner. Often, in the middle of the data processing, special features in some data are discovered, and these need to be treated quickly and in a manner consistent with the entire sample. It is usual and logical that the processing take place at the central accelerator center where the apparatus sits, because the complete records of the data taking usually reside there. In contrast, the high energy physics work in simulation of specific processes and analysis of compressed data is well matched to individual university groups. In terms of computing needs, all of the above are well served either by a large, powerful, central computer system, or by a cluster or farm of workstations. For example, CERN does the majority of its computing on central computers, while the Wilson Lab has a farm of DECstation 5000/240s to process its raw data. High energy physics computing usually proceeds one event at a time; each event, whether from simulation or raw data, can be dealt to an individual workstation for processing.
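A minimal sketch of this one-event-per-processor pattern follows, written in Python purely for illustration; the reconstruction function and the toy "event" records are invented for this example, and production codes of the period were of course written in Fortran or C:

    # Illustrative task-farm pattern: independent events are dealt out to
    # a pool of worker processes, mirroring the way a farm of workstations
    # can each reconstruct one event at a time.  All names are hypothetical.
    from multiprocessing import Pool

    def reconstruct(event):
        # Stand-in for per-event reconstruction: here we just sum the
        # simulated detector hits carried by the event record.
        return (event["id"], sum(event["hits"]))

    if __name__ == "__main__":
        events = [{"id": i, "hits": [i % 7, (i * 3) % 11, (i * 5) % 13]}
                  for i in range(10000)]          # toy "raw data"
        with Pool(processes=8) as pool:           # 8 workers ~ 8 workstations
            results = pool.map(reconstruct, events)
        print(len(results), "events processed")

Because the events are independent, the same pattern scales naturally from a single multiprocessor box to a farm of workstations or a massively parallel machine.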
There is no need to put the entire resources of a supercomputer on one event. However, massively parallel computers have the potential to handle large numbers of events simultaneously in an efficient manner. These machines will also be of importance in the work on lattice gauge theory. Conversely, university groups are well served by the powerful workstations now available, which have freed them from dependence on the central laboratory to study the physics signals of interest to them. These groups need both fast CPUs, for simulation of data, and relatively large and fast disk farms, for repeated processing of the compressed data.

In the field of atomic and molecular modeling, the future lies in the use of massively parallel, scalable multicomputers. AM theorists, with rare exceptions, have not been as active as other disciplines in moving to these platforms. This is to be contrasted with the quantum chemists, the lattice QCD theorists, and many materials scientists, who are becoming active users of these computers. The lack of portability of typical AM codes and the need to expend a great deal of time and effort in rewriting or rethinking algorithms have prevented a mass migration to these platforms.

Accomplishments in Computer Science and Engineering since the Lax Report by Mary K. Vernon

While vector multiprocessors have been the workhorses for many fields of computational science and engineering over the past ten years, research in computer science and engineering has been focused both on improving the capabilities of these systems and on developing the next generation of high-performance computing systems -- namely the scalable, highly parallel computers which have recently been commercially realized as systems such as the Intel Paragon, the Kendall Square Research KSR-1, and the Thinking Machines Corporation CM-5. A variety of factors make scalable, highly parallel computers the only viable way to achieve the teraflop capability required by Grand Challenge applications. These systems represent far more than an evolutionary step from their modestly parallel vector predecessors. Realizing the teraflop potential of massively parallel systems requires advances in a broad range of computer science and engineering subareas, including VLSI, computer architecture, operating systems, runtime systems, compilers, programming languages, and algorithms. The development of these new capabilities in turn requires computationally intensive experimentation and/or simulations that have been carried out on experimental prototypes (e.g., the NYU Ultracomputer), early commercial parallel machines (such as the BBN Butterfly or the Intel iPSC/2), and more recently on high-performance workstations as well as the emerging massively parallel systems such as the Thinking Machines CM-5 and the Intel Paragon. Computer science and engineering researchers have made tremendous progress in the past ten years in the development of high performance computing technology, including the development of ALL of the major technologies in massively parallel systems. Among the specific accomplishments are:

* development of RISC processor technology and the compiler technology for RISC processors, which is used in all high performance workstations as well as in the massively parallel machines.

* development of computer-aided tools to facilitate the design, testing, and fabrication of complex digital systems and their constituent components.
* invention of the multicomputer and development of the message-passing programming paradigm which is used in many of today's massively parallel systems (a minimal sketch of this programming style appears at the end of this section).

* refinement of shared memory architectures and the shared memory programming model which is used in the KSR-1, the Cray T3D, and other emerging massively parallel machines.

* invention of the hypercube interconnection network and refinement of this network to the lower-dimensional (2-D and 3-D) mesh networks that are currently used in the Intel Paragon and the Cray T3D.

* invention of the fat-tree interconnection network which is currently used in the Thinking Machines CM-5.

* refinement of the SIMD architecture which is used, for example, in the Thinking Machines CM-2 and the MasPar MP-1.

* invention and refinement of the SPMD and data parallel programming models which are supported in several massively parallel systems.

* development of the technology underlying the mature compilers for vector machines (i.e., compilers whose delivered performance is a substantial fraction of the theoretical peak performance of these machines).

* development of the technology underlying all of the existing compilers for parallel machines.

* development of the Mach operating system, which provided the basis for the OSF/1 standard used, for example, in the Intel Paragon.

* development of lightweight and wait-free synchronization primitives.

* development of performance debugging tools.

* development of high-performance database technologies, including both algorithms and architectures that have influenced emerging systems, for example from NCR, Teradata, and IBM.

* development of parallel algorithms for high-performance optimization.

* development of parallel algorithms for numerical linear algebra.

* development of machine learning technology for computational biology.

In other words, key hardware, system software, and algorithm technologies are directly the result of computer science and engineering research across a broad range of subdisciplines. Much of this work has been highly experimental, and has made extensive use of current-generation, early commercial, and prototype high performance systems. For example, simulations of next-generation architectures and multi-user database systems, as well as the development and testing of new algorithms for large-scale optimization, numerical linear algebra, and computational biology, often require days of simulation time on the most advanced platforms available. Research efforts today are focused on improving the capabilities, performance, and ease of use of parallel machine technology, including the capabilities of workstation networks. Experiments to evaluate the technology for next-generation systems, like many other applications that would be classified as "computational engineering", require the highest performance systems available. In addition, simulation and/or testing of innovations in computer architecture or operating systems sometimes involves modifications to the host hardware and/or operating software. These modifications can be developed and debugged on medium-scale versions of the high-end systems. Support for such initial development, as well as porting working modifications to larger-scale systems for further testing, is critical to the rapid development of new HPC technologies.
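As promised in the list above, the following is a minimal sketch of the SPMD, message-passing style of programming: every process runs the same program on its own slice of the data and sends a partial result to process 0, which combines them. It is written in Python purely for illustration, with the standard multiprocessing module standing in for the message-passing libraries of the period; all names are invented for this example:

    # SPMD sketch: each "rank" owns a slice of the global array, computes
    # a local partial sum of squares, and sends it to rank 0 over a pipe.
    # Illustrative only; a real code would use a message-passing library.
    from multiprocessing import Process, Pipe

    def worker(rank, nprocs, conn, data):
        chunk = data[rank::nprocs]             # this rank's slice
        partial = sum(x * x for x in chunk)    # local computation
        conn.send((rank, partial))             # "message" to rank 0
        conn.close()

    if __name__ == "__main__":
        nprocs = 4
        data = list(range(1000000))
        pipes = [Pipe() for _ in range(nprocs)]
        procs = [Process(target=worker, args=(r, nprocs, pipes[r][1], data))
                 for r in range(nprocs)]
        for p in procs:
            p.start()
        total = sum(pipes[r][0].recv()[1] for r in range(nprocs))
        for p in procs:
            p.join()
        print("global sum of squares:", total)

The essential point, independent of the particular library, is that data and work are partitioned across processes that coordinate only through explicit messages; this is the programming model that many of the massively parallel systems named above support.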