High Performance System Acquisition: Building a More Inclusive Computing Environment for Science and Engineering
The NSF's vision for Advanced Computing Infrastructure (ACI), part of its Cyberinfrastructure for 21st Century Science and Engineering (CIF21), focuses specifically on ensuring that the science and engineering community has ready access to the advanced computational and data-driven capabilities required to tackle the most complex problems and issues facing today's scientific and educational communities. Accomplishing these goals requires advanced computational capabilities within the context of a multilevel, comprehensive and innovative infrastructure that benefits all fields of science and engineering. Previous solicitations have concentrated on enabling petascale capability through the deployment and support of a world-class High Performance Computing (HPC) environment. In the past decade, the NSF has provided the open science and engineering community with a number of state-of-the-art HPC assets, ranging from loosely coupled clusters to large-scale instruments with many thousands of computing cores communicating via fast interconnects, and more recently diverse heterogeneous architectures. Recent developments in computational science have begun to focus on complex, dynamic and diverse workflows, some of which involve applications that are extremely data intensive and may not be dominated by floating point operation speed. While a number of the earlier acquisitions have addressed a subset of these issues, the current solicitation emphasizes them even further.
Currently the NSF operates, through Blue Waters and the eXtreme Digital (XD) program, a comprehensive, two-tiered, distributed Cyberinfrastructure (CI), one of the largest and most powerful in the world. Through these and related projects, the open science and engineering community is currently capable of tackling many of the most challenging scientific problems across multiple science and engineering domains. Both of these tiers are explicitly designed to address needs beyond the campus level. With this solicitation, NSF intends to continue this model and to broaden the CI capabilities above the campus level. The resources funded under this solicitation will be incorporated into, and allocated as part of, the XD tier of national shared resources. The XD tier currently consists of:
- The Extreme Science and Engineering Discovery Environment (XSEDE) - Responsible for integration of XD tier shared resources and services
- Technical Insertion Service - Evaluates and makes recommendations on insertion of software and other technologies into the XD environment
- Technical Audit Service - Provides metrics on XD systems and operates XDMoD, a publicly available and easily usable tool for extracting data and monitoring XD systems
- Two visualization resources, Longhorn (TACC) and RDAV (NICS/University of Tennessee)
The current solicitation requests innovative proposals of two types:
The first is intended to complement previous NSF investments in advanced computational infrastructure. Consistent with the ACI Strategic Plan, the current solicitation is focused on expanding the use of high end resources to a much larger and more diverse community. To quote from the ACI Strategic Plan, the goal is to "...position and support the entire spectrum of NSF-funded communities... and to promote a more comprehensive and balanced portfolio... to support multidisciplinary computational and data-enabled science and engineering that in turn supports the entire scientific, engineering and educational community." Thus, while continuing to provide essential and needed resources to the more traditional users of HPC, it is important to enlarge the horizon to include research communities that are not users of traditional HPC systems but would benefit from advanced computational capabilities at the national level. Building, testing, and deploying these resources within the collaborative ecosystem that encompasses national, regional and campus resources remains a high priority for the NSF and one of increasing importance to the science and engineering community.
The second type addresses the increasing pressure on the existing infrastructure to store and process very large amounts of data coming from simulation and from experimental resources such as telescopes, genome data banks and sensors. As recently stated in BIGDATA (NSF 12-499), "Pervasive sensing and computing across natural, built, and social environments is generating heterogeneous data at unprecedented scale and complexity. Today, scientists, biomedical researchers, engineers, educators, citizens and decision-makers live in an era of observation: data come from many disparate sources, such as sensor networks; scientific instruments, such as medical equipment, telescopes, colliders, satellites, environmental networks, and scanners; video, audio, and click streams; financial transaction data; email, weblogs, twitter feeds, and picture archives; spatial graphs and maps; and scientific simulations and models. This plethora of data sources has given rise to a phenomenal diversity in data types; data can be temporal, spatial, or dynamic and can be derived from both structured and unstructured sources. Data may have different representation types, media formats, and levels of granularity, and may be used across multiple scientific disciplines. These new sources of data and their increasing complexity contribute to an explosion of information."
A critical aspect of the cyberinfrastructure required to deal with this data deluge is that the data must be rapidly available to researchers geographically separated from where those data resources are located. In addition, it is important to ensure that the data is secure. To address this need, the current solicitation is designed to complement these other solicitations and calls for the acquisition and support of a large scale instrument capable of storing, sustainably accessing, analyzing, disseminating, securing and migrating data across the NSF cyberinfrastructure. The data may come from scientific computation, from scientific instruments and sensors, or from other sources, but once generated it often needs to be available to the scientific community regardless of where its members are located. One final point: this solicitation is not designed to address very long-term archival storage issues, but proposals that can inform future policy on this issue, say via some use cases, are certainly welcome.
Service Providers - those organizations willing to acquire, deploy and operate ACI resources in service to the science and engineering research and education community - play a key role in the provision and support of a national Cyberinfrastructure. With this solicitation, the NSF requests proposals from organizations willing to serve as Service Providers within the eXtreme Digital (XD) program by acquiring and deploying new, innovative system features and services for the science and engineering community using the shared services model of the XSEDE project.
Note that proposals to add new and innovative features to currently deployed systems are eligible for consideration provided they are consistent with the goals of the current solicitation.
Competitive proposals should address one or more of the following:
- Complement existing XD capabilities with new types of computational resources attuned to less traditional computational science communities;
- Incorporate innovative and reliable services within the HPC environment to deal with complex and dynamic workflows that contribute significantly to the advancement of science and are difficult to achieve within XD;
- Facilitate transition from local to national environments via the use of virtual machines;
- Introduce highly useable and cost efficient cloud computing capabilities into XD to meet national scale requirements for new modes of computationally intensive scientific research;
- Expand the range of data intensive and/or computationally-challenging science and engineering applications that can be tackled with current XD resources;
- Provide reliable approaches to scientific communities needing a high-throughput capability;
- Provide a useful interactive environment for users needing to develop and debug codes using hundreds of cores or for scientific workflows/gateways requiring highly responsive computation;
- Deal effectively with scientific applications needing a few hundred to a few thousand cores;
- Efficiently provide a high degree of stability and usability by January 2015.
In past solicitations, benchmarks have played an important role. Two types of benchmarks were required: NSF-provided and proposer-selected benchmarks. For this solicitation, the NSF has opted not to require a specific set of NSF-provided benchmarks. One reason for this decision is that the current solicitation is not focused on funding a single, large resource designed to serve tightly coupled scientific applications dominated by floating point operations and needing many thousands of cores. While this is still important in certain contexts, the present call is much broader. As such, we expect each proposer to provide a convincing demonstration, with hard data, that their system will perform as described in their proposal. The demonstration can certainly address applications used by the NSF computational science community, but should provide compelling evidence of the expanded scientific diversity resulting from the innovative aspects of the proposed resource. Clearly, the details of the submitted benchmark results will depend on the nature of the proposed resource and are likely to differ from one submission to the next.
Frequently Asked Questions (FAQs) for NSF 13-528.
This program is part of the Division of Advanced Cyberinfrastructure.