1. Survey Overview (2013 survey cycle)
- Purpose: The Survey of Science and Engineering Research Facilities is a congressionally mandated survey. It is the primary source of information on the amount and cost of space at science and engineering research facilities located at U.S. research-performing colleges and universities. The survey also collects information on computing and networking capacity for research and instructional activities. The survey is the basis of public data used by Congress, higher education associations, state governments, academia, and architectural and engineering firms.
- Data collection authority: The information is solicited under the authority of the National Science Foundation Act of 1950.
- Major changes to recent survey cycle: The following changes were made to Part 2 of the FY 2013 survey on Computing and Networking Capacity (for research and instructional activities).
- Question 1 on total bandwidth was modified to allow for more precise reporting of bandwidth.
- Question 2 on bandwidth through consortia was modified to clarify that Internet2 and National LambdaRail should not be considered consortia for the purposes of this question.
- Question 3 on connections to Internet2 and National LambdaRail was added.
- Question 4 on dark fiber was modified to clarify that indefeasible rights of use should be reported and to capture ownership of dark fiber for the next fiscal year.
- Questions 5 through 11 on centrally administered high-performance computing (HPC) were modified to include only systems that are 10 teraflops or faster.
- Question 5 on architectures for centrally administered HPC was modified to drop the restriction that only systems that are "generally available to the campus community" should be considered. In addition, the category for special purpose architectures has been dropped; such architectures can be reported as "other architecture."
- Question 7 on centrally administered HPC systems was added.
- Questions 9 through 11 on storage for centrally administered HPC were modified to allow for more precise reporting of storage capacity.
- Question 12 on archival storage from external cloud services was added.
- Question 13 on research computing from external cloud services was added.
- Twelve questions from the last survey cycle were deleted (question numbers shown below refer to those appearing in the FY 2011 survey):
- Internet2 bandwidth (Question 2)
- National LambdaRail bandwidth (Question 3)
- Federal government research network connections (Question 4)
- Desktop port connections (Question 6)
- Speed on your network (Question 8)
- Wireless connections (Question 9)
- Comments on networking (Question 10)
- Centrally administered clusters of 1 teraflop or faster (Question 13)
- Centrally administered MPP of 1 teraflop or faster (Question 14)
- Centrally administered SMP of 1 teraflop or faster (Question 15)
- Centrally administered experimental/emerging computing systems of 1 teraflop or faster (Question 16)
- Centrally administered special purpose computing systems of 1 teraflop or faster (Question 17)
2. Key Survey Information
- Frequency of the data collection: Biennial.
- Initial year of survey: 1986.
- Reference period: FY 2013.
- Response unit: Establishments, defined as U.S. academic institutions that reported at least $1 million in R&D in the National Science Foundation's (NSF's) Higher Education Research and Development (HERD) Survey.
- Sample or census: Census.
- Population size: Approximately 600.
- Sample size: Not applicable.
- Key variables: Key variables of interest are listed below.
- Amount and type of science and engineering research space
- Current expenditures for projects to construct and to repair and renovate research facilities
- Condition of research facilities
- Planned construction and repair and renovation of research facilities
- Source of funds (federal, state and local, institutional) for construction and for repair and renovation of research facilities
- Research animal facilities
- Bandwidth speeds and high-performance network connections
- Dark fiber
- Data storage capabilities
- High-performance computing
3. Survey Design
- Target population: Research-performing colleges and universities in the United States that expended at least $1 million in research and development funds in the prior fiscal year are the target population for this survey.
- Sample frame: The frame for the academic institutions is the FY 2012 NSF HERD Survey. In the FY 2013 survey cycle, there were 588 academic institutions, of which 581 (99%) responded.
- Sample design: The survey is a census of all eligible institutions, as defined above.
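As a rough illustration of how the census population follows from the frame, the sketch below applies the $1 million threshold to a handful of frame records and then recomputes the FY 2013 response rate from the counts given above. The record layout, field names, and sample institutions are hypothetical; the actual HERD file format is not described here.

```python
from dataclasses import dataclass

# Eligibility threshold stated in the survey design: at least $1 million
# in R&D expenditures in the prior fiscal year.
RD_THRESHOLD_DOLLARS = 1_000_000


@dataclass(frozen=True)
class FrameRecord:
    """Hypothetical frame entry; the real HERD file layout may differ."""
    institution: str
    prior_fy_rd_dollars: int


def eligible_institutions(frame):
    """Keep every frame record at or above the threshold (a census, not a sample)."""
    return [r for r in frame if r.prior_fy_rd_dollars >= RD_THRESHOLD_DOLLARS]


# Made-up records for illustration only.
frame = [
    FrameRecord("University A", 250_000_000),
    FrameRecord("College B", 1_000_000),
    FrameRecord("College C", 999_999),
]

census = eligible_institutions(frame)
print([r.institution for r in census])  # ['University A', 'College B']

# Unit response rate from the reported counts: 581 of 588 institutions.
response_rate = 581 / 588
print(f"{response_rate:.1%}")  # 98.8%, which rounds to the reported 99%
```

Because the design is a census, no selection step follows the filter: every institution that passes the threshold is surveyed.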
4. Data Collection and Processing
- Data collection: The FY 2013 survey was conducted by Westat under contract to the National Center for Science and Engineering Statistics (NCSES). Surveys were distributed to institutional coordinators at each institution. Institutional coordinators are individuals knowledgeable about the requested information who collect the responses from various offices and submit the information. The data collection period was from October 2013 through April 2014.
Respondents could complete the survey either by printing an Adobe PDF questionnaire from the Web and submitting it on paper or by using the Web-based data collection system. Telephone and e-mail follow-up was used for both methods.
- Data processing: Several procedures were used to clean and edit the data. The Web survey contained numerous programmed edit checks that displayed edit messages when data were inconsistent or missing, for example when a respondent's itemized data did not sum to the reported total. Once respondents submitted their final data, a second set of edit checks was conducted. Finally, comparisons were made between an institution's FY 2013 data and the previous year's survey data. Respondents were contacted regarding any inconsistent, missing, or unclear data.
- Estimation techniques: This survey is a census. Imputation was performed for missing items from nonresponding institutions in order to make population estimates.
Data missing as a result of item nonresponse were imputed using a regression-model approach with predictors: (1) private or public, (2) doctorate granting or nondoctorate granting, (3) existence of a medical school, (4) R&D expenditures for the prior fiscal year, and (5) total net assignable square feet (NASF) for the prior fiscal year.
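The idea behind regression imputation can be sketched in a few lines. The example below is a deliberate simplification and not the model used for the survey: it fits an ordinary least-squares line on a single predictor (prior-year NASF) rather than the five predictors listed above, and the data, field names, and function names are invented for illustration.

```python
def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def impute_missing(records, predictor, target):
    """Fill missing target values (None) with regression predictions.

    The line is fitted on records where the target was reported and then
    applied to records where it is missing. Reported values are not changed.
    """
    observed = [r for r in records if r[target] is not None]
    intercept, slope = fit_simple_ols(
        [r[predictor] for r in observed],
        [r[target] for r in observed],
    )
    completed = []
    for r in records:
        filled = dict(r)
        filled["imputed"] = filled[target] is None
        if filled["imputed"]:
            filled[target] = intercept + slope * filled[predictor]
        completed.append(filled)
    return completed


# Made-up data: prior-year and current-year NASF, in thousands of square feet.
records = [
    {"institution": "A", "prior_nasf": 100.0, "current_nasf": 110.0},
    {"institution": "B", "prior_nasf": 200.0, "current_nasf": 210.0},
    {"institution": "C", "prior_nasf": 300.0, "current_nasf": 310.0},
    {"institution": "D", "prior_nasf": 250.0, "current_nasf": None},
]

completed = impute_missing(records, "prior_nasf", "current_nasf")
for row in completed:
    print(row["institution"], round(row["current_nasf"], 1), row["imputed"])
```

Keeping an `imputed` flag alongside each value is a common design choice because it lets analysts separate reported figures from modeled ones when computing totals.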
5. Survey Quality Measures
- Sampling error: This survey is a census, so no sampling error exists.
- Coverage error: Coverage is high because institutions meeting the population requirements can be easily identified. However, it is possible that some institutions may be inadvertently excluded. Institutions were investigated to ensure there was no duplication.
- Nonresponse error:
- Unit nonresponse—For the FY 2013 cycle, 99% (581 out of 588) of the academic institutions responded to the survey.
- Item nonresponse—The FY 2013 survey had limited item nonresponse.
- Nonresponse ranged from 0% to 2% for 99% of the items. Five items had nonresponse rates of 3% to 4%.
- Measurement error: The most likely source of measurement error is institutions estimating the requested data. Respondents may estimate their data for several reasons, for example because the data are not included in the institution's database or because some figures are estimates by nature (e.g., out-year budget figures).
Measurement error may also occur because institutions may define their database elements differently from the definitions used in the survey. For example, an institutional database may identify research space based on a primary-use criterion, whereas the survey requests that space be prorated according to all uses. Finally, the survey question on the condition of research space is a subjective question that may be subject to measurement error.
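The difference between a primary-use criterion and prorating, mentioned above as a source of measurement error, can be made concrete with a small sketch. The room records, use categories, and shares below are invented for illustration and do not reflect any institution's database or the survey's actual reporting categories.

```python
def research_nasf_primary_use(rooms):
    """Count a room's full area as research only when research is its largest use."""
    total = 0.0
    for room in rooms:
        shares = room["use_shares"]
        primary = max(shares, key=shares.get)
        if primary == "research":
            total += room["nasf"]
    return total


def research_nasf_prorated(rooms):
    """Count each room's area in proportion to its research share."""
    return sum(
        room["nasf"] * room["use_shares"].get("research", 0.0) for room in rooms
    )


# Hypothetical rooms: area in net assignable square feet and share of each use.
rooms = [
    {"nasf": 1000.0, "use_shares": {"research": 0.6, "instruction": 0.4}},
    {"nasf": 500.0, "use_shares": {"research": 0.3, "instruction": 0.7}},
]

primary_total = research_nasf_primary_use(rooms)
prorated_total = research_nasf_prorated(rooms)
print(f"primary-use: {primary_total:.0f} NASF")
print(f"prorated:    {prorated_total:.0f} NASF")
```

In this made-up case the primary-use rule reports 1,000 NASF of research space while prorating reports 750 NASF, so an institution that reports directly from a primary-use database would overstate research space relative to the survey definition. The direction of the discrepancy depends on the mix of rooms.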
6. Data Availability and Comparability
- Data availability: Survey data are compiled for the defined fiscal year, the preceding fiscal year, and planned activities for two succeeding fiscal years.
- Data comparability: This survey was first conducted in 1986. Small improvements were made to the survey questions over time, but these changes do not appear to affect data comparability. The FY 2001 survey was very limited and consisted of only two questions, which should be comparable to the corresponding questions in the prior survey cycles.
The survey was extensively redesigned for implementation in the FY 2003 survey. A comprehensive description of the redesigned survey can be found in Redesign of Survey of Science and Engineering Research Facilities: 2003. To the extent possible, the FY 2003 survey was redesigned for comparability over time.
Questions were added on computing and networking capacity beginning with the FY 2003 survey cycle. Following each survey cycle, the computing and networking capacity questions in Part 2 of the survey are evaluated for current relevance and updates in technology. As a result, new questions may be added, some questions may be deleted, and other questions may be modified. Computing and networking capacity data appropriate for longitudinal comparisons are published in the detailed statistical tables for each survey cycle.
Beginning with the FY 2003 cycle, respondents are requested to provide data on their institution's individual new construction projects. Respondents provide several types of data for each project including name, gross square feet, net assignable square feet, and cost of project. Using this information, it is possible to compare the new construction projects reported by each institution in the immediately previous survey cycle to the projects the same institution reported in the current survey cycle to determine if any appear to be duplicates. When projects with the same or similar characteristics are identified for both survey cycles, the relevant institutions are contacted to discuss these projects. With the approval of each institution, the projects are eliminated from the institution's new construction data for the appropriate cycle. In addition, the data on the source of funding of new construction projects are revised to reflect the deletion of these projects.
Individuals wishing to analyze trends other than those published in NCSES's most recent publications are encouraged to contact the project manager below for more information about comparability over time.
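The cross-cycle comparison of new construction projects described above could be approximated along the following lines. This is a sketch under stated assumptions, not the procedure actually used: the matching rule (name similarity plus relative tolerances on gross square feet, NASF, and cost), the cutoff values, and the project records are all hypothetical. Only `difflib.SequenceMatcher` from the Python standard library is relied on.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity ratio between two project names (0 to 1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def within_tolerance(x, y, rel_tol):
    """True when x and y differ by at most rel_tol relative to the larger value."""
    return abs(x - y) <= rel_tol * max(abs(x), abs(y))


def possible_duplicates(previous, current, name_cutoff=0.8, rel_tol=0.05):
    """Pair up projects from consecutive cycles that look like the same project.

    The cutoffs are illustrative assumptions. Flagged pairs are candidates
    for follow-up with the institution, not automatic deletions.
    """
    flagged = []
    for old in previous:
        for new in current:
            if (
                name_similarity(old["name"], new["name"]) >= name_cutoff
                and within_tolerance(old["gsf"], new["gsf"], rel_tol)
                and within_tolerance(old["nasf"], new["nasf"], rel_tol)
                and within_tolerance(old["cost"], new["cost"], rel_tol)
            ):
                flagged.append((old["name"], new["name"]))
    return flagged


# Made-up projects reported by one institution in two consecutive cycles.
previous_cycle = [
    {"name": "Life Sciences Building", "gsf": 120_000, "nasf": 70_000, "cost": 95_000_000},
]
current_cycle = [
    {"name": "Life Sciences Bldg", "gsf": 121_000, "nasf": 70_000, "cost": 96_000_000},
    {"name": "Engineering Annex", "gsf": 40_000, "nasf": 25_000, "cost": 30_000_000},
]

flagged = possible_duplicates(previous_cycle, current_cycle)
print(flagged)
```

The function only flags candidate pairs. Consistent with the process described above, removal of a project and the matching revision of its funding-source data would happen only after the institution confirms the duplicate.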
7. Data Products
- Publications: The data from this survey are published biennially in detailed statistical tables in the series Science and Engineering Research Facilities. The most recent report in this series is available at http://www.nsf.gov/statistics/facilities/.
Information from this survey is also included in Science and Engineering Indicators.
- Electronic access: To make the survey data most useful to survey respondents, microdata beginning with the FY 2003 survey are available in the NSF WebCASPAR data system. Due to a confidentiality pledge, microdata from this survey for years 1988 through 2001 are not available.
8. Contact Information
For additional information about this survey, please contact the Project Officer.
Michael T. Gibbons
Research and Development Statistics Program
National Center for Science and Engineering Statistics
National Science Foundation
4201 Wilson Boulevard, Suite 965
Arlington, VA 22230
Phone: (703) 292-4590