Science and Engineering Research Facilities: Fiscal Year 2003.

Appendix A. Technical Notes

Scope of Survey

The data presented in these tables are collected biennially through the National Science Foundation's (NSF) congressionally mandated Survey of Science and Engineering Research Facilities (Facilities Survey). The survey originated in 1986 in response to Congress's concern about the state of research facilities at the nation's colleges and universities. NSF's 1984 reauthorization legislation, P.L. 99-159, mandated a data collection and analytic system to identify and assess the research facilities needs of academic institutions.

The National Institutes of Health (NIH) have cosponsored all cycles of the survey.

Recognizing the expanding use of networking and computing capacity in conducting research, a new set of questions was added to the FY 2003 Facilities Survey.

Population

The 2003 population consisted of 465 research-performing academic institutions[1] and 191 nonprofit biomedical research institutions in the United States. Research-performing academic institutions were defined as colleges and universities with $1 million or more in research and development (R&D) expenditures. Each academic institution's level of R&D expenditures was determined by the 2002 NSF Survey of Research and Development Expenditures at Universities and Colleges. Military institutions, Veterans Administration institutions, and federally funded R&D centers (FFRDCs) were excluded. The biomedical institution frame was a list of nonprofit biomedical research organizations and hospitals in the United States that received at least $1 million in NIH research funding in FY 2002.
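The academic eligibility rule above can be sketched as a simple filter. This is an illustrative sketch only: the `AcademicInstitution` record, its field names, and the function name are hypothetical and are not part of the survey's actual processing system.

```python
from dataclasses import dataclass

RD_THRESHOLD = 1_000_000  # $1 million in FY 2002 R&D expenditures


@dataclass
class AcademicInstitution:
    # Hypothetical record layout, for illustration only.
    name: str
    rd_expenditures: float  # FY 2002 R&D expenditures, in dollars
    is_military: bool = False
    is_va: bool = False
    is_ffrdc: bool = False


def in_academic_population(inst: AcademicInstitution) -> bool:
    """Apply the stated exclusions, then the $1 million R&D threshold."""
    if inst.is_military or inst.is_va or inst.is_ffrdc:
        return False
    return inst.rd_expenditures >= RD_THRESHOLD
```

An institution with exactly $1 million in R&D expenditures qualifies under this reading of "$1 million or more."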

Data Definitions

Research is all sponsored science and engineering R&D activities that are separately budgeted and accounted for. Research can be funded by the institution itself, the federal government, a state government, foundations, corporations, or other sources.

Research space includes the following examples: controlled-environment space, such as clean or white rooms; technical support space, such as preparation areas, carpentry and machine shops; laboratories and associated support areas used exclusively for animal research, such as procedure rooms, bench space, animal production colonies, holding rooms, germ-free rooms, surgical facilities, and recovery rooms; offices, to the extent that they are used for research activities; space used for research containing fixed equipment such as fume hoods; space used for research containing nonfixed equipment costing $1 million or more each, such as MRIs; and leased space that is used for research.

Net assignable square feet (NASF) is the sum of all areas on all floors of a building assigned to, or available to be assigned to, an occupant for a specific use, such as research or instruction. NASF is measured from the inside faces of walls.

Gross square feet is based on the floor area of a structure within the outside faces of the exterior walls.

Laboratories are areas with special-purpose equipment or configurations designed to meet the research needs of a particular discipline or a closely related group of disciplines.

Laboratory support space is area necessary to support research laboratories, such as autoclave rooms, darkrooms, equipment areas, and storage areas for research equipment and supplies.

Offices include offices for faculty, staff, and other persons, to the extent that they are used for research, including administrative activities for specific research projects.

Other research space includes all other space used for research.

Biosafety level (BL) designates a typology of animal research and is measured at four levels: BL-1 involves working with defined and characterized strains of viable microorganisms not known to cause disease in healthy adult humans; BL-2 involves working with the broad spectrum of indigenous moderate-risk agents present in the community and associated with human disease of varying severity; BL-3 involves working with indigenous or exotic agents with a potential for respiratory transmission and that may cause serious and potentially lethal infection; and BL-4 involves working with dangerous and exotic agents that pose a high individual risk of life-threatening disease, that may be transmitted via the aerosol route, and for which there is no available vaccine or therapy.

Repairs and renovations refer to activities such as fixing up facilities in deteriorated condition, capital improvements on facilities, and conversion of facilities.

New construction refers to construction of a new building, additions to an existing building, and the building out of shell space.

Completion costs include those for planning, site preparation, construction, fixed equipment, and building infrastructure such as plumbing, lighting, air exchange, and safety systems either in the building or within 5 feet of the building foundation. Costs of nonfixed equipment are included only if they equal $1 million or more.
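The treatment of nonfixed equipment in completion costs can be illustrated as follows. This sketch assumes the $1 million threshold applies to each piece of nonfixed equipment individually, consistent with the "costing $1 million or more each" wording in the research space definition; the function and parameter names are hypothetical.

```python
NONFIXED_THRESHOLD = 1_000_000  # dollars, assumed to apply per item


def completion_cost(included_costs, nonfixed_equipment_costs):
    """Sum planning, construction, and similar costs, adding only
    nonfixed equipment items that cost $1 million or more."""
    qualifying = [c for c in nonfixed_equipment_costs if c >= NONFIXED_THRESHOLD]
    return sum(included_costs) + sum(qualifying)
```

For example, a project with $5,250,000 in construction-related costs, one $1,200,000 instrument, and one $300,000 instrument would report $6,450,000 under this reading.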

Institutional funds and other sources include the following examples: operating funds, endowments, tax-exempt bonds and other debt financing, indirect costs recovered from federal grants/contracts, and private donations.

Current program commitments are all research activities of an institution that are budgeted, approved, and funded. They include current faculty and staff or those to whom offers have been made; grants awarded, whether or not research has actually begun; and programs that have been approved.

Deferred projects are those that: (1) are not funded and (2) are not scheduled for FY 2004 or FY 2005. They do not include projects planned for developing new programs or expanding current programs.

Local area network (LAN) is a network of interconnected workstations sharing the resources of a single processor or server within a relatively small geographical area, typically within a building or laboratory.

Campus backbones are connections between LANs.

Desktop ports are connections between individual PCs or workstations and the LAN or campus backbone.

Internet2 is a consortium of universities, industry, and government working to develop and deploy advanced network applications and technology. Members are connected through an advanced backbone network named Abilene.

Computation rate is the number of operations a computer (or set of computers) can perform per second while working on a single application.

High-performance computing could include either a large-capacity mainframe computer or the use of parallel or distributed processing software to spread a single application over multiple computers. In either case, the purpose would be to manipulate very large amounts of data in a very short time.

Grid technology is hardware and software infrastructure that integrates a collection of resources such as high-end computers, instruments, applications, databases, and networks in order to collaborate across geographically distributed sites.

Changes in Reporting

Since these data were last collected in 2001, several changes have been made to the population, some of the survey questions, and the release of public data. Some of the changes include:

Analytic Definitions

Several analytic subgroups are presented in the table data. These subgroups are defined as follows.

Geographic regions. States may be divided into the four U.S. geographic regions defined by the U.S. Census Bureau. These are the Northeast, Midwest, South, and West.

EPSCoR. States may be grouped according to their eligibility for NSF or NIH funding. States are eligible for the NSF Experimental Program to Stimulate Competitive Research (EPSCoR) if they have historically received less federal R&D funding than other states. The purpose of the program is to increase the R&D funding competitiveness of these states by assisting in the development and utilization of science and technology resources located at the major universities. The states currently eligible for this program are as follows:

IDeA. NIH sponsors the Institutional Development Award (IDeA) program. This program was established in 1993 in order to enhance the competitiveness for research funding of institutions located in states with historically low aggregate success rates for NIH grant applications. The goal is to broaden the geographic distribution of NIH funding for health research. The states currently eligible for this program are as follows:

Institutional control. This is defined for academic institutions as private or public.

Medical school. All institutions defined as having a medical school include only those with medical schools that award M.D. degrees.

Response Rate

The 2003 survey was mailed to academic and biomedical institutions in November 2003, and data collection ended on May 21, 2004.

Of the 465 academic institutions, 92 percent returned surveys. Of the 191 biomedical organizations, 94 percent returned surveys.
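Because the response rates are reported as whole percentages, they imply a range of respondent counts rather than a single figure. The sketch below derives that range, assuming the published rates were rounded to the nearest whole percent; the function name is hypothetical.

```python
def respondent_count_range(population, rounded_percent):
    """Return the (low, high) respondent counts whose response rate
    rounds to the reported whole percentage."""
    matches = [
        n for n in range(population + 1)
        if round(100 * n / population) == rounded_percent
    ]
    return matches[0], matches[-1]


academic = respondent_count_range(465, 92)    # (426, 430)
biomedical = respondent_count_range(191, 94)  # (179, 180)
```

Under this assumption, between 426 and 430 academic institutions and 179 or 180 biomedical organizations returned surveys.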

Weighting

The 2003 Facilities Survey attempted to obtain responses from all institutions in the defined population. Consequently, one of the usual sources of survey error, sampling error, is not of concern in this survey. However, as is the case in almost all surveys, nonresponse error is of concern. In the 2003 Facilities Survey, 92 percent of all eligible institutions responded.

Weights were used to account for unit nonresponse. The weights for the academic institutions were adjusted for the known number of academic institutions by: expenditure categories (the quintiles of the distribution), census region, control (public/private), whether the institution was a historically black college or university, and whether the institution granted Ph.D. degrees. For the biomedical institutions the only auxiliary variables were the grant amount (quintiles of the distribution) and census region. The minimum weights for both academic and biomedical institutions were constrained to be at least 1.0.
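A unit nonresponse adjustment of this general kind can be sketched with a single set of adjustment cells. This is a deliberate simplification and not the survey's actual procedure: it treats each combination of auxiliary variables as one cell and sets each respondent's weight to the known cell count divided by the number of respondents in that cell, with a floor of 1.0. The function name and data layout are hypothetical.

```python
from collections import Counter


def nonresponse_weights(population_cells, respondent_ids, min_weight=1.0):
    """population_cells maps institution id -> adjustment cell.
    Returns a weight for each responding institution."""
    pop_counts = Counter(population_cells.values())
    resp_counts = Counter(population_cells[i] for i in respondent_ids)
    weights = {}
    for i in respondent_ids:
        cell = population_cells[i]
        # Known population count over respondent count, floored at min_weight.
        weights[i] = max(min_weight, pop_counts[cell] / resp_counts[cell])
    return weights


# Hypothetical example: four institutions, one nonrespondent ("d").
cells = {
    "a": ("public", "Q1"),
    "b": ("public", "Q1"),
    "c": ("private", "Q1"),
    "d": ("public", "Q1"),
}
example_weights = nonresponse_weights(cells, {"a", "b", "c"})
```

In this example the two public respondents each carry a weight of 1.5 and the private respondent a weight of 1.0, so the weights sum to the population size of four.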

The FY 2003 Facilities Survey detailed statistical tables contain two sets of data, part 1 (research space) and part 2 (computing and networking). The data in all part 1 tables are weighted according to the previously described procedures, except the data presented by state (i.e., tables 13, 14, 20, 21, 27, 28, 31, 32, 41, 42, 45, 46, 49, 50, 61, 62, 65, 66, 69, and 70). None of the data in the part 2 tables (i.e., tables 86 to 107) are weighted. The part 2 data are not weighted because of potential measurement error within the survey responses. It is believed that substantially greater measurement error may exist in the part 2 data because FY 2003 was the first year of implementation of these questions and because of the rapidly changing nature and variability of the part 2 data. Likewise, item nonresponse is not imputed for part 2 questions.
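The weighting status of a given table can be looked up as sketched below. The table numbers come from the paragraph above; the function name is hypothetical, and the sketch assumes that any table number outside the two listed sets belongs to part 1.

```python
# State-level part 1 tables, which are not weighted.
STATE_TABLES = {13, 14, 20, 21, 27, 28, 31, 32, 41, 42,
                45, 46, 49, 50, 61, 62, 65, 66, 69, 70}
# Part 2 (computing and networking) tables, none of which are weighted.
PART2_TABLES = range(86, 108)  # tables 86 to 107 inclusive


def is_weighted(table_number):
    """Return True if the table's data are weighted for unit nonresponse."""
    if table_number in PART2_TABLES or table_number in STATE_TABLES:
        return False
    return True  # assumed to be a non-state part 1 table
```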

Item Nonresponse

A series of logistic regression models and linear regression models were developed and used to impute the values for all missing data for institutions that responded to the survey. The predicted values from these models were used to impute for the missing responses, although in some cases stochastic imputations were used to better reproduce expected distributions. The imputation was done for academic data and biomedical data separately. The models for imputing the academic data were developed first and similar models were then applied to impute the biomedical data, to the extent possible.
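The general idea of regression imputation, with an optional stochastic component, can be sketched as follows. This is a minimal illustration and not the models actually used: it fits an ordinary least squares line with a single predictor (for example, total NASF), whereas the survey used several predictors and both logistic and linear models. All names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute(records, stochastic=False, rng=None):
    """records is a list of (x, y) pairs, with y set to None when missing.
    Missing y values are replaced by the model prediction; if stochastic
    is True, a randomly drawn observed residual is added."""
    observed = [(x, y) for x, y in records if y is not None]
    slope, intercept = fit_line([x for x, _ in observed],
                                [y for _, y in observed])
    residuals = [y - (intercept + slope * x) for x, y in observed]
    rng = rng or random.Random(0)
    completed = []
    for x, y in records:
        if y is None:
            y = intercept + slope * x
            if stochastic:
                y += rng.choice(residuals)
        completed.append((x, y))
    return completed
```

Adding a drawn residual keeps imputed values from clustering on the regression line, which is one way to better reproduce the spread of the observed distribution.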

A set of core predictors was used for imputing most items across the two types of institutions, but differences in the available data by type of institution limited this process to some degree. For academic institutions, the core predictors were: control (public/private), highest degree granted (doctorate/nondoctorate), existence of a medical school, FY 2002 total research and development expenditures (overall), and total NASF. For biomedical institutions, the core predictors were: status as a hospital or other biomedical institution, FY 2002 eligible NIH grant awards, and total NASF.

The items were first classified into two categories based on item nonresponse: those with an item nonresponse rate greater than 5 percent and more than 10 units (institutions) missing, and all other items. For the items with nonresponse rates of less than 5 percent, the core predictors and other variables needed to preserve any skip patterns were used in the regressions. For the items with higher nonresponse rates, and for a few key items used for most analyses, exploratory analysis was done to try to improve the model fit by including other predictor variables.
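The classification rule can be expressed as a short predicate. The sketch below assumes the rate is computed as missing units divided by responding institutions; the function name and the `is_key_item` flag are hypothetical.

```python
def needs_exploratory_model(n_missing, n_respondents, is_key_item=False,
                            rate_cutoff=0.05, unit_cutoff=10):
    """Return True if an item falls in the higher-nonresponse category
    (rate above 5 percent and more than 10 units missing) or is flagged
    as a key item; otherwise the core predictors alone are used."""
    rate = n_missing / n_respondents
    high_nonresponse = rate > rate_cutoff and n_missing > unit_cutoff
    return high_nonresponse or is_key_item
```

Note that both conditions must hold: an item missing for 8 of 100 respondents exceeds the 5 percent rate but not the 10-unit count, so it stays in the "all other items" category.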

Tables showing data by state and control (i.e., public versus private) and individual institution tables are based on unimputed data. In the individual institution tables, the data for Johns Hopkins University include data for the Applied Physics Laboratory.

Data Availability

Data published in this report are also available on the World Wide Web. Data are also available for this and other surveys through the Web-Based Computer-Aided Science Policy Analysis and Research (WebCASPAR) database system, which can be accessed via the Web.


[1] Johns Hopkins University and Applied Physics Lab completed separate survey forms, but their data were combined on the data file and are treated as a single institution in all published tables and study reports. The final population of 465 counts Johns Hopkins University and Applied Physics Lab as a single institution.
