S&E Research Facilities: FY 2005
Appendix A. Technical Notes
Scope of Survey
The data presented in these tables are collected biennially through the National Science Foundation's (NSF) congressionally mandated Survey of Science and Engineering Research Facilities (Facilities Survey). The survey originated in 1986 in response to Congress's concern about the state of research facilities at the nation's colleges and universities. NSF's FY 1986 reauthorization legislation, P.L. 99-159, mandated a data collection and analytic system to identify and to assess the research facilities needs of academic institutions. The National Institutes of Health (NIH) has cosponsored all cycles of the survey.
Recognizing the expanding use of networking and computing capacity in conducting research, a new set of questions on these topics was added to the FY 2003 Facilities Survey.
The FY 2005 population consisted of 477 research-performing academic institutions and 191 nonprofit biomedical research institutions in the United States. Research-performing academic institutions were defined as colleges and universities with $1 million or more in research and development (R&D) expenditures. Each academic institution's level of R&D expenditures was determined by the 2004 NSF Survey of Research and Development Expenditures at Universities and Colleges. Military institutions, Veterans Administration institutions, and federally funded R&D centers (FFRDCs) were excluded. The biomedical institution frame was a list of nonprofit biomedical research organizations and hospitals in the United States that received at least $1 million in NIH research funding in FY 2004.
Research is all sponsored science and engineering R&D activities that are separately budgeted and accounted for. Research can be funded by the institution itself, the federal government, a state government, foundations, corporations, or other sources.
Research space includes the following examples: controlled-environment space, such as clean or white rooms; technical support space, such as preparation areas, carpentry and machine shops; laboratories and associated support areas used exclusively for animal research, such as procedure rooms, bench space, animal production colonies, holding rooms, germ-free rooms, surgical facilities, and recovery rooms; offices, to the extent that they are used for research activities; space used for research containing fixed equipment such as fume hoods; space used for research containing nonfixed equipment costing $1 million or more each, such as MRIs; and leased space that is used for research.
Net assignable square feet (NASF) is the sum of all areas on all floors of a building assigned to, or available to be assigned to, an occupant for a specific use, such as research or instruction. NASF is measured from the inside faces of walls.
Gross square feet is based on the floor area of a structure within the outside faces of the exterior walls.
Biosafety level (BL) designates a typology of animal research and is measured at four levels: BL-1 involves working with defined and characterized strains of viable microorganisms not known to cause disease in healthy adult humans; BL-2 involves working with the broad spectrum of indigenous moderate-risk agents present in the community and associated with human disease of varying severity; BL-3 involves working with indigenous or exotic agents with a potential for respiratory transmission and that may cause serious and potentially lethal infection; and BL-4 involves working with dangerous and exotic agents that pose a high individual risk of life-threatening disease, that may be transmitted via the aerosol route, and for which there is no available vaccine or therapy.
Repairs and renovations refer to activities such as fixing up facilities in deteriorated condition, capital improvements on facilities, and conversion of facilities.
New construction refers to construction of a new building, additions to an existing building, and the building out of shell space.
Completion costs include those for planning, site preparation, construction, fixed equipment, and building infrastructure such as plumbing, lighting, air exchange, and safety systems either in the building or within 5 feet of the building foundation. Costs of nonfixed equipment are included only if they equal or exceed $1 million.
Institutional funds and other sources include the following examples: operating funds, endowments, tax-exempt bonds and other debt financing, indirect costs recovered from federal grants/contracts, and private donations.
Current program commitments are all research activities of an institution that are budgeted, approved, and funded. They include current faculty and staff or those to whom offers have been made; grants awarded, whether or not research has actually begun; and programs that have been approved.
Deferred projects are those that: (1) are not funded and (2) are not scheduled for FY 2004 or FY 2005. They do not include projects planned for developing new programs or expanding current programs.
Bandwidth is the amount of data that can be transmitted in a given amount of time, usually measured in bits per second.
Commodity internet is the general public, multiuse network often called the "Internet."
Abilene is a high-performance backbone network managed by the Internet2 consortium of academia, industry, and government.
Desktop ports are connections between individual personal computers or workstations and the local area network or campus backbone.
Internet2 is a consortium of universities, industry, and government working to develop and deploy advanced network applications and technology. Members are connected through an advanced backbone network named Abilene.
High-performance computing performs at the fastest rate currently available, manipulating a very large amount of data in a short time.
Changes in Reporting
Since these data were last collected in the FY 2003 survey, several changes have been made to some of the survey questions, including:
In addition, the questions in the Computing and Networking section of the survey were significantly revised. Most of the FY 2003 questions were replaced with more current questions to reflect changing technology. However, the topics covered in the section generally remained the same (e.g., networking, high-performance computing, wireless coverage).
Several analytic subgroups are presented in the table data. These subgroups are defined as follows.
Geographic regions. States may be divided into the four U.S. geographic regions defined by the U.S. Census Bureau. These are:
Guam, Puerto Rico, and the U.S. Virgin Islands are excluded from the geographic regions but are included in the national statistics and other appropriate aggregate figures.
EPSCoR. States may be grouped according to their eligibility for NSF or NIH funding. States are eligible for the NSF Experimental Program to Stimulate Competitive Research (EPSCoR) if they have historically received less federal R&D funding than other states. The purpose of the program is to increase the R&D funding competitiveness of these states by assisting in the development and utilization of science and technology resources located at the major universities. The states currently eligible for this program are as follows:
IDeA. NIH sponsors the Institutional Development Award (IDeA) program. This program was established in 1993 in order to enhance the competitiveness for research funding of institutions located in states with historically low aggregate success rates for NIH grant applications. The goal is to broaden the geographic distribution of NIH funding for health research. The states currently eligible for this program are as follows:
Institutional control is defined for academic institutions as private or public.
Medical school is a school that awards an M.D. degree or an osteopathic medicine degree.
The FY 2005 survey was mailed to academic and biomedical institutions in October 2005, and data collection ended in May 2006. Of the 477 academic institutions, 95% returned surveys. Of the 191 biomedical organizations, 93% returned surveys.
The FY 2005 Facilities Survey attempted to obtain responses from all institutions in the defined population. Consequently, one of the usual sources of survey error, sampling error, is not of concern in this survey. However, as is the case in almost all surveys, nonresponse error is of concern. In the FY 2005 Facilities Survey, 94% of all eligible institutions responded.
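The overall response rate can be checked against the two group rates reported above. The following is a minimal sketch, assuming the published percentages are rounded, so the respondent counts it derives are approximations rather than the actual counts; the variable names are illustrative only.

```python
# Population sizes and rounded response rates as reported in the text.
academic_n, academic_rate = 477, 0.95
biomedical_n, biomedical_rate = 191, 0.93

# Approximate respondent counts. The published rates are rounded,
# so these are estimates, not the actual numbers of returned surveys.
academic_resp = academic_n * academic_rate
biomedical_resp = biomedical_n * biomedical_rate

# Overall rate is total respondents over total eligible institutions.
overall_rate = (academic_resp + biomedical_resp) / (academic_n + biomedical_n)
print(f"Overall response rate: {overall_rate:.0%}")
```

The combined rate is a population-weighted average of the two group rates, which is why it falls closer to the academic rate than to the biomedical rate.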
Weights were used to account for unit nonresponse. The weights for the academic institutions were adjusted for the known number of academic institutions by: expenditure categories (the quintiles of the distribution), census region, control (public/private), whether the institution was a historically black college or university, and whether the institution granted Ph.D. degrees. The weights for the biomedical institutions were adjusted for the known number of biomedical institutions by the grant amount (quintiles of the distribution) and census region. The minimum weights for both academic and biomedical institutions were constrained to be at least 1.0.
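As an illustration of the general idea, the sketch below computes a simple weighting-class adjustment in which each respondent in a cell receives the known population count for that cell divided by the number of respondents in it. This is a simplification under stated assumptions: the survey adjusted on several dimensions at once and the exact procedure is not described here, and the function name, cell labels, and counts are hypothetical.

```python
from collections import Counter

def nonresponse_weights(population_cells, respondent_cells, min_weight=1.0):
    """Return a weight per cell: known population count / respondent count.

    Weights are floored at min_weight, mirroring the constraint that
    minimum weights be at least 1.0.
    """
    pop = Counter(population_cells)
    resp = Counter(respondent_cells)
    return {cell: max(min_weight, pop[cell] / n_resp)
            for cell, n_resp in resp.items()}

# Hypothetical frame: each institution labelled by (control, census region).
population = [("public", "South")] * 10 + [("private", "Northeast")] * 6
respondents = [("public", "South")] * 8 + [("private", "Northeast")] * 6

weights = nonresponse_weights(population, respondents)

# Weighted respondent total reproduces the known population count.
weighted_total = sum(weights[cell] for cell in respondents)
print(weights, weighted_total)
```

In this toy example the cell with two nonrespondents gets a weight above 1.0, while the fully responding cell keeps a weight of exactly 1.0.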
The FY 2005 Facilities Survey Detailed Statistical Tables contain two sets of data, part 1 (research space) and part 2 (computing and networking). The data in all part 1 tables are weighted according to the previously described procedures except the data presented by state (i.e., tables 12, 13, 19, 20, 23, 24, 32, 33, 36, 37, 40, 41, 52, 53, 56, 57, 60, and 61). None of the data in the part 2 tables (i.e., tables 78–98) is weighted. The part 2 data are not weighted due to potential measurement error within the survey responses. It is believed that substantially greater measurement error may exist in the part 2 data because this data collection is new and because of the rapidly changing nature and variability of the part 2 data. Likewise, item nonresponse is not imputed for part 2 questions.
A series of logistic regression models and linear regression models were developed and used to impute the values for all missing data for institutions that responded to the survey. The predicted values from these models were used to impute for the missing responses, although in some cases stochastic imputations were used to better reproduce expected distributions. The imputation was done for academic data and biomedical data separately. The models for imputing the academic data were developed first and similar models were then applied to impute the biomedical data, to the extent possible.
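The linear regression case can be sketched as follows. This is a minimal, single-predictor illustration, not the models actually used: the function names and data values are hypothetical, only one predictor is shown where the survey used several, and the stochastic option simply adds a randomly drawn observed residual to the predicted value as one possible way to reproduce spread.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def impute_missing(xs, ys, stochastic=False, seed=0):
    """Fill None entries in ys with predictions from a model fit on observed pairs."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    obs_x = [x for x, _ in observed]
    obs_y = [y for _, y in observed]
    slope, intercept = fit_line(obs_x, obs_y)
    residuals = [y - (intercept + slope * x) for x, y in observed]
    rng = random.Random(seed)
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            y = intercept + slope * x          # deterministic prediction
            if stochastic:
                y += rng.choice(residuals)     # add a drawn observed residual
        filled.append(y)
    return filled

# Hypothetical values: total NASF as predictor, one survey item as target.
total_nasf = [10.0, 20.0, 30.0, 40.0]
item_value = [21.0, 41.0, None, 81.0]
filled = impute_missing(total_nasf, item_value)
print(filled)
```

Reported values are left untouched; only the missing entry is replaced by the model prediction.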
A set of core predictors was used for imputing most items across the two types of institutions, but differences in the available data by type of institution limited this process to some degree. For academic institutions, the core predictors were: control (public/private), highest degree granted (doctorate/nondoctorate), existence of a medical school, FY 2004 total research and development expenditures (overall), and total NASF. For biomedical institutions, the core predictors were: status as a hospital or other biomedical institution, FY 2004 eligible NIH grant awards, and total NASF.
Items were first classified into two categories based on item nonresponse: (1) items with an item nonresponse rate greater than 5% and more than 10 units (institutions) missing and (2) all other items. For items in the second category (those with low nonresponse), the core predictors and other variables needed to preserve any skip patterns were used in the regressions. For items with higher nonresponse rates (the first category) and a few key items used for most analyses, exploratory analysis was done to improve the model fit by including other predictor variables.
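The classification rule can be expressed as a short sketch. The thresholds come from the text above; the function name, category labels, and example counts are hypothetical.

```python
def classify_item(n_missing, n_respondents, rate_cutoff=0.05, count_cutoff=10):
    """Return "high" when both thresholds are exceeded, otherwise "other"."""
    rate = n_missing / n_respondents
    if rate > rate_cutoff and n_missing > count_cutoff:
        return "high"
    return "other"

# Hypothetical items: (number of institutions missing, number of respondents).
examples = {
    "item_a": (30, 450),  # about 6.7% missing and more than 10 units
    "item_b": (8, 450),   # under 5% missing
    "item_c": (8, 100),   # 8% missing but only 8 units
}
categories = {name: classify_item(*counts) for name, counts in examples.items()}
print(categories)
```

Both conditions must hold for an item to fall in the higher-nonresponse category, so an item with a high rate but few missing units is treated like the other items.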
Tables showing data by state and control (i.e., public versus private) and individual institution tables are based on unimputed data. In the individual institution tables, the data for Johns Hopkins University include data for the Applied Physics Laboratory.
Comparability of Statistics
This section summarizes major survey improvements and changes in procedures/practices that may have affected the comparability of statistics produced from the Survey of Science and Engineering Research Facilities over time.
Beginning with the FY 2003 cycle and continuing with the FY 2005 cycle, respondents were requested to provide data on their institution's individual new construction projects. Respondents provided several types of data for each project including name, gross square feet, net assignable square feet, and cost of project. Using this information, it was possible to compare the new construction projects reported by each institution in FY 2003 to the projects the same institution reported in FY 2005 to determine if any appeared to be duplicates.
This comparison identified 36 projects at academic institutions with the same or similar characteristics. Contact with the relevant institutions indicated that 9 projects should not have been reported in the FY 2003 survey. With the approval of each institution, these projects were eliminated from their new construction data.
Also, the data on the source of funding of new construction projects were revised to reflect the deletion of these projects. The nine new construction projects that were removed from the FY 2003 data affected the records of eight institutions. For three institutions, the removal eliminated all new construction projects reported; as a result, all funds reported by source for new construction were also deleted.
For the remaining five institutions, at least one other reported construction project remained. Costs associated with the deleted project(s) were subtracted from the sources of funds total for each institution. The remaining funds were then redistributed across sources using the same allocation that each institution had initially reported.
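The reallocation described above amounts to a proportional rescaling, which can be sketched as follows. The function name, source labels, and dollar amounts are hypothetical and serve only to illustrate the arithmetic.

```python
def reallocate(funds_by_source, deleted_cost):
    """Subtract the deleted project cost from the total, then split the
    remainder across sources in the originally reported proportions."""
    total = sum(funds_by_source.values())
    remaining = total - deleted_cost
    return {source: remaining * amount / total
            for source, amount in funds_by_source.items()}

# Hypothetical institution: funds in millions of dollars by source.
reported = {"federal": 2.0, "state": 3.0, "institutional": 5.0}
revised = reallocate(reported, deleted_cost=4.0)
print(revised)
```

Each source keeps its original share of the total (here 20%, 30%, and 50%), so only the level of funding changes, not its distribution.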
Finally, the regression models used to impute the FY 2003 new construction and source of funding data were rerun with the new data. The FY 2003 data related to new construction and source of funding for new construction shown in the FY 2005 tables reflect the revised data.
Data published in this report are also available on the World Wide Web and can be found at http://www.nsf.gov/statistics. Data are also available for this and other surveys through the Integrated Science and Engineering Resources Data System (WebCASPAR), which can be accessed via the Web at http://webcaspar.nsf.gov/. All microdata (except confidential items on condition of space and research animal space) for part 1 and part 2 are available in the data file called NSF Survey of Science and Engineering Research Facilities (Not Weighted or Imputed) in the WebCASPAR database system.
 Johns Hopkins University and Applied Physics Lab completed separate survey forms, but their data were combined on the data file and are treated as a single institution in all published tables and study reports. The final population of 477 counts Johns Hopkins University and Applied Physics Lab as a single institution.