Survey Methodology

Response Rates and Imputation for Nonresponse


Response Rates

By the FY 1992 survey closing date of July 28, 1993, completed questionnaires had been received from 456 of the 461 academic institutions in the survey sample. This represents a 98.9-percent response rate, including 100 percent of the Top 100 institutions. Responses were received from 99 percent of the doctorate-granting institutions, which account for 98 percent of the R&D expenditures in the S&E fields. In addition, all 19 FFRDCs responded.

During the FY 1991 survey, completed questionnaires were received from 446 of the 459 academic institutions (97.2 percent), all Top 100 institutions, and all 19 FFRDCs. In FY 1990, questionnaires were received from 452 of the 460 academic institutions (98.3 percent), 99 of the Top 100 institutions, and all 18 FFRDCs.

Response Burden

The questionnaire asks respondents to report the number of person-hours required to complete the survey form and provides a contact person at NSF to whom comments about the response burden can be directed. At the end of each survey cycle, the average number of hours is calculated for those institutions that indicated any response burden. For FY 1992, doctorate-granting institutions reported an average of 21.0 burden hours, compared with 22.5 hours in FY 1991 and 18.3 hours in FY 1990. Master's-granting institutions reported an average of 11.2 burden hours in FY 1992, compared with 9.2 hours in FY 1991 and 8.6 hours in FY 1990. Institutions that grant a bachelor's degree or below reported 4.5 burden hours in FY 1992, 4.0 hours in FY 1991, and 3.3 hours in FY 1990.

Imputation Methodology

In order to provide national totals of all academic R&D expenditures, it is necessary first to develop estimates for the 1 to 3 percent of the survey population that did not respond.

Data imputation is an automated procedure that estimates data for totally and partially nonrespondent institutions. Imputation involves calculating inflator/deflator factors for certain institution classes (determined by each institution's highest degree granted and type of control) from fully responding institutions for three key variables: total R&D expenditures, federally financed R&D expenditures, and total research equipment expenditures. The imputation factors are applied to the previous year's key variable values for each nonrespondent institution to derive a current-year estimate. These factors, when applied to institutions in each class, reflect the average annual growth or decline in expenditures for reporting institutions in that class. The key variables are then distributed among the various subtotal and detailed fields using the same relative percentages that were last reported by that institution. If no previous percentages are available for an institution, the summary percentages for the institution's class are used.
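The class-factor procedure described above can be sketched as follows. This is a minimal illustration, not NSF's production code; the function names, class composition, dollar amounts, and funding-source shares are all hypothetical.

```python
# Hedged sketch of the class-factor imputation described above.
# All figures and field names below are hypothetical illustrations.

def class_factor(reported):
    """Growth factor for one institution class: ratio of current-year to
    prior-year totals across fully responding institutions, for one key
    variable (e.g., total R&D expenditures)."""
    current = sum(cur for cur, _ in reported)
    prior = sum(prev for _, prev in reported)
    return current / prior

def impute_total(prior_value, factor):
    """Current-year estimate for a nonrespondent: its previous-year value
    scaled by the class growth factor."""
    return prior_value * factor

def distribute(total, last_shares):
    """Spread an imputed key-variable total across detail fields using the
    institution's last-reported relative percentages."""
    return {field: total * share for field, share in last_shares.items()}

# Responding institutions in one class: (current, prior) totals, $ thousands.
reported = [(1100.0, 1000.0), (2300.0, 2000.0), (660.0, 600.0)]
factor = class_factor(reported)         # average growth for the class
estimate = impute_total(500.0, factor)  # nonrespondent reported 500.0 last year
detail = distribute(estimate, {"federal": 0.6, "industry": 0.1, "other": 0.3})
```

Because the detail fields are filled from relative percentages, they sum back to the imputed key-variable total by construction, which keeps the subtotal and detail fields internally consistent.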

Imputation of all data fields is performed for totally nonrespondent institutions participating in the previous year's survey. Partial imputation is performed for institutions that omitted data for some of the questionnaire items.

Item Nonresponse and Imputation Rates

Imputation/estimation rates for survey data cells are calculated for all academic institutions and for various institution classes determined by each institution's highest degree granted and type of control. Engineering and mathematical sciences received the lowest imputation rate, 0.4 percent, of all academic S&E fields. Psychology received the highest imputation rate, at 2.0 percent. For the sources-of-funding category at academic universities and colleges, the lowest imputation rate was 0.3 percent, for Federal Government and "all other sources," and the highest was 1.7 percent, for industry.

Retro-imputation Based Upon Subsequent Data Submissions

A significant number of institutions in the survey universe are intermittent respondents; they provide data one year, do not respond in one or more subsequent years, and then provide data again. Data for the years in which no response is received are imputed, as described in the previous section. Although the imputation algorithm accurately reflects national trends, it cannot account for reporting anomalies at individual institutions. For this reason, after current-year imputation, a separate retro-imputation of previous years' data is performed.

For each institution, key variables for items 1-3 that were formerly imputed are compared with subsequent submissions to determine whether the imputed data accurately represented the growth patterns shown by the reported data. Retro-imputation is applied when the imputed data are not consistent with the reported data. If data were reported for FY 1989 and FY 1992 but not for the intervening years, for example, the difference between the reported figures for each item total is calculated and these amounts are then linearly interpolated across the intervening years. The new figures are spread across disciplines or sources of support on the basis of the most recent reporting pattern. These procedures result in more consistent reporting trends for individual institutions but have little effect upon aggregated figures reflecting national totals.
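The interpolation step in the FY 1989/FY 1992 example above can be sketched briefly. This is an illustrative reconstruction under stated assumptions, not the survey's actual code; the years, dollar amounts, and funding-source shares are hypothetical.

```python
# Hedged sketch of the retro-imputation step described above: linearly
# interpolate a key-variable total between two reported years, then spread
# each interpolated total across detail fields using the most recent
# reporting pattern. All figures are hypothetical.

def interpolate_years(y0, v0, y1, v1):
    """Totals for the years strictly between y0 and y1, placed on the
    straight line connecting the two reported values."""
    step = (v1 - v0) / (y1 - y0)
    return {year: v0 + step * (year - y0) for year in range(y0 + 1, y1)}

# Reported totals for FY 1989 and FY 1992; FY 1990 and FY 1991 were imputed
# and are now revised ($ thousands).
revised = interpolate_years(1989, 900.0, 1992, 1200.0)
# -> {1990: 1000.0, 1991: 1100.0}

# Most recent reporting pattern, used to spread each year's total.
shares = {"federal": 0.55, "state": 0.15, "other": 0.30}
detail = {year: {field: total * share for field, share in shares.items()}
          for year, total in revised.items()}
```

Since each intervening year is pulled onto the line between the two reported endpoints, the revised series removes institution-level anomalies while leaving the reported endpoint years, and hence the aggregated national totals, essentially unchanged.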