Survey Methodology

Data Collection and Processing Activities


Survey Mailout

The mailout of the R&D Expenditures Survey packets is typically completed in late October. Included in each mailout packet are the following:

Optional survey items were included with both the FY 1991 and FY 1992 survey questionnaires. In each year, the optional survey items asked for additional information about the components of institutional funding (item 1 of the survey). In FY 1992, the optional survey item also asked for the amounts of total and federally financed R&D expenditures that were funded through nondepartmental organized research units (ORUs). The optional survey items on institutional funds were designed to ensure complete reporting of all institutional funding components and to further clarify NSF's understanding of the components reported. The item on ORUs was designed to measure the amount of research funded through ORUs, which respondents sometimes have difficulty assigning to the appropriate S&E field.

All mailout materials are produced by the contractor, Quantum Research Corporation (QRC). Institution facsimiles and mailing labels are generated from the previous year's survey database. QRC staffers assemble the survey packets, checking them individually to ensure accuracy before mailing. All survey materials are returned to QRC (except during new contract years when materials are returned to NSF and forwarded to QRC). Institutions that requested an Automated Survey Questionnaire (ASQ) in a previous year also receive an ASQ diskette and instructions in their mailout packets. The most recent questionnaire form and other mailout materials are available.

Receipt and Processing of Survey Materials

Survey acknowledgment postcards are included in each survey packet. As the postcards are received, the receipts are recorded in the appropriate institution response control logs of the database. The postcard requests the expected return date for the completed survey, which is also recorded in the institution response control logs. The postcard also requests notification of changes in respondent names and mailing addresses, and the database is updated accordingly. Respondents who do not return the postcard are telephoned and additional survey packets are mailed as needed. All contacts, mailings, and receipts are recorded in the institution response control logs.

As with the postcards, completed questionnaires are returned to NSF, stamped with the date received, and forwarded to QRC, where the receipts are recorded in the institution response control logs.

At the end of each week a report is generated from the survey database. This weekly report provides the following information on each institution: Top 100 or other standing, type of academic institution, highest degree granted, type of control, state, and stratum code. The report also provides, for each institution, date of postcard receipt, expected and actual receipt date of survey questionnaire, whether the ASQ has been requested and receipt date, and a flag marking a change in any of these conditions during the latest week. Institution counts, aggregated by week, provide similar information on the receipt of postcards, ASQ, and survey questionnaires for academic institutions, Top 100 institutions, and FFRDCs. A count of institutions by response status code is also displayed. This report is sent to the NSF project officer each Monday during the survey processing cycle.
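The status counts in the weekly report can be illustrated with a minimal Python sketch. The function name and the (group, status) input layout are assumptions made for illustration; the source does not describe how the database system produces these counts.

```python
from collections import Counter


def weekly_status_counts(institutions):
    """Count institutions by response status within each reporting group.

    `institutions` is an iterable of (group, status) pairs. The layout is
    hypothetical; the group and status labels are taken from the text.
    """
    counts = {}
    for group, status in institutions:
        # One Counter per reporting group, created on first use.
        counts.setdefault(group, Counter())[status] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("Top 100", "awaiting response"),
        ("Top 100", "clean"),
        ("FFRDC", "awaiting processing"),
        ("Top 100", "clean"),
    ]
    for group, tally in weekly_status_counts(sample).items():
        print(group, dict(tally))
```

Keeping one counter per group mirrors the report's separate tallies for academic institutions, Top 100 institutions, and FFRDCs.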

Follow-up Activities

Continual updating of the institution response control logs in the database is particularly crucial to follow-up activities. A log is maintained for each institution and serves as a central location for all receipts, milestones, and comments. The response control logs are updated daily by QRC. The logs include each respondent's name, title, telephone number, response status, and a detailed chronology of all contacts. Response control logs are generated as needed for follow-up activities or as requested by the NSF project officer. It is possible to generate response control logs either for all institutions or for only those that have not yet responded.

A list of nonrespondents is generated from the control logs, and in early December, follow-up postcards are sent to all institutions that have not acknowledged receipt of the survey packet.

The date of mailing a follow-up postcard is recorded in the institution response control screen. The receipt of follow-up postcards is handled in the same manner as the receipt of acknowledgment postcards.

Follow-up telephone calls begin 2 weeks after the follow-up postcards are mailed to the institutions. A control log listing is again generated to identify all institutions that have not acknowledged receipt of the survey packet (through an acknowledgment postcard, a telephone call, or a follow-up postcard). The data collection coordinator telephones nonrespondents while referring either to the log listing or to the database. The survey packet is described to the respondent in sufficient detail to determine whether it has been received. In cases where the packet has not been received, the address is updated and a new packet is mailed. In cases where the previous year's respondent is no longer available, a new respondent is identified and a new packet is mailed. Institutions are telephoned until acknowledgment of the survey packet has been obtained. All contacts and mailings are recorded in the institution response control log.

A response control log is generated of all institutions that have not returned a completed survey questionnaire. Beginning 2 weeks after the survey due date, a second round of follow-up telephone calls begins that emphasizes responding to the survey rather than acknowledging its receipt. These calls are handled in the same manner as the first round, except that additional emphasis is placed on obtaining responses from the Top 100 institutions. All contacts are recorded in the institution response control log.

Respondents who indicate that they lack sufficient time to properly complete the survey are offered a deadline extension. Respondents who show reluctance to participate are reminded of the national scope of this effort and its consideration by various decision-making bodies, including Congress. During a sample survey year, respondents are informed that the survey is limited to a sample, thus making each institution's participation even more crucial. Respondents who continue to have difficulty in finding the time to complete the survey or are reluctant to participate are asked to provide only data for item 1--current fund expenditures for separately budgeted R&D in the sciences and engineering, by source of funds. Respondents who still refuse to participate are thanked for their time and informed that NSF personnel may contact them at a future date.

Response control logs are generated periodically to help identify those institutions requiring additional contact. The data collection coordinator maintains periodic telephone contact with nonrespondent institutions until completed survey questionnaires are received.

Beginning in March, the NSF project officer is advised of the status of nonrespondent institutions through monthly memoranda. As mentioned earlier, the project officer receives weekly reports on the status of data collection. At the end of each month, the project officer receives an up-to-date log report, a listing of respondents and their address information, and a database status report showing institution response status.

Data Processing

All data entry and editing are accomplished on personal computers using the database management system. Several log-in procedures are followed in order to prepare the survey questionnaires and the institutional records for data entry. When questionnaires provide new information on respondents (name, title, address, telephone number) or institutions (name, highest degree granted, type of control), these changes are made in the database records. Any significant change in institution status (such as name) is immediately conveyed to the NSF project officer by memorandum.

The receipt of a completed survey questionnaire and any accompanying comments are recorded in the institution response control screen. All survey forms are reviewed by the data collection coordinator in preparation for data entry. When questionnaires are returned with incomplete data, the coordinator determines whether imputation of any missing data cells is required. Cells requiring imputation are marked in red with a "B" status code. When respondents report in dollars rather than thousands of dollars, the data collection coordinator converts these figures to thousands of dollars. Respondents occasionally report negative numbers (i.e., credits to accounts), which are acceptable only for survey item 3. If negative numbers are reported for other items, the data are adjusted proportionately to correct the error without altering the subtotal/total figures reported. The revised figures are recorded in red on the questionnaire and assigned an "E" status code to indicate that they are estimates. Each original survey form is filed in the survey folder maintained for that institution.
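The source does not spell out how negative entries are "adjusted proportionately." The Python sketch below shows one possible reading, offered only as an illustration: negative cells are set to zero and the positive cells are scaled so that the detail still sums to the reported total. The function name, the zero-then-rescale interpretation, and the rounding rule are all assumptions, not QRC's documented procedure.

```python
def adjust_negative_cells(cells, total):
    """Zero out negative cells and rescale the rest to sum to `total`.

    Hypothetical reading of "adjusted proportionately"; amounts are in
    thousands of dollars. Returns (adjusted_cells, status_codes).
    """
    if total < 0:
        raise ValueError("reported total must be non-negative")
    positive_sum = sum(v for v in cells if v > 0)
    if positive_sum == 0:
        raise ValueError("no positive cells to rescale")

    # Scale the positive cells so that they alone add up to the total.
    scale = total / positive_sum
    adjusted = [round(v * scale) if v > 0 else 0 for v in cells]

    # Put any rounding residue into the largest cell so the sum is exact.
    residue = total - sum(adjusted)
    largest = max(range(len(adjusted)), key=lambda i: adjusted[i])
    adjusted[largest] += residue

    # Every cell whose value changed is marked "E" (estimate), as in the text.
    status = ["E" if new != old else "" for new, old in zip(adjusted, cells)]
    return adjusted, status


if __name__ == "__main__":
    # Detail of 500, -100, 600 against a reported total of 1,000.
    print(adjust_negative_cells([500, -100, 600], 1000))
```

Under this reading the reported total is never altered; only the detail cells move, and each changed cell carries the "E" status code.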

The final step in preparing for data entry is to change the status of institution records from "awaiting response" to "awaiting processing." For the actual data entry, copies of institution records awaiting processing are sometimes exported (copied) to a diskette or a network drive. This allows personnel to perform data entry at a remote computer while the data collection coordinator continues data processing on the main survey database. Upon completion of data entry the institutions' records are imported (copied) back to the database.

The institution response status after data entry becomes one of the following:

The monthly database status report provides information on institution response status and compares data on source of funds for each institution for the current and the previous year.

Data Editing

Data entered into the database are edited automatically by the database system. All data are checked for arithmetic summation errors and data cells are flagged for any large discrepancies between the current and previous year's data (trend warnings). Once data entry for an institution is complete, a facsimile is generated that contains data for the current year and two previous years. The facsimiles identify all arithmetic errors and/or trend warnings, if any. These facsimiles are reviewed for data entry accuracy.
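The two automatic edits described above, the arithmetic summation check and the trend warning, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the field names, the dictionary layout, and the 50 percent `trend_threshold` are hypothetical, since the source does not say what counts as a "large discrepancy."

```python
def edit_checks(current, previous, trend_threshold=0.5):
    """Return a list of (kind, field, message) flags for one institution.

    `current` and `previous` map field names to amounts in thousands of
    dollars; `current` must contain a "total" key plus detail fields.
    `trend_threshold` is a hypothetical cutoff, not the survey's rule.
    """
    flags = []

    # Arithmetic check: the detail cells must sum to the reported total.
    detail_sum = sum(v for k, v in current.items() if k != "total")
    if detail_sum != current["total"]:
        flags.append(("arithmetic", "total",
                      f"detail sums to {detail_sum}, total is {current['total']}"))

    # Trend check: flag large relative changes from the previous year.
    for field, value in current.items():
        prior = previous.get(field)
        if not prior:  # no usable prior-year figure (missing or zero)
            continue
        change = (value - prior) / prior
        if abs(change) > trend_threshold:
            flags.append(("trend", field, f"{change:+.0%} versus previous year"))
    return flags


if __name__ == "__main__":
    this_year = {"federal": 300, "institutional": 100, "total": 450}
    last_year = {"federal": 100, "institutional": 100, "total": 200}
    for flag in edit_checks(this_year, last_year):
        print(flag)
```

A relative-change rule is only one way to define a trend warning; an absolute dollar cutoff, or a combination of the two, would fit the description equally well.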

"Edit letters" are generated for all institutions that have arithmetic errors and/or trend warnings. These letters thank the respondent for participating in the survey, explain the facsimile, provide a brief explanation of discrepancies in the data, and request revision of data for current and/or previous years. Edit letters and facsimiles are assembled into "edit packets" and mailed to the respondents. Mailing dates are recorded in the response control logs, and copies of the edit packet materials are stored in the survey folders maintained for the institutions.

Prior to mailout, all edit packets are reviewed by the NSF project officer and the data collection coordinator. Error messages and trend warning messages are reviewed. Facsimile messages that represent the most significant reporting problems receive individual notes in red explaining the discrepancy in detail. On occasion the data collection coordinator corrects or verifies the data for institutions that have minor errors (usually rounding errors) or minor trend warnings. Altered data cells receive an "E" status code in the database to indicate that they are estimates, and the response control logs are updated to reflect that the data were corrected or verified. In these cases the institutional respondents are usually contacted by telephone, informed of the changes, and provided an opportunity to adjust the data.

Respondents who do not return their edit packets within 2 weeks of mailing are identified through periodically generated log reports. These respondents are contacted by telephone to resolve data errors and explain trend warnings.
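Identifying overdue edit packets amounts to a date filter over the response control logs. The Python sketch below assumes a hypothetical log layout (an institution identifier mapped to "mailed" and "returned" dates); the actual log structure is not described in the source.

```python
from datetime import date, timedelta


def overdue_edit_packets(log, today, grace=timedelta(weeks=2)):
    """List institutions whose edit packet is unreturned after `grace`.

    `log` maps a hypothetical institution ID to a dict with "mailed" and
    "returned" dates; "returned" is None until the packet comes back.
    """
    return sorted(
        inst for inst, entry in log.items()
        if entry["returned"] is None and today - entry["mailed"] > grace
    )


if __name__ == "__main__":
    log = {
        "A": {"mailed": date(1993, 4, 1), "returned": None},
        "B": {"mailed": date(1993, 4, 1), "returned": date(1993, 4, 10)},
        "C": {"mailed": date(1993, 4, 10), "returned": None},
    }
    print(overdue_edit_packets(log, today=date(1993, 4, 20)))
```

The resulting list corresponds to the respondents who would be contacted by telephone.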

The data collection coordinator records the receipt of edit packets in the response control screens, along with any comments. A new facsimile is generated after all data revisions are made to the institution's screens in the database. If the institution's status code is awaiting imputation or clean, a facsimile is mailed to the institution and a copy of the data revision is filed. When a returned edit packet does not change the response status to awaiting imputation or clean, the respondent is telephoned for additional data revision or explanation.

To ensure quality control, preliminary ranking tables are produced several weeks prior to close-out and reviewed by the data collection coordinator and the NSF project officer. These tables provide several years of data by source of funds and by the major fields of science and engineering. The ranking tables provide an added check for irregularities or unusual trend changes in reported data.

Approximately 2 weeks before survey close-out, the NSF project officer visits QRC to review any major trend fluctuations in institutional data. All institutions whose edit packets are still outstanding are reviewed. The NSF project officer and the data collection coordinator develop estimates--marked with an "E" status code--correcting or verifying these institutions' data. If an edit packet is received for one of these institutions prior to close-out, the estimates are replaced by the new data.

The survey questionnaires received from the institutions are stored in each institution's file folder. The questionnaires are retained for 2 additional survey years, after which time they are discarded.

Survey Close-Out

The survey typically closes out in July. Any data that are received after the database is closed are set aside until the database is expanded for the next fiscal year. Then all revisions are added to the database prior to the next year's mailout so that the changes will be reflected in the survey facsimiles.

Database Management System

QRC has developed a Statistical Database Management System for processing NSF's R&D Expenditures Survey. The system maintains institutional data in a microcomputer database for FY 1979 through the current survey year. The system consists of a series of 10 screens for each institution in the survey population. An institution screen provides identifying information including the respondent's name, address, and telephone number. Institution response control screens are used to maintain a chronological listing of contacts with the institution. The remaining seven screens correspond to items on the survey questionnaire. The institutions are arranged in the database in ascending sequence of Federal Interagency Committee on Education (FICE) codes. Database use is restricted to QRC project personnel by utilizing high-level and low-level passwords. Four complete backups of the database are maintained at all times. A daily backup is made to one of two sets of diskettes and a weekly backup is made to both the QRC local area network (LAN) and a Bernoulli storage cartridge. All output is produced on a Hewlett Packard LaserJet printer.

Automated Survey Questionnaire

QRC has developed an Automated Survey Questionnaire (ASQ), a front-end database system that allows respondents to complete the R&D Expenditures Survey by computer rather than by hand. A full set of instructions is included with the diskette. For institutions with advanced computer systems, an accompanying file format description allows direct data entry either on the diskette or into a file that can be sent to QRC. Occasionally these files are sent to QRC via the Internet. In addition, ASQ provides respondents with the opportunity to edit data for both arithmetic errors and trend warnings before returning the questionnaire (on diskette) to NSF. The respondents can also include comments on the diskette. An ASQ diskette and instructions are included in the mailout packets for institutions that requested ASQ in a previous year. In most cases, data submitted through ASQ are error-free and require no further editing by QRC or NSF.

Progress Reports

In addition to the response logs, coordinator listings, and status reports generated monthly from the database, QRC prepares a written monthly report for NSF describing all survey activities performed during the month. During the survey processing cycle, Microsoft Excel charts are prepared showing the current and previous year weekly survey responses for all institutions, for Top 100 institutions, and for FFRDCs. Attached to the reports are monthly and year-to-date financial summaries.