Survey Methodology, FY 1995

Data Collection and Processing Activities


Survey Mailout

The mailout of the R&D expenditures survey packets was completed in November 1995. Included in each mailout packet were the following:

All mailout materials were produced by the contractor, Quantum Research Corporation (QRC). Institutional facsimiles and mailing labels were generated from the final FY 1994 survey database. QRC staff assembled the survey packets, checking them individually to ensure accuracy before mailing. Because QRC had not yet received the new contract at the time of the mailout, all envelopes showed an NSF return address. An Automated Survey Questionnaire (ASQ) diskette, with instructions, was included in the packet mailed to each institution.

Receipt and Processing of Survey Materials

Each survey packet included an acknowledgment postcard for the institution to confirm receipt of the survey. The postcards were returned to NSF, where they were stamped with the date received; NSF forwarded them to QRC in April. The postcard requested the expected return date for the completed survey, and this information was recorded in each institution's contact log. The postcard also requested notification of changes in respondent names and mailing addresses, and the database was updated accordingly. Respondents who did not return the postcard were telephoned, and additional survey packets were mailed as needed. All contacts, whether made by telephone or by mail, were recorded in the response contact logs.

As with the postcards, completed questionnaires were returned to NSF, where they were stamped with the date received. NSF forwarded them to QRC in April, and QRC recorded them in the contact logs.

Starting in mid-April, a report was generated from the survey database at the end of each week. This weekly report provided the following information on each institution: Top 100 standing, type of institution, highest degree granted, type of control, State, and stratum code. The report also supplied, for each institution, the date of postcard receipt, the expected and actual dates of survey questionnaire or ASQ receipt, and any change in status during the latest week. Institution counts, aggregated by week, provided similar information on the receipt of postcards, the ASQ, and survey questionnaires for academic institutions, Top 100 institutions, and FFRDCs. A count of institutions by response status code was also included. This report was sent to the NSF project officer each Monday during the survey processing cycle.
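The count of institutions by response status code in the weekly report can be illustrated with a short Python sketch. The record layout (the "kind", "top100", and "status" fields) is hypothetical and is not the layout of the QRC database; the status labels are the ones named elsewhere in this section.

```python
from collections import Counter


def weekly_status_counts(institutions):
    """Count institutions by response status for each reporting group.

    Each record is a dict with hypothetical keys:
      "kind"   -- "academic" or "ffrdc"
      "top100" -- True if the institution is in the Top 100
      "status" -- response status label, e.g. "awaiting response"
    """
    counts = {"academic": Counter(), "top100": Counter(), "ffrdc": Counter()}
    for inst in institutions:
        counts[inst["kind"]][inst["status"]] += 1
        if inst["top100"]:
            # Top 100 institutions are tallied a second time in their own group.
            counts["top100"][inst["status"]] += 1
    return counts


# Illustrative records only; these are not survey data.
sample = [
    {"kind": "academic", "top100": True, "status": "clean"},
    {"kind": "academic", "top100": False, "status": "awaiting response"},
    {"kind": "ffrdc", "top100": False, "status": "clean"},
]
counts = weekly_status_counts(sample)
```

Tallying the Top 100 group separately mirrors the way the report presents counts for academic institutions, Top 100 institutions, and FFRDCs side by side.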

Follow-up Activities

Continual update of the contact logs was particularly crucial to followup activities. The logs, which were generated from the information collected in a database screen that was maintained for each institution in the survey population, served as a central location for all receipts, milestones, and comments. The contact logs were updated daily by QRC staff. The logs included each respondent's name, title, telephone number, response status, and a detailed chronology of all contacts. Contact logs were generated as needed for followup activities or as requested by the NSF project officer. The response contact logs could be generated for all institutions or only nonrespondent institutions.

Followup postcards would normally be sent to the institutions that had not acknowledged receipt of their survey packet. The receipt of followup postcards at QRC would be handled in the same manner as the receipt of acknowledgment postcards. Due to the delay in acquiring the FY 1995 bridge contract, QRC did not mail out the followup postcards.

The first round of followup telephone calls began on April 30, 1996, after the survey log-in process had been completed. Contact log reports were again generated to identify all institutions that had not acknowledged receipt of the survey packet (through an acknowledgment postcard, a telephone call, or a followup postcard). The data collection coordinator telephoned those respondents while referring to either the log report or the database. The survey packet was described to the respondent in sufficient detail to determine whether it had been received. In cases where the packet had not been received, the address was updated and a new packet was mailed. In cases where the previous year's respondent was no longer available, a new respondent was identified and a new packet was mailed. Institutions were telephoned repeatedly until acknowledgment of all the survey packets had been obtained. All contacts and mailings were recorded in the database.

Contact log reports were generated for institutions that had not returned a completed survey questionnaire by the survey due date. Beginning in mid-May, a second round of followup telephone calls was made to emphasize responding to the survey rather than acknowledging its receipt. These calls were handled in the same manner as the first round, except that additional emphasis was placed on obtaining responses from the Top 100 institutions. All contacts were recorded in the contact logs.

Respondents who indicated that they lacked sufficient time to properly complete the survey were offered a deadline extension. Respondents who showed reluctance to participate were reminded of the national scope of this effort and its consideration by various decision-making bodies, including Congress. Respondents were informed that because the survey was limited to a sample (during most years), each institution's participation was even more crucial. Respondents who continued to have difficulty finding the time to complete the survey or overcoming their reluctance to participate were asked to provide data only for items 1 and 2 (current fund expenditures in S&E, by source of funds and by field of science and engineering). Total figures were requested as a last resort. Respondents who still refused to participate were thanked for their time and informed that NSF personnel might contact them at a future date.

Contact log reports were generated periodically to help identify those institutions requiring additional followup. The data collection coordinator maintained periodic telephone contact with all institutions until all survey data were received.

Beginning in May, the NSF project officer was advised of the status of nonrespondent institutions through monthly memoranda. As mentioned earlier, the project officer also received weekly reports on the status of survey collection. At the end of each month, the project officer received complete contact log and nonrespondent contact log reports, listings of respondents and addresses, and database status reports showing each institution's response status and item 1 data.

Data Processing

All data entry and editing were accomplished on personal computers using the database management system. Several login procedures were followed in order to prepare the survey questionnaires and the institutional records for data entry. When questionnaires provided new information on respondents (name, title, address, telephone number) or institutions (name, highest degree granted, type of control), these changes were made in the database records. Any significant change in institution status (such as name) was immediately conveyed to the NSF project officer by memorandum.

The receipt of a completed survey questionnaire and any relevant comments made by the respondent were recorded in the institution's contact log. All questionnaires were reviewed by the data collection coordinator in preparation for data entry. When questionnaires were returned with incomplete data, the coordinator determined whether imputation of any missing data cells would be required. Cells requiring imputation were marked in red with a "B" status code. When respondents reported in dollars rather than in thousands of dollars, the data collection coordinator revised these figures in red. Respondents occasionally reported negative numbers (i.e., credits to accounts), which were acceptable only for item 3. If negative numbers were reported for other items, the data were adjusted proportionately to correct the error without altering the subtotal or total figures provided. The revised figures were recorded in red on the questionnaire and assigned an "E" status code to indicate that they were estimates. Each original questionnaire, ASQ diskette, and any other correspondence were filed in survey folders maintained at QRC for each institution.
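The two manual revisions described above (rescaling figures reported in dollars and adjusting disallowed negative cells) can be sketched in Python. The source does not spell out the proportional adjustment rule, so the sketch assumes one plausible reading: negative cells are set to zero and the positive cells are scaled down so that the reported total is unchanged. The function names are hypothetical.

```python
def dollars_to_thousands(cells):
    """Convert figures reported in dollars to thousands of dollars (rounded)."""
    return [round(value / 1000) for value in cells]


def adjust_negatives(cells, total):
    """Remove negative cells without changing the reported total.

    Assumed rule (not confirmed by the source): negative cells become zero
    and positive cells are scaled so that they still sum to `total`.
    Returns (adjusted_cells, status_codes); revised cells carry the "E" code.
    """
    if all(value >= 0 for value in cells):
        return list(cells), [""] * len(cells)
    positive_sum = sum(value for value in cells if value > 0)
    if positive_sum == 0:
        raise ValueError("no positive cells to absorb the adjustment")
    scale = total / positive_sum
    adjusted = [value * scale if value > 0 else 0 for value in cells]
    codes = ["E" if new != old else "" for new, old in zip(adjusted, cells)]
    return adjusted, codes


# Illustrative figures only (thousands of dollars); the cells sum to the total.
adjusted, codes = adjust_negatives([60, -10, 50], total=100)
```

In practice the adjusted cells would also need rounding to whole thousands in a way that preserves the total; that step is omitted here.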

The final step in preparing for data entry was to change the status of institution records from "awaiting response" to "awaiting processing." For the actual data entry, copies of institution records awaiting processing were sometimes exported (copied) to a diskette or a network drive. This allowed personnel to complete data entry at a different computer while the data collection coordinator continued data processing on the main survey database. Upon completion of data entry, the institutions' records were imported (copied) back into the database daily.

The institution response status after data entry became one of the following:

The monthly database status report provided information on institution response status and compared data on source of funds for each institution for the current and the previous year. This year many of the corrections and verifications were handled by telephone to reduce the respondents' paperwork burden.

Data Editing

Data entered into the database were edited automatically by the database system. All data were checked for arithmetical errors, and data cells were flagged for any large discrepancies between the current and previous year's data (trend warnings). After data entry for an institution was completed, a facsimile was generated containing data for the current year and 2 previous years. The facsimiles identified any arithmetical errors and/or trend warnings. These facsimiles were reviewed for data entry accuracy.
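The two automatic edit checks can be illustrated with a minimal Python sketch. The source does not say how large a discrepancy had to be to trigger a trend warning, so the 25 percent threshold below is purely an assumption, as are the function names.

```python
# Hypothetical threshold: the source does not define a "large" discrepancy.
TREND_THRESHOLD = 0.25


def arithmetic_error(detail_cells, reported_total):
    """Return True when the detail cells do not sum to the reported total."""
    return sum(detail_cells) != reported_total


def trend_warning(current, previous, threshold=TREND_THRESHOLD):
    """Return True when the relative year-to-year change exceeds the threshold."""
    if previous == 0:
        # Any nonzero value following a zero is treated as a large change.
        return current != 0
    return abs(current - previous) / abs(previous) > threshold


# Illustrative figures in thousands of dollars; these are not survey data.
has_error = arithmetic_error([100, 200], reported_total=310)
has_warning = trend_warning(current=150, previous=100)
```

A cell that fails either check would appear on the facsimile with the corresponding message, which is what the coordinator reviewed.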

"Edit letters" were generated for all institutions that had arithmetical errors and/or trend warnings. These letters thanked the respondent for participating in the survey, explained the facsimile, provided a brief explanation of discrepancies in the data, and requested revision of data for current and/or previous years. Edit letters and facsimiles were assembled into edit packets and mailed to the respondents. The mailing was recorded in the response contact logs, and copies of the edit packet materials were stored in the file folders maintained for each institution.

Prior to mailing, all edit packets were reviewed by the data collection coordinator. Error messages and trend warning messages were reviewed. Facsimile messages that represented the most significant reporting problems received individual notes in red explaining the discrepancy in detail. The data collection coordinator corrected or verified the data for institutions that had minor errors or trend warnings, as necessary. Any altered data cells were marked with an "E" status code in the database to indicate that they were estimates, and the contact logs were updated to reflect that the coordinator had corrected or verified the data. In these cases the institutional respondents were often contacted by telephone, informed of the changes, and allowed an opportunity to adjust the data.

Respondents who did not return the edit packets within 2 weeks of mailing were identified through periodically generated contact log reports. These respondents were contacted by telephone to resolve data errors and explain trend warnings.
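Identifying respondents whose edit packets were more than 2 weeks overdue amounts to a date comparison against the mailing log. The following Python sketch assumes a hypothetical mapping from institution identifier to mailing date; the dates shown are illustrative only.

```python
from datetime import date, timedelta


def overdue_edit_packets(mailed, returned, today, grace=timedelta(weeks=2)):
    """List institutions whose edit packet is still out after the grace period.

    mailed   -- dict mapping institution id to the date the packet was mailed
    returned -- set of institution ids whose packets have come back
    """
    return sorted(
        inst
        for inst, sent in mailed.items()
        if inst not in returned and today - sent > grace
    )


# Hypothetical identifiers and dates.
mailed = {"A": date(1996, 6, 1), "B": date(1996, 6, 10), "C": date(1996, 6, 1)}
overdue = overdue_edit_packets(mailed, returned={"C"}, today=date(1996, 6, 20))
```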

The data collection coordinator recorded the receipt of edit packets in the contact logs, along with any comments. A new facsimile was generated after all data revisions were made to the institution's screens in the database. If the institution's status code was "awaiting imputation" or "clean," a facsimile with a thank-you letter was mailed to the institution and a copy of the data revision was filed. When a returned edit packet did not change the institution's status to "awaiting imputation" or "clean," the respondent was telephoned for additional data revision or explanation.

To ensure quality control, preliminary ranking tables were produced several weeks prior to closeout for the NSF project officer's review. These tables provided several years of data for individual institutions by sources of funding and major fields of science and engineering. The preliminary tables provided a means to check for irregularities or unusual trend changes in reported data.

The NSF project officer visited QRC 2 weeks before closeout to review any major trend fluctuations in institutional data. Any institution that had not returned an edit packet by this date, or that was still awaiting correction or verification, was inspected. The NSF project officer and the data collection coordinator developed estimates, marked with an "E" status code, to correct or verify these institutions' data. If an edit packet was received for one of these institutions prior to closeout, the estimates were replaced by the reported data.

The survey questionnaires received from the institutions are stored in each institution's file folder. The questionnaires are retained for 2 survey years, after which they are discarded.

Survey Closeout

The survey closeout was August 30, 1996; no completed survey questionnaires or edit letters were received after this date. Typically any data that are received after the database is closed are set aside until the database is expanded for the next fiscal year. Then all revisions are added to the database prior to the next year's mailout so that the changes will be reflected in the survey facsimiles.

Database Management System

QRC has developed a database management system for processing NSF's R&D expenditures survey. The system maintains institutional data for the current and 16 previous years in a personal computer database. The system consists of a series of nine screens for each institution in the survey population. An institution screen provides identifying information including the respondent's name, address, and telephone number. Screens for each contact log are used to maintain a chronological listing of contacts with the institution. The remaining seven screens correspond to items on the survey questionnaire. The institutions are arranged in the database in ascending sequence of Federal Interagency Committee on Education (FICE) codes. Database use was restricted to QRC project personnel by utilizing high-level and low-level passwords. Five complete backups of the database were maintained at all times. Daily backups were made to one of three locations on the local area network and a weekly backup was made to one of two Bernoulli storage cartridges. All output was produced on a Hewlett Packard LaserJet printer.
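The backup arrangement described above (three daily locations plus two weekly cartridges, for five copies in all) can be sketched as a simple rotation. The source does not say how the target was chosen each time; the round-robin rule and the slot names below are assumptions made for illustration.

```python
from datetime import date

# Hypothetical slot names; the source gives only the number of locations.
DAILY_SLOTS = ["lan_location_1", "lan_location_2", "lan_location_3"]
WEEKLY_SLOTS = ["cartridge_1", "cartridge_2"]


def daily_target(day):
    """Pick the network location for the daily backup (assumed round-robin)."""
    return DAILY_SLOTS[day.toordinal() % len(DAILY_SLOTS)]


def weekly_target(day):
    """Pick the cartridge for the weekly backup, alternating by ISO week."""
    iso_week = day.isocalendar()[1]
    return WEEKLY_SLOTS[iso_week % len(WEEKLY_SLOTS)]


# Three consecutive days land on three different network locations.
days = [date(1996, 5, 6), date(1996, 5, 7), date(1996, 5, 8)]
daily_targets = [daily_target(d) for d in days]
```

Under this rotation each slot always holds a complete copy, so the five slots together account for the five backups maintained at all times.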

Automated Survey Questionnaire (ASQ)

QRC has developed an Automated Survey Questionnaire, which is a front-end database system that allows respondents to complete the R&D expenditures survey by computer rather than by hand. This program was included in all survey mailout packets, along with a full set of usage instructions. For institutions with advanced computer systems, an accompanying file format description allowed direct data entry either on the diskette or into a file that could be sent through File Transfer Protocol (FTP) or e-mail to QRC. Use of the ASQ provided respondents with the opportunity to edit data for both arithmetical errors and trend warnings before returning the questionnaire on diskette to QRC. Respondents could also include comments on the diskette. In most cases, data submitted through the ASQ required less followup and editing than questionnaires received in hard copy.

For the FY 1995 survey, 265 of the 469 institutions participating in the survey (about 56.5 percent) returned the ASQ diskettes.

Progress Reports

In addition to the contact log reports, coordinator listings, and status reports that were generated monthly, QRC prepared a monthly progress report for NSF describing all survey activities performed. During the survey processing cycle, Microsoft Excel charts were prepared showing the FY 1995 and FY 1994 weekly survey responses for all institutions, Top 100 institutions, and FFRDCs. Attached to the monthly reports were monthly and year-to-date financial summaries.
