Graduate Students and Postdoctorates in Science and Engineering: Fall 2007
Appendix A. Technical Notes
The Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS) is an annual census of all known academic institutions in the United States that grant master's degrees or research doctorates, make postdoctoral appointments, or employ doctorate-holding nonfaculty researchers in science, engineering, and selected health fields. The data collected in the 2007 GSS represent national estimates of graduate student enrollment and postdoctoral employment as of fall 2007.
In 2007 the survey universe consisted of 700 schools at 582 academic institutions: 493 schools at 375 doctorate-granting institutions and 207 schools at 207 master's-granting institutions. Data collected included demographic and funding information for graduate students and postdoctoral appointees, and counts of doctorate-holding nonfaculty researchers.
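As a quick arithmetic check, the school and institution counts stated above can be summed by degree level. The sketch below only restates figures from the text; the dictionary layout and key names are illustrative assumptions.

```python
# Survey universe counts for 2007, as stated in the text.
universe = {
    "doctorate-granting": {"schools": 493, "institutions": 375},
    "master's-granting": {"schools": 207, "institutions": 207},
}

# Sum each measure across the two degree levels.
total_schools = sum(level["schools"] for level in universe.values())
total_institutions = sum(level["institutions"] for level in universe.values())

print(total_schools, total_institutions)
```

The sums agree with the stated totals of 700 schools and 582 institutions.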
In the 2007 GSS survey cycle new procedures were introduced to address suspected undercoverage of GSS-eligible units, and the list of GSS-eligible fields was revised. As a result, the survey universe for 2007 is different from that in prior years (see "Survey Instrument and Procedures").
Table A-1 shows the number of institutions, schools, and organizational units (i.e., departments, degree-granting programs, research centers, and health facilities) by degree level covered by the GSS, and shows estimated total enrollment annually between 1966 and 2007. Changes in the survey that affect comparability of these data are as follows:
Survey Instrument and Procedures
The GSS is undergoing a redesign, the implementation of which began with the 2007 data collection. Changes to the survey instrument are described below.
Data on graduate student enrollment are collected by field of study from administrative records. A Web survey system is the primary mode of data submission. The survey cycle was launched in October and concluded in June.
The 2007 GSS Web survey consisted of two parts. Part 1 (formerly Form 811) required the identification of "units," a new term that refers to GSS-eligible departments, degree-granting programs, research centers, or health facilities within the reporting school. Beginning with the 2007 data collection, Part 1 could only be completed in the Web survey system.
Part 2 (formerly Form 812) collected counts of graduate students, postdoctoral appointees, and doctorate-holding nonfaculty researchers. A paper worksheet was provided for preparing figures to later be entered in Part 2 of the Web survey. To assist with the transfer of information, the content and table format of the paper worksheet were identical to Part 2 of the Web survey. A small number of school coordinators chose to submit Part 2 data using the paper worksheet.
Institutions select a coordinator for each school that grants a graduate degree in an eligible field. School coordinators for the GSS are responsible for the following:
Revisions Affecting Survey Universe
Units. The Web survey was redesigned in 2007 in an effort to avoid exclusion of eligible degree-granting units and other GSS-relevant entities, such as university-affiliated research centers and hospitals that employ postdoctoral appointees, nonfaculty researchers, or both. In 2007 an exhaustive list of GSS-eligible degree-granting programs, with the corresponding GSS field and GSS field code for each, was provided to the school coordinator in the survey materials. In previous years this list showed only representative degree-granting programs, GSS fields, and codes. School coordinators used the list to identify units within the scope of the GSS and to assign the appropriate GSS field code to those units. School coordinators were instructed to update their unit list with both teaching units and research units. To submit Part 1, the school coordinator was required to confirm that the unit list was complete and that an appropriate GSS field code had been assigned to each unit.
Until 2007 school coordinators were unable to reassign the GSS field for a given unit without deleting the existing unit and replacing it with a new unit with a new field code. Once this was done, the replacement unit was no longer linked to the prior-year data of the unit it replaced. In 2007 school coordinators were able to change the field code by selecting a more appropriate field from a list. Consequently, school coordinators may have been more likely to change a unit to its appropriate field.
Fields of study and degree-granting programs. A comprehensive review of GSS-eligible fields led to several changes to the classification scheme, and GSS-eligible degree-granting programs were updated from the 1990 to the 2000 Classification of Instructional Programs (CIP) taxonomy of the National Center for Education Statistics (NCES). Degree-granting programs that had previously been represented by a four-digit CIP code are now represented at the six-digit level of specificity.
Due to these adjustments to the taxonomy and other methodological changes introduced in 2007, data for 2007 are not directly comparable with data from previous years. For trend analyses, the data tables provide estimates of the counts that would have been collected in 2007 had the 2006 methodology been used (see "Bridge-Year Data Calculation and Display," below).
Revisions to Instructions and Definitions
In 2007 all survey instructions, including definitions, were reviewed and revised to streamline instructions and clarify descriptions of eligibility and definitions of items to be reported. For the Web instrument, this included enhanced help-system capabilities, such as topic searches and navigation.
Bridge-Year Data Calculation and Display
Due to the methodological changes introduced in 2007, including modifications to the set of GSS-eligible fields, most data tables provide data for 2007 in two ways: "2007old" and "2007new." Data shown under 2007old provide estimates of the counts that would have been collected in 2007 had the 2006 methodology been used. Counts reported under 2007new were collected using the methodology introduced in 2007.
To derive counts for 2007old, all units that were reported in the 2006 data collection and retained in 2007 were given the GSS field assigned in 2006. This is consistent with the 2006 GSS coding because the Web survey system before 2007 did not have a direct mechanism for changing GSS codes, and very little recoding was done. Any new unit added in 2007 was given the GSS field code assigned to it, with the following exceptions:
The 2007old counts are based on a subset of the 2007 data due to the first exception listed above. The 2007old counts are not entirely comparable to 2006 counts because of the exclusion of some formerly eligible units. Of the 12,629 units collected in the 2007 survey, 380 were excluded from the 2007old counts because they would not have been eligible for the 2006 data collection.
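The field assignment used for the 2007old counts can be sketched roughly as follows. This is a simplified illustration, not the survey contractor's procedure: the `Unit` record, the `eligible_under_2006` flag, and the placeholder field codes are hypothetical, and only the exclusion of units that would not have been eligible in 2006 is modeled; any other exceptions are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    unit_id: str
    field_2007: str             # GSS field code assigned in the 2007 collection
    field_2006: Optional[str]   # None if the unit was newly added in 2007
    eligible_under_2006: bool   # hypothetical flag: in scope under 2006 rules?


def field_for_2007old(unit: Unit) -> Optional[str]:
    """Return the field used for 2007old counts, or None if the unit is excluded."""
    if not unit.eligible_under_2006:
        return None                  # excluded from the 2007old counts
    if unit.field_2006 is not None:
        return unit.field_2006       # retained unit keeps its 2006 field
    return unit.field_2007           # new unit keeps the field assigned in 2007


# Unit counts stated in the text.
TOTAL_UNITS_2007 = 12_629
EXCLUDED_FROM_2007OLD = 380

units_in_2007old = TOTAL_UNITS_2007 - EXCLUDED_FROM_2007OLD
share_excluded = EXCLUDED_FROM_2007OLD / TOTAL_UNITS_2007

print(units_in_2007old, f"{100 * share_excluded:.1f}%")
```

Under these figures, the 2007old counts rest on 12,249 units, with about 3.0% of the units collected in 2007 excluded.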
A comparison of 2007old with 2007new data reflects differences due to the three newly added science fields and to the recoding of units from their 2006 fields to other fields.
Tables in this report that present only 2007 data show only the 2007new counts. Additional technical tables presenting the 2007old data for these tables are available upon request.
An interim deadline of November 30 was established for Part 1, the update of the unit list. Schools that missed this Part 1 deadline received special attention from the survey contractor early in the survey cycle. The deadline for submitting data for Part 2 was extended by one month from previous years to the end of February; this extended deadline for Part 2 did not adversely affect response rates or the timely close-out of the survey.
From 2004 through 2006, a unit was considered a complete respondent if it reported complete row and column totals in the three data-collection tables and a partial respondent if it reported only grand totals for these three tables. As in previous years, data tables in the Web survey were prefilled with zeros. In 2007 a checkbox was added above the data tables on each of these screens. The respondent was required to check this box to acknowledge explicitly that the unit had no individuals to report for that particular table, allowing true zeros to be distinguished from nonresponse for the table. Prefilled zeros were considered legitimate responses if the data table screen was visited and left with all zeros in place. Any unit that did not meet the requirements for complete or partial respondent status was considered a nonrespondent.
In 2007 complete response status required complete row and column totals for all tables, with all details summing to the totals. Tables with a completed checkbox, indicating no individuals to report, contributed to a complete response for the unit. Tables with unchanged, prefilled zeros and a blank checkbox disqualified the unit from complete response status.
In 2007 units that had only complete row and column totals for all three tables were counted as partial respondents. As in prior years, units that reported only grand totals for all three tables were counted as partial respondents. In 2007 an allowance was made for units that provided complete or partial data for at least one (but not all) of the three tables. These units were also counted as partial respondents.
These new response rate calculations adhere to American Association for Public Opinion Research (AAPOR) standards for computing response rates.
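The 2007 respondent-status rules described above can be summarized in a short sketch. The `TableStatus` categories and the function name are illustrative assumptions, not part of the GSS system. The sketch also assumes that a unit with a checked "no individuals" box on some tables and untouched tables elsewhere counts as a partial respondent, which the text does not state explicitly.

```python
from enum import Enum


class TableStatus(Enum):
    COMPLETE = "complete"              # details plus row/column totals; details sum to totals
    NONE_TO_REPORT = "none_to_report"  # "no individuals to report" checkbox checked
    TOTALS_ONLY = "totals_only"        # complete row and column totals, no details
    GRAND_TOTAL_ONLY = "grand_total"   # only a grand total reported
    UNTOUCHED = "untouched"            # prefilled zeros left in place, checkbox blank


def classify_unit_2007(tables):
    """Classify a unit from the statuses of its three data-collection tables."""
    if len(tables) != 3:
        raise ValueError("expected statuses for exactly three tables")
    fully_answered = {TableStatus.COMPLETE, TableStatus.NONE_TO_REPORT}
    if all(t in fully_answered for t in tables):
        return "complete"
    if any(t is not TableStatus.UNTOUCHED for t in tables):
        return "partial"      # totals only, grand totals only, or some tables answered
    return "nonrespondent"


print(classify_unit_2007([TableStatus.COMPLETE,
                          TableStatus.NONE_TO_REPORT,
                          TableStatus.COMPLETE]))
```

A unit with one complete table and two untouched tables would be classified as a partial respondent under this sketch, reflecting the allowance introduced in 2007.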
In 2007, the GSS received complete responses from 11,020 (87.3%) of the 12,629 eligible units. An additional 1,290 units (10.2%) were considered partial respondents. The remaining 319 units (2.5%) were classified as nonrespondents.
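The percentages above follow directly from the unit counts. The short check below recomputes them; the variable names are illustrative, and the calculation is a plain share of eligible units rather than a full implementation of any AAPOR formula.

```python
# Unit counts by response status for 2007, as stated in the text.
eligible_units = 12_629
counts = {"complete": 11_020, "partial": 1_290, "nonrespondent": 319}

# The three categories should account for every eligible unit.
assert sum(counts.values()) == eligible_units

# Share of eligible units in each category, in percent, to one decimal place.
rates = {status: round(100 * n / eligible_units, 1) for status, n in counts.items()}

print(rates)
```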
New data collection procedures introduced in the 2007 survey cycle (see "Survey Instrument and Procedures") appear to have greatly improved coverage at the reporting unit level. In the 2007 survey cycle, 1,273 units were added, as compared with 328 units in 2006. The dramatic increase in the number of units added in the 2007 data collection suggests that there was undercoverage of GSS-eligible units in previous survey years.
Retrieval and Editing
Minimal post-data collection editing was required for the 2007 data because the redesigned Web survey system yielded fewer errors in submitted data:
Item Nonresponse and Imputation
Across the 201 items collected in the 2007 GSS, the mean item nonresponse rate was 6.6%, with rates for individual items ranging from 2.7% to 9.8%. All missing data were imputed.
Different imputation techniques were used for units with and without reported data in the last 5 years. For units with at least 1 year of reported data in the last 5 years, a carry-forward imputation method was used. Inflation factors were calculated for four key totals to account for year-to-year change. The previous year's key totals were then multiplied by these inflation factors to calculate the imputed values for the current year's key totals. Finally, all other variables were imputed by distributing the imputed key totals according to the previous year's proportions.
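The carry-forward steps can be illustrated with a minimal sketch. The text does not say how the inflation factors were calculated, so the ratio-of-sums definition below is an assumption, as are the function names, the sample numbers, and the `full_time`/`part_time` breakdown.

```python
def inflation_factor(prev_totals, curr_totals):
    """Year-to-year change for one key total.

    Assumed definition: the ratio of current-year to prior-year sums over
    units that reported in both years.
    """
    prev_sum = sum(prev_totals)
    if prev_sum == 0:
        return 1.0  # no basis for adjustment; carry values forward unchanged
    return sum(curr_totals) / prev_sum


def carry_forward(prev_key_total, prev_breakdown, factor):
    """Impute a key total, then distribute it by the prior year's proportions."""
    imputed_total = prev_key_total * factor
    if prev_key_total == 0:
        return imputed_total, {name: 0.0 for name in prev_breakdown}
    imputed_breakdown = {
        name: imputed_total * value / prev_key_total
        for name, value in prev_breakdown.items()
    }
    return imputed_total, imputed_breakdown


# Hypothetical reporting units: key total rose from 300 to 315 overall.
factor = inflation_factor([100, 200], [110, 205])

# Hypothetical nonresponding unit that reported 40 students the year before.
total, breakdown = carry_forward(40, {"full_time": 24, "part_time": 16}, factor)
print(factor, total, breakdown)
```

In this example the factor is 1.05, so the imputed key total is about 42, split roughly 25.2 and 16.8 in the prior year's 60/40 proportions.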
For units with no reported data in the last 5 years, a nearest neighbor imputation method was used. A donor was identified if it was in the same field as the unit for which data were to be imputed and had the closest number of graduate-level completions reported in the Integrated Postsecondary Education Data System (IPEDS) completions survey. The imputed values were calculated by adjusting the donor's values to account for the difference in the number of graduate-level completions between the two units.
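A minimal sketch of the nearest neighbor step follows. The record layout, field codes, and numbers are hypothetical. The text does not specify how donor values were adjusted for the difference in completions, so the sketch assumes simple ratio scaling.

```python
def find_donor(recipient, candidates):
    """Pick the same-field candidate whose IPEDS completions count is closest."""
    same_field = [c for c in candidates if c["field"] == recipient["field"]]
    if not same_field:
        return None
    return min(same_field,
               key=lambda c: abs(c["completions"] - recipient["completions"]))


def impute_from_donor(recipient, donor):
    """Scale donor values by the ratio of completions (an assumed adjustment)."""
    if donor["completions"] == 0:
        ratio = 1.0  # assumed fallback: use donor values unchanged
    else:
        ratio = recipient["completions"] / donor["completions"]
    return {name: value * ratio for name, value in donor["values"].items()}


# Hypothetical unit with no reported data in the last 5 years.
recipient = {"field": "A", "completions": 30}

# Hypothetical responding units.
candidates = [
    {"field": "A", "completions": 50, "values": {"enrollment": 100}},
    {"field": "A", "completions": 36, "values": {"enrollment": 60}},
    {"field": "B", "completions": 30, "values": {"enrollment": 45}},
]

donor = find_donor(recipient, candidates)
imputed = impute_from_donor(recipient, donor)
print(donor["completions"], imputed)
```

Here the field B unit is ignored despite its identical completions count, because donors must share the recipient's field; the field A unit with 36 completions is chosen and its enrollment is scaled by 30/36.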
Known or Suspected Sources of Nonsampling Error
Cognitive interviews, site visits, and other communications with school coordinators and unit respondents have pointed to a number of possible sources of measurement error. These are discussed below, along with steps taken to minimize their impact on the data.
First, although instructions emphasize that each individual should be enumerated only once, there is anecdotal evidence that some individuals have been counted twice by different school coordinators at the same institution or at institutions offering a joint program. In an attempt to prevent double counting, the Coordinator Contact Information screen in the 2007 Web survey provided names and contact information for all school coordinators at the institution.
Data on the race and ethnicity of graduate students also appear to be subject to some measurement error. The Office of Management and Budget standards treat Hispanics as an ethnic group rather than a racial group. Following these standards, "Hispanic" is not supposed to be counted as a race in the GSS. Cognitive interviews with respondents have revealed that this is a source of confusion and may lead to nonsampling error.
Types of support that are not channeled through the institution, such as self-support, may be underreported. Foreign sources of support may not always be known. School coordinators and unit respondents may also have difficulty breaking down financial information by field, such as when a student is enrolled in one unit but receives support from another. Finally, institutions define mechanisms of support differently (e.g., fellowships vs. traineeships) and may report students according to the institution's definition rather than the definition provided by GSS.
In the 2007 survey cycle, some unit respondents provided notes indicating that although their units did have postdocs, they were unable to provide data for them. This reinforced reports from site visits, cognitive interviews, and other correspondence about the difficulty of providing this information.
Changes in Eligibility and Degree-Granting Status
Institutions are classified as doctorate-granting if at least one GSS-eligible unit confers doctorate degrees. Sixteen institutions changed GSS degree-granting status in 2007. The status of eight institutions or schools changed from eligible to ineligible, based on criteria for inclusion in the GSS (see "Survey Universe," above).
Status changed to doctorate-granting from master's-granting, 9 institutions:
Status changed to master's-granting from doctorate-granting, 7 institutions:
Status changed from eligible to ineligible, 8 institutions/schools:
Institution Name Changes, Mergers, and Joint Programs
A number of institutions reported name changes in 2007.
The Medical College of Ohio merged on 1 July 2006 with the University of Toledo. Units previously reported for the Medical College of Ohio are now reported under the University of Toledo.
In 2007 it was discovered that the Massachusetts Institute of Technology (MIT) and the Woods Hole Oceanographic Institution had both previously reported on graduate students in a joint degree-granting program. Personnel at the institutions agreed that only MIT would report on these students in 2007 and in the future.
With the 2007 detailed statistical tables (DSTs), the GSS discontinued the practice of revising previous years' data based on changes in units' eligibility and institutions' doctorate-granting status in the current survey cycle. Previously, reported counts for a given year fluctuated with each annual report because the current year's eligibility and doctorate-granting status were applied retrospectively to all years in the tables. Except in table 68, counts in the data tables for 2001–06 reflect eligibility and doctorate-granting status as of fall 2006; they have not been adjusted to reflect changes in status that occurred between fall 2006 and fall 2007.
Table 68 historically has listed and ranked each institution that was doctorate-granting in the current survey cycle regardless of doctoral-degree-granting status or eligibility in previous years. Institutions that became ineligible were listed unranked at the end of the table, and eligible master's-granting institutions were not displayed. These rules have been continued in 2007. Thus, in table 68, data in years 2001–06 are counts of graduate students in those institutions that were doctorate-granting in 2007, and totals for 2001–06 in this table differ from totals for 2001–06 in other tables for doctorate-granting institutions in this report.
When requested by the institution, GSS will replace imputed estimates with actual data, but only for the prior survey cycle. During the 2007 GSS survey cycle, one academic institution requested that counts of postdocs and nonfaculty researchers that had been imputed in 2006 be replaced with actual data that had become available. These revisions account for differences between the 2006 and 2007 data tables of reported counts of 2006 postdocs and nonfaculty researchers.
Data collected in 2007 included demographic and funding information for graduate students, postdoctoral appointees, and doctorate-holding nonfaculty researchers. Definitions of key terms follow.
First-time—First-time graduate students are those who have enrolled for graduate credit at the institution at which they are pursuing a degree for the first time in the fall 2007 term.
American Indian or Alaska Native—A person having origins in any of the original peoples of North and South America (including Central America) and who maintains tribal affiliation or community attachment.
Asian—A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent, including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.
Black or African American—A person having origins in any of the black racial groups of Africa.
Native Hawaiian or Pacific Islander—A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific islands.
White—A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.
Hispanic or Latino—A person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race.
Graduate Student Mechanisms of Support
Graduate traineeship—An educational award given to a student selected by the institution.
Graduate research assistantship—An assistantship where most of the student's responsibilities are devoted to research.
Graduate teaching assistantship—An assistantship where most of the student's responsibilities are devoted to teaching.
Other types of support—All other mechanisms of support for full-time students, including self-supported students and members of the armed forces whose tuition is paid by the U.S. Department of Defense.
Postdoctoral Appointees (postdocs)
(1) Holds a recent doctoral degree, generally awarded within the last 5 years, such as
(2) Has a limited-term appointment, generally from 5 to 7 years
Postdoctoral mechanisms of support:
Federal traineeship—An educational award from the U.S. government given to a postdoc selected by the institution.
Federal research grant—A type of financial assistance award from the U.S. government to an organization or individual to conduct specific research activities.
Nonfederal support—Support from state and local government; the academic institution; foreign sources (e.g., foreign governments, foreign firms, and agencies of the United Nations); and other U.S. sources, such as support from nonprofit institutions, private industry, and all other nonfederal U.S. sources.
Doctorate-Holding Nonfaculty Researchers
Changes have been made to the coverage and content of GSS to keep it relevant to the needs of data users. Such changes prevent precise maintenance of trend data; therefore, some data items are not available for all institutions in all years. Major changes in the data collected (with the year in which changes became effective) include the following.
Graduate Student Support
Postdocs and Doctorate-Holding Nonfaculty Researchers
Survey Universe
Institutions Surveyed
NSF releases the data from this survey annually in Graduate Students and Postdoctorates in Science and Engineering and includes information from this survey in the Division of Science Resources Statistics (SRS) publications Science and Engineering Indicators and Women, Minorities, and Persons With Disabilities in Science and Engineering. NSF includes selected data items from this survey for individual doctorate-granting institutions in SRS's Academic Institutional Profiles series (http://www.nsf.gov/statistics/profiles/).
Data from this survey are available through the WebCASPAR data system. Public use data files in Excel, SAS, and SPSS formats are available for the years 1972–2007 at http://www.nsf.gov/statistics/srvygradpostdoc/pub_data.cfm. The guide to public use data files is available at http://www.nsf.gov/statistics/srvygradpostdoc/data07/guide2007.doc.
The GSS 2007 public use data structure differs from the GSS 2006 public use data structure. Significant changes include dropping the multi-record structure at the organizational unit level and combining all information associated with the organizational unit into a single-record-per-unit structure. A notable addition is the IPEDS UNITID, included to facilitate linkages to other data files. For more information, see the guide to public use data files available at http://www.nsf.gov/statistics/srvygradpostdoc/data07/guide2007.doc.
 The research doctorate is a research degree that (1) requires an original contribution of knowledge to a field (typically, but not always, in the form of a written dissertation), and (2) is not primarily intended for the practice of a profession. For additional survey information and available data related to graduate student enrollment and postdoctoral appointees in S&E, see http://www.nsf.gov/statistics/srvygradpostdoc/.
 In this report, the term "school" refers to a graduate school, medical school, dental school, nursing school, or school of public health; an affiliated research center; a branch campus; or any other organizational component within an academic institution that grants an S&E or selected health degree, appoints postdocs, or employs doctorate-holding nonfaculty researchers.
 See response rate 3 calculation, page 35, in American Association for Public Opinion Research (AAPOR). 2008. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. 5th ed. Lenexa, KS: AAPOR.