Graduate Students and Postdoctorates in Science and Engineering: Fall 2008
Appendix A. Technical Notes
During the production of this report the America COMPETES Reauthorization Act of 2010 was signed into law. Section 505 of the bill renames the Division of Science Resources Statistics as the National Center for Science and Engineering Statistics (NCSES). The Center retains its reporting line to the Directorate for Social, Behavioral and Economic Sciences within the National Science Foundation. The new name signals the central role of NCSES in the collection, interpretation, analysis, and dissemination of objective data on the science and engineering enterprise.
The Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS) is an annual census of all known academic institutions in the United States that grant master's degrees or research doctorates, or employ postdoctoral researchers (postdocs) or doctorate-holding nonfaculty researchers, in science and engineering (S&E) fields and in selected health fields. The data collected in the 2008 GSS represent national estimates of graduate student enrollment and postdoctoral employment as of fall 2008.
In 2008 the survey universe consisted of 708 schools at 579 academic institutions: 505 schools at 376 doctorate-granting institutions and 203 schools at 203 master's-granting institutions. Data collected included demographic and funding information for graduate students and postdocs, and counts of doctorate-holding nonfaculty researchers by sex.
Table A-1 shows the number of institutions, schools, and organizational units (i.e., departments, degree-granting programs, research centers, and health facilities) by degree level covered by the GSS and shows estimated total annual enrollment in GSS-eligible fields between 1966 and 2008. Changes in the survey that affect comparability of these data are as follows:
Tables A-2 and A-3 show the number of units surveyed by field in doctorate-granting and master's-granting institutions. Table A-4 shows the unit response rates from 1975 through 2008. Tables A-5 through A-12 show imputed data and/or imputation rates for different categories.
Survey Instrument and Procedures
A Web survey system was the primary mode of 2008 data submission. The survey cycle was launched in October 2008 and concluded in June 2009.
The 2008 GSS Web survey consisted of two parts. Part 1 required the identification of organizational units ("units") within the reporting school. Part 1 could only be completed in the Web survey system.
Part 2 collected counts of graduate students, postdocs, and other doctorate-holding nonfaculty researchers. A paper worksheet was provided for preparing figures to later be entered in Part 2 of the Web survey. To assist with the transfer of information, the content and table format of the paper worksheet were identical to Part 2 of the Web survey. A small number of school coordinators chose to submit Part 2 data using the paper worksheet.
Institutions select a coordinator for each school that grants a graduate degree, or employs postdocs or doctorate-holding nonfaculty researchers in an eligible field. School coordinators for the GSS are responsible for the following:
Revisions Affecting Survey Universe
Units. The Web survey was redesigned in 2007 in an effort to include and appropriately classify all eligible units and to exclude ineligible units. See appendix A, "Technical Notes," in the 2007 report for more detail.
Fields of study and degree-granting programs. In 2007 a comprehensive review of GSS-eligible fields led to several changes to the classification scheme. GSS-eligible degree-granting programs were updated from the 1990 to the 2000 Classification of Instructional Programs (CIP) taxonomy of the National Center for Education Statistics. Degree-granting programs that had previously been represented by a four-digit CIP code are now represented at the six-digit level of specificity. Three newly eligible fields were added to the survey, some programs became ineligible, and others were reclassified. See the 2007 technical notes for more detail.
Due to these adjustments to the taxonomy and other methodological changes introduced in 2007, data for 2007 and 2008 are not directly comparable with data from previous years. For trend analyses, the data tables provide estimates of the counts that would have been collected in 2007 had the 2006 methodology been used (see "Bridge-Year Data Calculation and Display," below).
Revisions to Instructions and Definitions
Due to the rise in online degree programs, NSF received a number of questions about how to treat students who were enrolled in an online degree program but were not U.S. citizens, permanent residents holding green cards, or foreign nationals holding temporary visas. In response, NSF determined that non-U.S. citizens residing outside the United States who are enrolled in an online degree program at a U.S. institution are to be excluded.
Students doing thesis or dissertation research away from a U.S. campus were previously excluded, but they were included in the 2008 survey. The instructions read, "Count all students enrolled in a U.S. institution for credit in a graduate degree program doing thesis or dissertation research work regardless of their location."
Bridge-Year Data Calculation and Display
Due to the methodological changes introduced in 2007, including modifications to the set of GSS-eligible fields, most data tables provide data for 2007 in two ways: "2007old" and "2007new." Data shown under 2007old provide estimates of the counts that would have been collected in 2007 had the 2006 methodology been used. Counts reported under 2007new were collected using the methodology introduced in 2007.
To derive counts for 2007old, all units that were reported in the 2006 data collection and retained in 2007 were assigned the same GSS field as in 2006. This is consistent with the 2006 GSS coding because the Web survey system before 2007 did not have a direct mechanism for changing GSS codes, and very little recoding was done. Any new unit added in 2007 was given the GSS field code assigned to it, with the following exceptions:
The 2007old counts are based on a subset of the 2007 data due to the first exception listed above. A comparison of 2007old with 2007new data reflects differences due to the addition of the three newly added science fields and recoding of units from their 2006 fields to other fields.
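The base rule for assigning 2007old fields can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the dictionary layout, and the sample unit identifiers and field names are hypothetical, and the exceptions to the rule are not modeled.

```python
def field_2007old(unit_id, fields_2006, fields_2007):
    """Return the GSS field used for a unit in the 2007old series.

    fields_2006 and fields_2007 are hypothetical dicts mapping a unit
    identifier to its GSS field in each collection year. Only the base
    rule is implemented; the exceptions to it are not modeled here.
    """
    if unit_id in fields_2006:
        # Unit was reported in 2006 and retained in 2007: keep its 2006 field.
        return fields_2006[unit_id]
    # Unit was newly added in 2007: use the field assigned to it in 2007.
    return fields_2007[unit_id]


# Hypothetical example data.
fields_2006 = {"u1": "Physics"}
fields_2007 = {"u1": "Astronomy", "u2": "Chemistry"}
```

With these sample inputs, unit "u1" keeps its 2006 field even though it was recoded in 2007, while the new unit "u2" takes its 2007 code.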
The deadline for Part 1, the update of the unit list, was November 28, 2008. Schools that missed this Part 1 deadline received special attention from the survey contractor early in the survey cycle. The deadline for submitting data for Part 2 was February 27, 2009.
From 2004 through 2006, a unit was considered a complete respondent if it reported complete row and column totals in the data-collection tables and a partial respondent if it reported only grand totals for these tables. Any unit that did not meet the requirements for complete or partial respondent status was considered a nonrespondent. In 2007 and 2008, complete response status required complete row and column totals for all tables, with all details summing to those totals. Units that had only complete row and column totals for all tables were counted as partial respondents. As in previous years, units that reported only grand totals for all tables were counted as partial respondents.
As in previous years, data tables in the Web survey were prefilled with zeros. Prior to the 2007 survey cycle, prefilled zeros were considered legitimate responses if the data table screen was visited and left with all zeros in place. In 2007 and 2008 a checkbox was placed above the data tables on each of these screens. The respondent was required to check this box to acknowledge explicitly that the unit had no individuals to report for that particular table, allowing true zeros to be distinguished from nonresponse for the table. Tables with a marked checkbox, indicating no individuals to report, contributed to a complete response for the unit. Tables with unchanged, prefilled zeros and a blank checkbox disqualified the unit from complete response status.
In the 2007 and 2008 survey cycles, an allowance was made for units that provided complete or partial data for at least one (but not all) of the tables. These units were counted as partial respondents.
These new response rate calculations adhere to American Association for Public Opinion Research (AAPOR) standards for computing response rates.
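The 2007 and 2008 response-status rules described above can be summarized as a small classification routine. The sketch below is one possible reading of those rules, not the survey contractor's implementation: the per-table status labels and the function name are hypothetical, and it assumes that a table with a marked "nothing to report" checkbox counts as data provided for that table.

```python
from enum import Enum


class TableStatus(Enum):
    """Hypothetical per-table outcomes for one unit."""
    COMPLETE = "complete"                  # totals present, details sum to totals
    TOTALS_ONLY = "totals_only"            # row and column totals without full details
    GRAND_TOTAL_ONLY = "grand_total_only"  # only the grand total reported
    CONFIRMED_ZERO = "confirmed_zero"      # checkbox marked: no individuals to report
    NO_DATA = "no_data"                    # prefilled zeros left, checkbox blank


def classify_unit(table_statuses):
    """Classify a unit as 'complete', 'partial', or 'nonrespondent'."""
    statuses = list(table_statuses)
    if not statuses:
        raise ValueError("a unit must have at least one table")
    fully_answered = {TableStatus.COMPLETE, TableStatus.CONFIRMED_ZERO}
    if all(s in fully_answered for s in statuses):
        return "complete"
    # Any data at all (totals only, grand totals only, or some but not
    # all tables answered) makes the unit a partial respondent.
    if any(s is not TableStatus.NO_DATA for s in statuses):
        return "partial"
    return "nonrespondent"
```

For example, a unit with three complete tables and one untouched table would be a partial respondent under this reading, because the untouched table disqualifies it from complete status.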
In 2008, the GSS received complete responses from 11,560 (87.8%) of the 13,166 eligible units. An additional 1,450 units (11.0%) were partial respondents. The remaining 156 units (1.2%) were nonrespondents.
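The percentages above follow directly from the unit counts. The short sketch below reproduces that arithmetic; the function name is hypothetical, and it computes simple shares of eligible units rather than the AAPOR outcome-rate formula itself.

```python
def response_shares(complete, partial, nonrespondent):
    """Return each response status as a percentage of eligible units."""
    eligible = complete + partial + nonrespondent
    counts = {
        "complete": complete,
        "partial": partial,
        "nonrespondent": nonrespondent,
    }
    return eligible, {k: round(100 * n / eligible, 1) for k, n in counts.items()}


# Unit counts reported for the 2008 GSS.
eligible, shares = response_shares(11_560, 1_450, 156)
print(eligible, shares)
```

The three counts sum to the 13,166 eligible units, and the rounded shares match the percentages reported in the text.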
New data collection procedures introduced in the 2007 survey cycle (see the 2007 technical notes) appear to have greatly improved inclusion of eligible units and exclusion of ineligible units. The number of unit additions increased more than threefold from 2006 to 2007. School coordinators added about the same number of units in 2008 as in 2007. The number of units deleted more than doubled from 2006 to 2007. Although the number of units deleted in 2008 declined from 2007, school coordinators still removed significantly more units than in the 2006 survey cycle. The dramatic increase in the number of units added and deleted in the 2007 and 2008 data collections suggests that there was underreporting of GSS-eligible units and overreporting of ineligible units in previous survey years.
Retrieval and Editing
Minimal editing was required after data collection for the 2008 data because the Web survey system yielded few errors in submitted data. Interactive edit checks ensured that counts provided were internally consistent and within an expected range based on the previous year's data. Unit respondents were asked to explain the discrepancy when counts were substantially different from the response provided in 2007. Data fluctuations that were not sufficiently explained during data collection were flagged for follow-up by telephone call to the school coordinator.
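The two kinds of interactive edit checks described above, internal consistency and comparison with the previous year, might look like the following. This is a sketch under stated assumptions: the function names are hypothetical, and the thresholds (`rel_tol`, `abs_tol`) are illustrative values, since the actual ranges used by the Web survey are not given here.

```python
def is_internally_consistent(details, reported_total):
    """Check that the detail counts sum to the reported total."""
    return sum(details) == reported_total


def needs_explanation(current, previous, rel_tol=0.5, abs_tol=10):
    """Flag a count that differs substantially from last year's value.

    rel_tol and abs_tol are hypothetical thresholds: a change is flagged
    only if it is at least abs_tol individuals and, when a previous count
    exists, more than rel_tol as a fraction of that previous count.
    """
    diff = abs(current - previous)
    if diff < abs_tol:
        return False  # small absolute changes are never flagged
    if previous == 0:
        return True   # large count where nothing was reported last year
    return diff / previous > rel_tol
```

Combining an absolute and a relative threshold is a design choice made for this sketch so that small units are not flagged for changes of only a few individuals.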
There were three other reasons a school's data were reviewed. Any school that had at least one unit with a comment indicating an error in the data (e.g., "We do not have access to race information, so we reported all students as white") was also deemed a candidate for retrieval and review. Any school that reported no graduate students was examined. If the evidence suggested that it was a unit that housed only postdocs or doctorate-holding nonfaculty researchers, then no follow-up was needed. Otherwise, it was identified for retrieval. The reasons for unit deletion provided by school coordinators were also reviewed. If the explanation suggested that the unit should not have been deleted (e.g., "I deleted this unit because I was unable to report numbers"), retrieval was required.
The data review and retrieval effort was more extensive in 2008 than it had been previously: 25% of the units (n = 3,332) were examined, up from 15% in 2007. Three-quarters of the schools (n = 531) either had units that underwent review or were investigated for one of the school-level issues, up from 68% in 2007. As a result of this review process, 14% of all eligible schools (n = 96) underwent retrieval for unit-level or school-level reasons, up from 6% in 2007. School coordinators at two-thirds of these schools (n = 65) recognized an error that was then corrected. Revisions were made directly in the Web survey by the school coordinator or, at the school coordinator's direction, by unit respondents or GSS contractor staff.
Item Nonresponse and Imputation
Of the 216 items collected in the four data-collection tables in the 2008 GSS, the mean item nonresponse rate was 5.0%. The item nonresponse rates ranged from 1.3% to 7.9%. All missing data were imputed.
Different imputation techniques were used for extant units and new units. For units with at least 1 year of reported or imputed data, a carry-forward imputation method was used. Inflation factors were calculated for four key totals to account for year-to-year change. The previous year's key totals were then multiplied by these inflation factors to calculate the imputed values for the current year's key totals. Finally, all other variables were imputed by distributing the imputed key totals according to the previous year's proportions. The same procedure was used in the 2007 imputations, with one exception. In 2007 the carry-forward method was used only if the unit reported data within the previous 5 years. This condition was lifted in 2008 because simulations using the 2007 data revealed that the carry-forward method performed better than other methods, even if the previous data were reported over 20 years ago.
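The carry-forward steps can be illustrated with a minimal sketch. The text does not say how the inflation factors were calculated, so the ratio used in `inflation_factor` (current-year to previous-year totals over a set of reporting units) is an assumption, as are the function names and the example figures.

```python
def inflation_factor(prev_totals, curr_totals):
    """Assumed form: ratio of current to previous key totals across
    units that reported the key total in both years."""
    prev_sum = sum(prev_totals)
    if prev_sum == 0:
        raise ValueError("previous-year totals sum to zero")
    return sum(curr_totals) / prev_sum


def carry_forward(prev_key_total, prev_details, factor):
    """Impute a key total and distribute it by last year's proportions."""
    imputed_total = prev_key_total * factor
    if prev_key_total == 0:
        # No previous distribution to carry forward.
        return imputed_total, {k: 0.0 for k in prev_details}
    details = {
        k: imputed_total * v / prev_key_total for k, v in prev_details.items()
    }
    return imputed_total, details


# Hypothetical figures: responding units grew from 300 to 330 in total.
factor = inflation_factor([100, 200], [110, 220])
total, details = carry_forward(50, {"men": 30, "women": 20}, factor)
```

In this example a unit that had 50 students last year is imputed to have about 55 this year, split in the same 60/40 proportion as before. The sketch leaves imputed values fractional; any rounding rule would be a further assumption.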
When no reported or imputed data existed for a unit in a prior survey cycle, a different approach was needed. For new units with reported totals but no details in 2008, a nearest neighbor imputation method was used. In this method, a donor unit that was "nearest" to the unit whose data were being imputed (imputee) was identified among all responding units having similar characteristics as the imputee (such as having the same GSS code and offering a PhD degree). When graduate student details were being imputed, the nearest neighbor selected had full-time and part-time graduate enrollments that were most similar to the imputee's enrollments. When postdoc and doctorate-holding nonfaculty researcher details were being imputed, the total number of postdocs was used to choose the nearest neighbor. The imputed values were calculated by adjusting the donor's values to account for the difference in full-time and part-time enrollment totals between the two units.
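A nearest neighbor imputation of graduate student details along these lines could be sketched as follows. The distance measure (squared differences in full-time and part-time enrollment) and the proportional adjustment of donor values are assumptions, since the text does not specify either; the `Unit` class, its fields, and the sample GSS codes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    gss_code: str
    offers_phd: bool
    full_time: int
    part_time: int
    ft_details: dict = field(default_factory=dict)  # full-time detail counts
    pt_details: dict = field(default_factory=dict)  # part-time detail counts


def nearest_donor(imputee, donors):
    """Pick the responding unit with the most similar enrollments among
    units sharing the imputee's GSS code and PhD-offering status."""
    pool = [
        d for d in donors
        if d.gss_code == imputee.gss_code and d.offers_phd == imputee.offers_phd
    ]
    if not pool:
        raise ValueError("no eligible donor")
    return min(
        pool,
        key=lambda d: (d.full_time - imputee.full_time) ** 2
        + (d.part_time - imputee.part_time) ** 2,
    )


def scale(details, donor_total, imputee_total):
    """Adjust donor details in proportion to the imputee's total (assumed)."""
    if donor_total == 0:
        # Donor has no distribution to borrow; details stay at zero.
        return {k: 0.0 for k in details}
    return {k: v * imputee_total / donor_total for k, v in details.items()}


def impute_details(imputee, donors):
    donor = nearest_donor(imputee, donors)
    return (
        scale(donor.ft_details, donor.full_time, imputee.full_time),
        scale(donor.pt_details, donor.part_time, imputee.part_time),
    )


# Hypothetical donors and a new unit with totals but no details.
donors = [
    Unit("401", True, 40, 10, {"men": 30, "women": 10}, {"men": 4, "women": 6}),
    Unit("401", True, 200, 50, {"men": 100, "women": 100}, {"men": 25, "women": 25}),
    Unit("402", True, 20, 5, {"men": 10, "women": 10}, {"men": 2, "women": 3}),
]
imputee = Unit("401", True, 20, 5)
ft, pt = impute_details(imputee, donors)
```

Here the unit coded "402" matches the imputee's enrollments exactly but is excluded because its GSS code differs, so the first donor is chosen and its details are scaled down to the imputee's totals.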
In rare circumstances when no data were available from a new unit, Completions and Enrollment data from the Integrated Postsecondary Education Data System (IPEDS) were used to estimate graduate student totals and details. This was a new approach instituted with the 2008 survey cycle based on research that demonstrated its superiority over a nearest neighbor method under these conditions. Because IPEDS does not collect data on postdocs and doctorate-holding nonfaculty researchers, a nearest neighbor was selected from the 2008 GSS data to estimate these counts.
Known or Suspected Sources of Nonsampling Error
Review of the data, cognitive interviews, usability tests, pilot tests, site visits, and other communications with school coordinators and unit respondents have pointed to a number of possible sources of measurement error. These are discussed below, along with any steps taken to minimize the impact on the data where applicable.
There may be overreporting of graduate students working towards practitioner degrees, particularly in health fields. Starting with the 2007 survey cycle, survey materials indicated that students pursuing master's, DDS, or MD degrees in 24 specified fields should be excluded. It is common for school coordinators to provide a comment explaining that they are deleting a unit because the degrees it offers are practitioner-based. This evidence suggests that measurement error has been reduced.
Although instructions emphasize that each individual should be enumerated only once, there is anecdotal evidence that some individuals have been counted twice by different school coordinators at the same institution or at institutions offering a joint program. In an attempt to prevent double counting, the Coordinator Contact Information screen in the Web survey provided names and contact information for all school coordinators at the institution to facilitate communication and allow sharing of data.
Prior to the 2008 survey cycle, cognitive interviews with respondents revealed that black Hispanics and white Hispanics were sometimes counted as "Hispanic—More than one race" rather than "Only one race—Hispanic." In 2008 these two Hispanic categories were collapsed into one, "Hispanic/Latino ethnicity (one or more races)." This proved easier for cognitive interview respondents to comprehend.
Also, increasing numbers of students are choosing not to report their race to their institution, leading to growth over time in the "Unknown/race not stated" GSS category. This leads to gradual declines in the proportion of students reported in some racial and ethnic groups. Because this is a social trend that is not unique to the GSS, steps have not been taken to minimize its impact on the data.
Data on financial support are sometimes difficult for school coordinators to collect and report accurately, in part because the information may not be stored in one centralized database for the institution. Also, types of support that are not channeled through the institution, such as self-support, may be underreported. Foreign sources of support may not always be known. School coordinators and unit respondents may also have difficulty categorizing financial information by field, such as when a student is enrolled in one unit but receives support from another. Finally, institutions define mechanisms of support differently (e.g., fellowships vs. traineeships) and may report students according to the institution's definition rather than the definition provided by the GSS. A recordkeeping study is being conducted to learn more about how such information is stored and accessed.
Some unit respondents report in the Web survey that although their unit has postdocs and/or doctorate-holding nonfaculty researchers, they are unable to provide data for them. A pilot study is being conducted to evaluate alternative collection procedures so that more complete and accurate data may be collected in the future.
Changes in Eligibility and Degree-Granting Status
Institutions are classified as doctorate-granting if at least one GSS-eligible unit confers doctoral degrees. Eight institutions changed GSS degree-granting status in 2008. The status of five institutions or schools changed from eligible to ineligible, based on criteria for inclusion in the GSS (see "Survey Universe," above).
Status changed to doctorate-granting from master's-granting, 2 institutions:
Status changed to master's-granting from doctorate-granting, 6 institutions:
Status changed from eligible to ineligible, 5 institutions/schools:
Institution Name Changes and Mergers
One institution reported a name change in 2008.
OGI School of Science and Engineering merged with Oregon Health and Science University (OHSU) in 2001. For data collection purposes, it became a school within OHSU in the 2008 survey cycle and was renamed the Department of Science and Engineering at Oregon Health and Science University.
With the 2007 detailed statistical tables (DSTs), the GSS discontinued the practice of revising previous years' data based on changes in units' eligibility and institutions' doctorate-granting status in the current survey cycle. Previously, reported counts for a given year fluctuated with each annual report because the current year's eligibility and doctorate-granting status were applied retrospectively to all years in the tables. Except in table 68, counts in the 2008 data tables for 2002–06 reflect eligibility and doctorate-granting status as of fall 2006; they have not been adjusted to reflect changes in status that occurred between fall 2006 and fall 2008.
Table 68 historically has listed and ranked each institution that was doctorate-granting in the current survey cycle regardless of doctoral-degree-granting status or eligibility in previous years. These rules have been continued in 2008. Thus, in table 68, data in years 2002–07 are counts of graduate students in those institutions that were doctorate-granting in 2008, and totals for 2002–07 in this table differ from totals for 2002–07 in other tables for doctorate-granting institutions in this report.
When requested by the institution, the GSS will replace imputed estimates with actual data, but only for the most recent prior survey cycle. No such requests were made in the 2008 survey cycle.
Data collected in 2008 included demographic and funding information for graduate students, postdocs, and doctorate-holding nonfaculty researchers. Definitions of key terms follow.
First-time—Those students enrolled for credit in a graduate-degree program in this organizational unit for the first time in fall 2008. This may include graduate students previously enrolled in another graduate degree program at your institution or at another institution. It may also include students who already hold another graduate or professional degree.
American Indian or Alaska Native—A person having origins in any of the original peoples of North and South America (including Central America) and who maintains tribal affiliation or community attachment.
Asian—A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent, including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.
Black or African American—A person having origins in any of the black racial groups of Africa.
Native Hawaiian or Other Pacific Islander—A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific islands.
White—A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.
Hispanic or Latino—A person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race.
Non-Hispanic/Latino, more than one race—Institutions report persons who indicate more than one race and are non-Hispanic in that category on the GSS form. The reports and DSTs combine multiracial non-Hispanics with those of unknown race because no more than 0.2% of graduate students are identified as such.
Although the survey forms began collecting Asian and Native Hawaiian/Other Pacific Islander data separately in 1999, reports and DSTs have continued to combine these categories as Asian/Other Pacific Islander because less than 0.5% of graduate students have been reported in the Native Hawaiian/Other Pacific Islander category.
From 1999 through 2007, the survey forms collected counts of Hispanics of one race separately from counts of Hispanics reporting two or more races. However, reports and DSTs in these years combined these data in a single Hispanic or Latino category because no more than 0.5% of graduate students were classified as multiracial Hispanics. In 2008 the survey forms combined these categories into a single Hispanic or Latino category.
Graduate Student Mechanisms of Support
Graduate traineeship—An educational award given to a student selected by the institution.
Graduate research assistantship—An assistantship where most of the student's responsibilities are devoted to research.
Graduate teaching assistantship—An assistantship where most of the student's responsibilities are devoted to teaching.
Other types of support—All other mechanisms of support for full-time students, including self-supported students and members of the armed forces whose tuition is paid by the U.S. Department of Defense.
Postdoctoral Researchers (postdocs)
Postdoc—An individual who meets both of the following qualifications:
(1) Holds a recent doctoral degree, generally awarded within the last 5 years, such as
(2) Has a limited-term appointment, generally from 5 to 7 years
Mechanisms of Postdoc Support
Federal fellowship—Any competitive award from the U.S. government (often from a national competition) that requires no work of the recipient.
Federal traineeship—An educational award from the U.S. government given to a postdoc selected by the institution.
Federal research grant—A type of financial assistance award from the U.S. government to an organization or individual to conduct specific research activities.
Nonfederal support—Support from state and local government; the academic institution; foreign sources (e.g., foreign governments, foreign firms, and agencies of the United Nations); and other U.S. sources, such as support from nonprofit institutions, private industry, and all other nonfederal U.S. sources.
Doctorate-Holding Nonfaculty Researchers
Doctorate-Holding Nonfaculty Researchers—All doctorate-holding researchers who (a) are not considered either postdocs or members of the faculty and (b) are involved principally in science and engineering or health research activities.
Changes have been made to the coverage and content of the GSS to keep it relevant to the needs of data users. Such changes prevent precise maintenance of trend data; therefore, some data items are not available for all institutions in all years. Major changes in the data collected (with the year in which changes became effective) include the following.
Graduate Student Support
Postdocs and Doctorate-Holding Nonfaculty Researchers
Survey Universe
Institutions Surveyed
NSF releases the data from this survey annually in Graduate Students and Postdoctorates in Science and Engineering and includes information from this survey in the NCSES publications Science and Engineering Indicators and Women, Minorities, and Persons with Disabilities in Science and Engineering. NSF includes selected data items from this survey for individual doctorate-granting institutions in the NCSES Academic Institutional Profiles series.
Data from this survey are available through the WebCASPAR data system. Public-use data files in Excel, SAS, and SPSS formats are available for the years 1972–2008 at http://www.nsf.gov/statistics/srvygradpostdoc/pub_data.cfm. The guide to public-use data files is available at http://www.nsf.gov/statistics/srvygradpostdoc/data08/guide2008.pdf.
The GSS public-use data structure was modified in the 2007 survey cycle. Significant changes include dropping the multi-record structure at the organizational unit level and combining all information associated with the organizational unit into a single-record-per-unit structure. Another notable addition is the inclusion of the IPEDS UNITID, a unique number for all postsecondary institutions to facilitate linkages to other data files. For more information, see the guide to public-use data files.
 The research doctorate is a research degree that (1) requires an original contribution of knowledge to a field (typically, but not always, in the form of a written dissertation) and (2) is not primarily intended for the practice of a profession. For additional survey information and available data related to graduate student enrollment and postdocs in S&E, see http://www.nsf.gov/statistics/srvygradpostdoc/.
In this report, the term "school" refers to a graduate school, medical school, dental school, nursing school, or school of public health; an affiliated research center; a branch campus; or any other organizational component within an academic institution that grants an S&E or selected health degree or employs postdocs or doctorate-holding nonfaculty researchers.
 See response rate 3 calculation, page 35, in American Association for Public Opinion Research (AAPOR). 2008. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. 5th ed. Lenexa, KS: AAPOR.
 The Office of Management and Budget standards treat Hispanics as an ethnic group rather than a racial group. Following these standards, "Hispanic" is not supposed to be counted as a race in GSS. Cognitive interviews with respondents have revealed that this is a source of considerable confusion. For example, black Hispanics and white Hispanics may be counted as "Hispanic—More than one race" rather than "Only one race—Hispanic." In 2008 these two Hispanic categories were collapsed into one, "Hispanic/Latino ethnicity (one or more races)." The race/ethnicity categories were made to match the Integrated Postsecondary Education Data System (IPEDS) by combining the "Hispanic/Latino, more than one race" and "Hispanic/Latino, one race only" categories.