Survey Info

Summary

The SDR provides demographic, education, and career history information from individuals with a U.S. research doctoral degree in a science, engineering, or health field. The SDR is sponsored by the National Center for Science and Engineering Statistics within the National Science Foundation and by the National Institutes of Health. Conducted since 1973, the SDR is a unique source of information about the educational and occupational achievements and career movement of U.S.-trained doctoral scientists and engineers in the United States and abroad.

Survey Administration

Westat was the data collection contractor for the 2021 SDR.

Survey Details

Status: Active
Frequency: Biennial
Reference Period: The week of 1 February 2021
Next Release Date: TBD

Methodology

Survey Description

Survey Overview (2021 Survey Cycle)

Purpose

The Survey of Doctorate Recipients (SDR), conducted by the National Center for Science and Engineering Statistics (NCSES) within the National Science Foundation, provides data on the characteristics of science, engineering, and health (SEH) doctorate degree holders. It samples individuals who have earned an SEH research doctoral degree from a U.S. academic institution and are less than 76 years of age. The SDR provides data useful in assessing the supply and characteristics of U.S.-trained SEH doctorates employed in educational institutions, private industry, professional organizations, and government in the United States, as well as in other countries worldwide.

Data collection authority

The information is solicited under the authority of the National Science Foundation Act of 1950, as amended, the America COMPETES Reauthorization Act of 2010, and the Confidential Information Protection and Statistical Efficiency Act of 2018. The Office of Management and Budget control number is 3145-0020.

Major changes to recent survey cycle

The 2021 SDR made two types of changes to the data collection instruments. First, for all modes of data collection, the survey included new questions to gauge the effects of the coronavirus pandemic on employment, specifically on labor force status, number of hours worked per week, salary, benefits, telecommuting options, and total earned income. The second change applied to the electronic instruments only. The Web and computer-assisted telephone interview (CATI) instruments included dependent interviewing (DI) methods for a targeted number of items within the employment question series to reduce respondent burden.

Key Survey Information

Frequency

Biennial.

Initial survey year

1973.

Reference period

The week of 1 February 2021.

Response unit

Individuals with an SEH research doctorate degree from a U.S. academic institution.

Sample or census

Sample.

Population size

Approximately 1,185,700 individuals.

Sample size

A total of 125,938 individuals.

Key variables
  • Demographics (e.g., age, race, sex, ethnicity, and citizenship)
  • Educational history
  • Employment status
  • Field of degree
  • Occupation

Survey Design

Target population

The SDR target population includes individuals who meet the following criteria:

  • Earned an SEH research doctoral degree from a U.S. academic institution prior to 1 July 2019
  • Were not institutionalized or terminally ill on 1 February 2021
  • Were less than 76 years of age as of 1 February 2021

Sampling frame

The sampling frame is the Doctorate Records File (DRF), constructed from the annual Survey of Earned Doctorates (SED), which is a census survey of all recipients of U.S. research doctoral degrees.

Sample design

The SDR uses a fixed panel design with a sample of new doctoral graduates added to the panel in each biennial survey cycle. For the 2021 SDR, all 2019 sample members who remained age eligible were retained for the 2021 cycle. As with prior survey cycles, a sample of 10,000 new graduates who had earned their degrees since the last SDR survey cycle, from 1 July 2017 to 30 June 2019, was added. The new graduates sample design followed the same sample design and sample stratification first introduced in 2019, defined by detailed fields of study, gender, and underrepresented minority status.

Data Collection and Data Processing

Data collection

The SDR uses a trimodal data collection approach: self-administered online survey, self-administered paper questionnaire (via mail), and CATI.

Data processing

The data collected in the SDR are subject to both editing and imputation procedures. The SDR uses both logical imputation and statistical (hot-deck) imputation as part of the data processing effort.

Estimation techniques

Because the SDR is based on a complex sampling design and subject to nonresponse bias, sampling weights are created for each respondent to support unbiased population estimates. The final analysis weights account for the following:

  • Differential sampling rates
  • Adjustments for unknown eligibility
  • Adjustments for nonresponse
  • Adjustments to align the sample distribution with the population distribution in the DRF with respect to gender, race and ethnicity, location, degree year, and degree field.

Survey Quality Measures

Sampling error

Estimates of sampling errors associated with this survey were calculated using replicate weights.

Coverage error

Any missed doctoral graduates within the DRF derived from the SED would create undercoverage in the SDR. Reporting errors in the SED could lead to incorrect classification of doctorates as having or not having earned an SEH research doctorate, which could result in either overcoverage or undercoverage.

Nonresponse error

The weighted and unweighted response rates for the 2021 SDR were each 65%. Analyses of SDR nonresponse trends were used to develop nonresponse weighting adjustments to minimize the potential for nonresponse bias in the SDR estimates. A hot-deck imputation method was used to compensate for item nonresponse.

Measurement error

The SDR is subject to reporting errors from differences in interpretation of questions. Although three modes of response were offered (Web, mail, and CATI), almost 99% of respondents completed the survey via the Web instrument. As such, reporting error due to mode differences is substantially diminished.

Data Availability and Comparability

Data availability

Data from 1993 to present are available at the SDR website, https://www.nsf.gov/statistics/srvydoctoratework/.

Data comparability

Year-to-year comparisons can be made among the 1993 to 2021 survey cycles because many of the core questions remained the same. Small but notable differences exist across some survey years, such as the collection of occupation data based on more recent versions of the occupation taxonomy. Also, the SDR target population definition has changed over time as follows:

  • Survey data prior to 2010 did not cover SEH doctorates residing outside of the United States.
  • In 2010 and 2013, full coverage of SEH doctorates residing outside of the United States included only those having graduated since 2001. For graduates from earlier years, the coverage of those residing outside of the United States is partial.
  • The 2015 SDR sample design improved population coverage in the 2015 and subsequent survey cycles to include all SEH doctorates awarded by U.S. institutions, regardless of the academic year of award or the recipient's post-graduation residency location.

Caution is recommended when interpreting or analyzing trends that span pre- and post-1991 surveys, pre- and post-2010 surveys, and pre- and post-2015 surveys given the noted changes in the survey design and target population.

  • Overlap in sample cases across survey cycles supports longitudinal analysis using SDR data. A longitudinal panel representing a cohort of SEH doctorate recipients awarded their degree prior to July 2013 and aged less than 66 years in 2015 was selected, and an initial longitudinal data file with imputation and weights accurately reflecting the longitudinal design was developed. The 2015–19 survey data for this panel are available at Survey of Doctorate Recipients, Longitudinal Data: 2015–19.

Data Products

Publications

Data from the SDR are published in NCSES InfoBriefs and data tables, available at https://www.nsf.gov/statistics/srvydoctoratework/. Information from this survey is also included in Science and Engineering Indicators and Women, Minorities, and Persons with Disabilities in Science and Engineering.

Electronic access

The SDR public use data are available in the SESTAT data tool and in downloadable files through the NCSES data page. Access to restricted data for researchers interested in analyzing microdata can be arranged through a licensing agreement. For more information on licensing, see https://ncses.nsf.gov/about/licensing.

Note

1. The Web and CATI instruments included DI methods for a targeted number of items within the employment question series. With DI, sample member responses from 2019 were preloaded into the 2021 SDR questionnaire and displayed for the respondent. For each of the DI questions, sample members first answered “yes” or “no” to indicate if the information displayed from their 2019 response still applied to the 2021 reference period. If not, the sample member provided updated information on the subsequent screen. Only sample members who participated in 2019 and reported working in both the 2019 and 2021 cycles were eligible for DI.

Survey Overview (2021 Survey Cycle)

Purpose. The Survey of Doctorate Recipients (SDR), conducted by the National Center for Science and Engineering Statistics (NCSES) within the National Science Foundation (NSF), provides data on the characteristics of science, engineering, and health (SEH) doctorate degree holders. A research doctorate is a doctoral degree that (1) requires the completion of an original intellectual contribution in the form of a dissertation or an equivalent culminating project (e.g., a published manuscript) and (2) is not primarily intended as a degree for the practice of a profession. The most common research doctorate degree is the PhD. The SDR samples individuals who have earned an SEH research doctorate from a U.S. academic institution and are younger than 76 years. The SDR provides data useful in assessing the supply and characteristics of the U.S.-trained SEH doctorates employed in educational institutions, private industry, professional organizations, and governments in the United States, as well as in other countries worldwide.

The SDR is designed to provide demographic, education, and career history information about individuals who earned a research doctorate in an SEH field from a U.S. academic institution. The SDR is closely related to another survey of scientists and engineers conducted by NCSES: the National Survey of College Graduates (NSCG, https://www.nsf.gov/statistics/srvygrads/). These two surveys share a common reference date, and they use similar questionnaires and data processing guidelines.

Some of the education and demographic information in the SDR comes from the Survey of Earned Doctorates (SED, https://www.nsf.gov/statistics/srvydoctorates/), an annual census of research doctorates earned in the United States. The SED provides the sampling frame for the SDR through its annual update of the longstanding Doctorate Records File (DRF), a cumulative listing of all U.S.-earned doctorate recipients dating back to 1920.

These technical notes provide an overview of the 2021 SDR. Complete details are provided in the 2021 SDR Methodology Report, available upon request from the SDR Survey Manager.

Data collection authority. The information collected in the SDR is solicited under the authority of the National Science Foundation Act of 1950, as amended, the America COMPETES Reauthorization Act of 2010, and the Confidential Information Protection and Statistical Efficiency Act of 2002. The Office of Management and Budget control number is 3145-0020 and expires on 31 July 2024.

Survey contractor. Westat, Rockville, MD.

Survey sponsor. The SDR is sponsored by NCSES with support from the National Institutes of Health.

Major changes to the recent cycle. In 2021, NCSES introduced two changes to the SDR survey. First, NCSES added new content to capture the effects of the coronavirus pandemic on the doctoral-trained SEH workforce. The new content was intended to measure impacts on salary, income, labor force status, and benefits. As a result of these changes, the set of questions in the historical salary and income series reflects some modifications, which should be considered in trend analysis using these variables. Please see the 2021 SDR Methodology Report for more details about the coronavirus-related questionnaire modifications.

Second, for the electronic modes of response, eligible sample members could respond to a targeted set of six employment items via a dependent interview approach. With dependent interviewing, the survey instrument displayed the unedited response from the 2019 cycle for the targeted survey questions and asked the sample member if that response was still correct as of the reference date (1 February 2021). If yes, the instrument moved to the next applicable survey question. If the sample member indicated the response was no longer correct as of the reference date, the instrument presented the traditional (nondependent interviewing) version of the same question for the respondent to answer. The paper version of the survey did not reflect dependent interviewing methods.
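The dependent interviewing flow described above can be sketched in a few lines. This is only an illustration of the branching logic, not the actual Web or CATI instrument code; the function name, the item name, and the `confirm` and `ask_fresh` callbacks are hypothetical.

```python
from typing import Callable, Optional


def ask_with_dependent_interviewing(
    item: str,
    prior_response: Optional[str],
    confirm: Callable[[str, str], bool],
    ask_fresh: Callable[[str], str],
) -> str:
    """Return the current-cycle answer for one targeted employment item."""
    # No usable prior-cycle response: fall back to the traditional question.
    if prior_response is None:
        return ask_fresh(item)
    # Display the unedited prior response and ask whether it is still correct
    # as of the reference date.
    if confirm(item, prior_response):
        return prior_response
    # Otherwise present the traditional (nondependent) version of the question.
    return ask_fresh(item)


# Hypothetical usage: the respondent says the 2019 job title no longer applies.
updated = ask_with_dependent_interviewing(
    "job_title",
    "Research chemist",
    confirm=lambda item, prior: False,
    ask_fresh=lambda item: "Laboratory director",
)
unchanged = ask_with_dependent_interviewing(
    "job_title",
    "Research chemist",
    confirm=lambda item, prior: True,
    ask_fresh=lambda item: "Laboratory director",
)
```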

Key Survey Information

Frequency. Biennial.

Initial survey year. 1973.

Reference period. The week of 1 February 2021.

Response unit. Individuals with an SEH research doctorate from a U.S. academic institution.

Sample or census. Sample.

Population size. Approximately 1,185,700 individuals; 1,023,600 residing in the United States and 162,100 residing outside the United States.

Sample size. 125,938 individuals.

Key variables.

  • Demographics (e.g., age, race, sex, ethnicity, and citizenship)
  • Educational history
  • Employment status
  • Field of degree
  • Occupation

Survey Design

Target population. The SDR target population includes individuals who meet the following criteria:

  • Earned an SEH research doctorate from a U.S. academic institution prior to 1 July 2019.
  • Were not institutionalized or terminally ill on 1 February 2021.
  • Were less than 76 years of age as of 1 February 2021.

Sampling frame. The SDR uses the DRF, constructed from the annual SED, as its sampling frame. Based on the information available in the DRF, individuals who did not meet the age criterion were dropped from the frame. For individuals who completed more than one SEH research doctorate, only the information on the first degree earned was used for sampling eligibility.

Sample design. The SDR uses a fixed panel design with a sample of new doctoral graduates added to the panel in each biennial survey cycle. For the 2021 SDR, all 2019 sampled members who remained age eligible were retained for the 2021 cycle. As with prior survey cycles, a sample of 10,000 new graduates who had earned their degrees from 1 July 2017 to 30 June 2019 was added. As with the 2017 and 2019 survey cycles, the stratification cells defined by detailed fields of study, gender, and underrepresented minority indicator were used to select the new graduate sample.

The resulting 2021 SDR sample of 125,938 cases consisted of 115,938 age-eligible cases from the 2019 SDR and 10,000 cases from the new cohort of graduates from academic years 2018 and 2019. The overall sampling rate was about 1 in 10 (10.6%), although sampling rates varied across strata.
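As a quick check, the sample composition and the overall sampling rate can be reproduced from the figures reported in these notes. The variable names are illustrative only, and the population figure is the approximate value given above.

```python
# Figures taken from the text; variable names are illustrative.
retained_from_2019 = 115_938   # age-eligible cases carried over from the 2019 SDR
new_graduates = 10_000         # new cohort, academic years 2018 and 2019
population = 1_185_700         # approximate target population size

sample_size = retained_from_2019 + new_graduates
sampling_rate = sample_size / population

print(sample_size)               # 125938
print(f"{sampling_rate:.1%}")    # 10.6%
```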

Data Collection and Processing Methods

Data collection. The data collection period for SDR has historically been 6 months, but NCSES decided to extend the 2021 data collection by one additional month, for a field period of 7 months in total. The SDR used a trimodal data collection approach: self-administered online survey (Web), self-administered paper questionnaire (via mail), and computer-assisted telephone interview (CATI). All individuals in the sample were started in the Web mode if a mail or e-mail address was available. After an initial survey invitation via postal mail and e-mail, the data collection protocol included sequential contacts by postal mail, telephone, and e-mail that ran throughout the data collection period. At any time during data collection, sample members could choose to complete the survey using any of the three modes. Nonrespondents to the initial survey invitation received follow-up with alternate survey modes.

Quality assurance procedures were in place at each data collection step (address updating, printing, package assembly and mailing, questionnaire receipt, data entry, coding, CATI, and post-data collection processing). Active data collection ended in February 2022. The online survey closed 28 February 2022, and receipt of hard-copy questionnaires ended on 2 March 2022.

Mode. Almost 99% of the participants completed the survey through the Web, 0.6% through mail, and 0.6% through CATI. Web participation increased from 93% in the 2019 cycle because of continued emphasis on Web-based participation in the starting phase of data collection.

Response rates. Response rates were calculated on complete responses, that is, from instruments with responses to all critical items. Critical items are those containing information needed to report labor force participation, including employment status, job title, and job description, as well as location of residency on the reference date. The overall unweighted response rate was 65%; the weighted response rate was also 65%. These response rates are about 3 percentage points lower than those achieved in the 2019 SDR.

Of the 125,938 persons in the 2021 SDR sample, 80,295 completed the survey. Among those who completed the survey, 71,213 respondents were residing in the United States on the survey reference date and contributed to the U.S. SEH doctoral population estimates. An additional 9,082 persons completed the survey, but they were residing outside of the United States on the survey reference date. This group contributed to the estimates of the internationally residing U.S.-trained SEH doctoral population.
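The counts above can be cross-checked with simple arithmetic. Note that the raw share of sample cases with a completed survey is not the same quantity as the reported 65% response rate; the official rate formula is documented in the Methodology Report and is not reproduced here. Variable names are illustrative.

```python
# Figures taken from the text; variable names are illustrative.
sample_size = 125_938
completes_us = 71_213       # residing in the United States on the reference date
completes_abroad = 9_082    # residing outside the United States

completes = completes_us + completes_abroad
share_abroad = completes_abroad / completes
raw_completion_share = completes / sample_size  # not the official response rate

print(completes)                        # 80295
print(f"{share_abroad:.1%}")            # 11.3%
print(f"{raw_completion_share:.1%}")    # 63.8%
```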

Data editing. The Web and CATI SDR instruments were combined into a single code base for 2021, reducing mode differences and facilitating harmonization. Mail questionnaire data were scanned, and data were captured via Optical Mark Recognition (OMR) and Optical Character Recognition (OCR). The OMR and OCR technology also applied editing controls that flagged unclear responses or responses that did not fit the expected response type (e.g., multiple responses in a select-one type question). Telephone callbacks were used to obtain additional information for incomplete mail responses. Responses from paper and electronic modes were merged into a single database and fully harmonized prior to the subsequent coding, editing, and cleaning needed to create an analytical database.

Following established NCSES guidelines for coding SDR survey data, including verbatim responses, staff were trained in conducting a standardized review and coding of occupation and education information, “other/specify” verbatim responses including verbatim items pertaining to the coronavirus modifications, state and country geographical information, and postsecondary institution information. For standardized coding of occupation, the respondent's reported job title, duties and responsibilities, and other work-related information from the questionnaire were reviewed by trained coders who corrected known respondent self-reporting errors to obtain the best occupation codes. The education code for the field of study of a newly earned degree or for the first bachelor's degree earned if not reported previously was assigned solely based on the verbatim response for that degree field.

Imputation. Item nonresponse for key employment items—such as employment status, sector of employment, and primary work activity—ranged from 0.0% to 1.8%. Nonresponse to questions about income was higher: nonresponse to salary was 10.5%, and nonresponse to earned income was 12.9%. Personal demographic data, such as sex, marital status, citizenship, ethnicity, and race, had variable item nonresponse rates, with sex at 0.0%, birth year at 0.2%, marital status at 8.0%, citizenship at 6.9%, ethnicity at 0.1%, and race at 0.5%. Item nonresponse was addressed using random imputation and hot-deck imputation methods.

Logical imputation often was accomplished as a part of editing. In the editing phase, the answer to a question with missing data was sometimes determined by the answer to another question. In some circumstances, editing procedures found inconsistent data that were blanked out and therefore subject to statistical imputation. During sample frame construction for the SDR, some missing demographic variables, such as race and ethnicity, were imputed before sample selection by using other existing information from the sampling frame. All sample members with imputed values for race or ethnicity were given the opportunity to report these data if they responded in the Web or CATI modes.

Respondents with missing race or ethnicity data who did not take the opportunity to report these data were assigned values for race or ethnicity through hot-deck procedures during post-data processing.

Most SDR variables were subjected to hot-deck imputation, with each variable having its own class and sort variables chosen by regression modeling to identify nearest neighbors for imputed information.

However, imputation was not performed on verbatim-based variables. For some variables, there was no set of class and sort variables that was reliably related to or suitable for predicting the missing value, such as day of birth. In these instances, random imputation was used, so that the distribution of imputed values was similar to the distribution of reported values without using class or sort variables.
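The two statistical imputation approaches can be illustrated with a minimal sketch. It assumes a single class variable and a single numeric sort variable and picks the nearest donor on the sort variable; the actual SDR procedures choose class and sort variables per item by regression modeling and are described in the Methodology Report. All field names and values below are made up.

```python
import random


def hot_deck_impute(records, target, class_var, sort_var):
    """Fill missing `target` values from the nearest donor in the same class."""
    imputed = [dict(r) for r in records]  # copy so the input is not modified
    for rec in imputed:
        if rec[target] is not None:
            continue
        donors = [d for d in records
                  if d[target] is not None and d[class_var] == rec[class_var]]
        if not donors:
            continue  # no donor in this class: leave the value missing
        nearest = min(donors, key=lambda d: abs(d[sort_var] - rec[sort_var]))
        rec[target] = nearest[target]
    return imputed


def random_impute(records, target, rng):
    """Fill missing `target` values by drawing at random from reported values."""
    reported = [r[target] for r in records if r[target] is not None]
    return [dict(r, **{target: rng.choice(reported)}) if r[target] is None
            else dict(r) for r in records]


# Hypothetical records: one missing salary and one missing day of birth.
records = [
    {"sector": "academia", "degree_year": 2005, "salary": 98000, "birth_day": 12},
    {"sector": "academia", "degree_year": 2016, "salary": 71000, "birth_day": None},
    {"sector": "academia", "degree_year": 2014, "salary": None, "birth_day": 3},
    {"sector": "industry", "degree_year": 2015, "salary": 130000, "birth_day": 27},
]
with_salary = hot_deck_impute(records, "salary", "sector", "degree_year")
with_birth_day = random_impute(records, "birth_day", random.Random(0))
```

In this example the missing salary is taken from the academia record with degree year 2016, the closest donor within the same class; the industry record is never considered.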

Weighting. Because the SDR is based on a complex sampling design and subject to nonresponse bias, sampling weights were created for each respondent to support unbiased population estimates. The final analysis weights account for the following:

  • Differential sampling rates
  • Adjustments for unknown eligibility
  • Adjustments for nonresponse among eligible sample members
  • Adjustments to align the sample distribution with the population distribution with respect to gender, race and ethnicity, degree year, degree field, U.S. citizenship status, post-graduation location, and birthplace.

The final sample weights enable data users to derive survey-based estimates of the SDR target population. The variable name on the SDR public use data files for the SDR final sample weight is WTSURVY.
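A weighted estimate is formed by summing WTSURVY over the respondents of interest. The sketch below assumes the data have already been loaded as a list of dictionaries; the `employed` field and all weight values are made up for illustration and are not SDR variable names or data.

```python
def weighted_total(rows, condition=lambda r: True, weight="WTSURVY"):
    """Estimated number of people in the population who satisfy `condition`."""
    return sum(r[weight] for r in rows if condition(r))


def weighted_proportion(rows, condition, weight="WTSURVY"):
    """Estimated share of the population who satisfy `condition`."""
    return weighted_total(rows, condition, weight) / weighted_total(rows, weight=weight)


# Hypothetical respondents with made-up weights.
rows = [
    {"WTSURVY": 12.5, "employed": True},
    {"WTSURVY": 8.0, "employed": False},
    {"WTSURVY": 20.0, "employed": True},
    {"WTSURVY": 9.5, "employed": False},
]
total = weighted_total(rows)                                    # 50.0
employed_share = weighted_proportion(rows, lambda r: r["employed"])  # 0.65
```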

Detailed information on weighting is contained in the 2021 SDR Methodology Report, available upon request from the SDR Survey Manager.

Variance estimation. The successive difference replication method (SDRM) was used to develop replicate weights for variance estimation. The theoretical basis for the SDRM is described in Wolter (1984) and in Fay and Train (1995). As with any replication method, successive difference replication involves constructing a number of subsamples (replicates) from the full sample and computing the statistic of interest for each replicate. The mean square error of the replicate estimates around their corresponding full sample estimate provides an estimate of the sampling variance of the statistic of interest. The 2021 SDR produced 104 sets of replicate weights. Please contact the SDR Survey Manager to obtain the SDR replicate weights and the replicate weight user guide.
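The replicate-based calculation can be sketched as follows. The multiplier 4/R is the form commonly used with successive difference replication; whether it applies unchanged to the 2021 SDR replicate weights is an assumption here and should be confirmed against the replicate weight user guide. The function name and the example values are illustrative.

```python
import math


def replicate_variance(full_estimate, replicate_estimates, factor=4.0):
    """Variance from replicate estimates: (factor / R) * sum of squared deviations.

    `factor=4.0` is the usual successive difference replication multiplier
    (an assumption for the SDR; check the replicate weight user guide).
    """
    r = len(replicate_estimates)
    squared_deviations = sum((e - full_estimate) ** 2 for e in replicate_estimates)
    return factor / r * squared_deviations


# Made-up example with 4 replicates (the 2021 SDR provides 104 sets).
full = 100.0
replicates = [98.0, 102.0, 101.0, 99.0]
variance = replicate_variance(full, replicates)   # 4/4 * (4 + 4 + 1 + 1) = 10.0
standard_error = math.sqrt(variance)
```

In practice, each replicate estimate is obtained by recomputing the statistic with one set of replicate weights in place of the full-sample weight.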

Disclosure protection. To protect against the disclosure of confidential information provided by SDR respondents, the estimates presented in SDR data tables are rounded to the nearest 50, although calculations of percentages are based on unrounded estimates.

Data table cell values based on counts of respondents that fall below a predetermined threshold are deemed to be sensitive to potential disclosure, and the letter “D” indicates this type of suppression in a table cell.
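A minimal sketch of the two disclosure rules follows. The respondent-count threshold is not published in these notes, so `min_respondents` is a hypothetical parameter, and rounding exact ties upward is also an assumption.

```python
import math


def round_to_50(estimate):
    """Round to the nearest multiple of 50 (ties rounded up; an assumption)."""
    return int(math.floor(estimate / 50 + 0.5)) * 50


def display_cell(estimate, respondent_count, min_respondents):
    """Return the table cell text: 'D' if suppressed, else the rounded estimate."""
    # `min_respondents` stands in for the unpublished predetermined threshold.
    if respondent_count < min_respondents:
        return "D"
    return f"{round_to_50(estimate):,}"


print(display_cell(1234.0, respondent_count=40, min_respondents=5))  # 1,250
print(display_cell(1234.0, respondent_count=3, min_respondents=5))   # D
```

Percentages would be computed from the unrounded estimates, with rounding applied only to the displayed counts.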

Survey Quality Measures

Sampling error. SDR estimates are subject to sampling errors. Estimates of sampling errors associated with this survey were calculated using replicate weights and are included in each table of estimates. Data table estimates with a coefficient of variation (that is, the standard error divided by the estimate) that exceeds a predetermined threshold are deemed unreliable and are suppressed. The letter “S” indicates this type of suppression in a table cell.
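The reliability check can be sketched as below, using the conventional definition of the coefficient of variation as the standard error divided by the estimate. The cutoff is not published in these notes, so `max_cv` is a hypothetical parameter, and the example values are made up.

```python
def coefficient_of_variation(estimate, standard_error):
    """Conventional CV: standard error relative to the size of the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return standard_error / abs(estimate)


def reliability_flag(estimate, standard_error, max_cv):
    """Return 'S' when the CV exceeds the (hypothetical) threshold `max_cv`."""
    cv = coefficient_of_variation(estimate, standard_error)
    return "S" if cv > max_cv else ""


cv_example = coefficient_of_variation(2000.0, 500.0)   # 0.25
kept = reliability_flag(2000.0, 500.0, max_cv=0.5)      # ""
suppressed = reliability_flag(1000.0, 600.0, max_cv=0.5)  # "S"
```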

Coverage error. Coverage error occurs in sample estimates when the sampling frame does not accurately represent the target population and is a type of nonsampling error. The initial SDR sampling frame is the DRF, which is derived from the SED, a census survey of research doctorates awarded annually in the United States. To the extent that the DRF does not include all awarded research doctorates, the SDR would suffer from undercoverage. Reporting errors in the SED could lead to incorrect classification of doctorates as having or not having earned an SEH research doctorate, which could result in either overcoverage or undercoverage.

Nonresponse error. The weighted and unweighted response rates for the 2021 SDR were each 65%. Results from the research and analysis of SDR nonresponse trends have been used in the development of the nonresponse weighting adjustments to minimize the potential for nonresponse bias in the SDR estimates. In addition, as noted above, most item nonresponse was addressed using hot-deck imputation methods and random imputation for a few items when applicable.

Measurement error. The SDR is subject to reporting errors from differences in interpretation of questions and by modality (Web, mail, and CATI).

Data Comparability and Changes

Data comparability. Year-to-year comparisons can be made among the 1993 to 2021 survey cycles because many of the core questions remained the same. Small but notable differences exist across some survey cycles, however, such as the collection of occupation data based on different versions of the occupation taxonomy. Also, due to variation in the month of the reference date in some survey cycles, seasonal differences may occur when making comparisons across cycles and decades. Thus, use caution when interpreting cross-cycle and cross-decade comparisons. In addition, the definition of the SDR survey target population has experienced the following changes over time:

  • Starting in the 2015 SDR, sample design improved population coverage to include all SEH doctorates awarded by U.S. institutions regardless of the academic year of award or the graduate’s post-graduation residency location.
  • In 2010 and 2013, coverage of SEH doctorates residing outside of the United States included those having graduated since 2001.
  • Surveys conducted prior to 2010 did not cover SEH doctorates residing outside of the United States.
  • From 1999 to 2008, estimates of industrial engineers were mislabeled as estimates of “Materials/metallurgical engineers.” For these years, data in this mislabeled category included only industrial engineers, and estimates of Materials/metallurgical engineers were included in the estimate of “Other engineers.”

Caution is recommended when considering any analysis of trends that span pre- and post-1991 surveys, pre- and post-2010 surveys, and pre- and post-2015 surveys because of the changes in the survey design and target population.

Overlap in sample cases across survey cycles allows for longitudinal analysis using SDR data. To link cases on the SDR public use data files across survey cycles, use the unique identification variable REFID.
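Linking two cycles on REFID amounts to a keyed join. The sketch below assumes each cycle's public use file has been loaded as a list of dictionaries; the REFID values and the `employed` field are made up for illustration.

```python
def link_cycles(earlier, later, key="REFID"):
    """Pair records that appear in both cycles, matched on the REFID variable."""
    later_by_id = {row[key]: row for row in later}
    return [(row, later_by_id[row[key]])
            for row in earlier if row[key] in later_by_id]


# Hypothetical extracts from two survey cycles.
cycle_2019 = [
    {"REFID": "A001", "employed": True},
    {"REFID": "A002", "employed": True},
    {"REFID": "A003", "employed": False},
]
cycle_2021 = [
    {"REFID": "A002", "employed": False},
    {"REFID": "A003", "employed": False},
    {"REFID": "A004", "employed": True},   # new to the panel in 2021
]

linked = link_cycles(cycle_2019, cycle_2021)
changed_status = [a["REFID"] for a, b in linked if a["employed"] != b["employed"]]
print(len(linked))       # 2
print(changed_status)    # ['A002']
```

Only cases present in both cycles are returned, so new cohort members and cases that aged out of the panel drop out of the linked set.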

Changes in survey coverage and population.

  • 2015. Beginning with the 2015 SDR and continuing with the 2017 and 2019 cycles, the SDR maintains a consistent target population that includes doctorate recipients residing outside the United States. The 2015 cycle introduced a fresh sample selected from the DRF and sampling strata defined by fine field of degree. Through these changes introduced in the 2015 SDR survey cycle, the 2015 sample represents all U.S.-trained doctorate holders with a first SEH degree regardless of their citizenship or plans to leave the United States upon graduation, which were eligibility delimiters in past cycles of the SDR. To analyze U.S.-residing cases only, use the variable FNINUS, which indicates living or working in the United States on the survey reference date.

Changes in data processing.

  • 2019. Updates to improve the accuracy of post-collection processing resulted in shifts to two estimates. Specifically, as a result of an update to an edit, the estimate of the proportion of the population employed on the reference day in both the current cycle and in the prior cycle (WRKGP) increased relative to 2017 and 2015. In 2019, the edit for missing responses to this item was updated to evaluate current cycle working status as well as refer to the working status reported in the prior cycle. Previously, the edits did not refer to prior cycle response data. As a result of modification to an item-specific imputation approach, the distribution of changes in employer and type of job (EMSMI) between the 2019 cycle and the previous cycle shifted for those working in both cycles. The modification removed a constraint that limited the eligible donor pool and resulted in differences in the distribution between non-imputed and imputed responses. The modified imputation approach applied in 2019 increased the similarity between the imputed response distribution and the non-imputed responses.

Changes in questionnaire.

  • 2021. The 2021 survey included two significant questionnaire changes. First, the 2021 SDR reflects modifications to some questions and the addition of new questions in order to collect information on how the coronavirus may have affected salary, income, labor force status, and benefits. Each of the COVID-19-related changes made to the 2021 SDR was also included in NCSES’ sister survey, the NSCG. The second change introduced in 2021 reflected a change in survey methodology. For the electronic modes of response, eligible sample members could respond to a targeted set of six employment items via a dependent interview approach. With dependent interviewing, the survey instrument displayed the unedited response from the 2019 cycle for the targeted survey questions and asked the sample member if that response was still correct as of the reference date (1 February 2021). If yes, the instrument moved to the next applicable survey question. If the sample member indicated the response was no longer correct as of the reference date, the instrument presented the traditional (nondependent interviewing) version of the same question for the respondent to answer.
  • 2019. The 2019 questionnaire eliminated the question that asked respondents to provide their preferred mode of response. This question reflected an operational rather than analytic purpose. However, prior research showed that once respondents complete the survey online they are more likely to complete online in the future, regardless of stated preference. Similarly, respondents given the Web-start mode are more likely to complete on Web, regardless of past mode of completion.
  • 2017. The 2017 questionnaire changed the order of responses 9 and 10 to questionnaire item A13 (type of principal employer). Response 9 is “in a non-U.S. government at any level,” and response 10 is “Other—Specify type of employer”; these were in the reverse order in the 2015 questionnaire. For questionnaire item E9, “Were you a non-U.S. citizen…,” all 2017 survey forms included a third response option, “Who no longer held a U.S. Resident Visa.” The second response option in questionnaire item E18 (the future survey mode preference questions) was changed to “An online questionnaire” from “A web questionnaire on the Internet.”
  • 2015. The 2015 questionnaire differed from the 2013 questionnaire by adding “National Aeronautics and Space Administration (NASA)” as response category 6 to questionnaire item A43 (Federal agencies or departments supporting your work). “National Science Foundation (NSF)” became response category 7, “Other” became response category 8, and “Don’t know source agency” became response category 9. In addition, a new questionnaire item was added (E12) that included three questions to help verify information about the individual’s doctorate: (1) the institution granting the doctorate, (2) the field of study of the doctorate, and (3) the month and year it was granted.
  • 2013. The 2013 questionnaire differed from the 2010 questionnaire by splitting the first response category for the indicator of sample member location on the survey reference date into two categories. “United States, Puerto Rico, or another U.S. territory” became “United States or Puerto Rico” and “Another U.S. territory.”
  • 2010. The 2010 questionnaire differed from the 2008 questionnaire as follows. The module questions on respondents’ second jobs, patents, and publications were dropped. At the same time, the SDR reinstated from previous rounds’ questionnaires a module on enrollment and course taking at a college or university and also questionnaire items on components of job satisfaction, whether employer is a new business, importance of job benefits, membership in professional associations, attendance at professional conferences, and federal agencies supporting research work. Three new questionnaire items were added: year of tenure, year of retirement, and degree of difficulty concentrating, remembering, or making decisions.
  • 2008. The 2008 questionnaire included a module that gathered information on individuals’ second jobs, as well as two sets of questions reinstated from the 2003 questionnaire: (1) questions measuring technical expertise required for the respondent’s and the respondent’s spouse’s primary job, and (2) questions measuring respondent’s research productivity (authorships or co-authorships of papers, articles, books, or monographs; number and type of patents earned). The 2006 modules on postdoctoral appointments and international collaboration were not included.
  • 2006. The 2006 questionnaire included a module on the history of postdoctoral appointments, awarded primarily for gaining additional education and training in research, as a follow-up to a similar module included in the 1995 SDR, in addition to a new module on international collaboration among doctorate recipients.
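
The dependent interviewing flow described in the 2021 entry above can be sketched in a few lines. This is a minimal illustration, not the actual Web or CATI instrument logic: the function name, the `confirm` and `ask_fresh` callbacks, and the handling of a missing prior response are all assumptions made for the example.

```python
from typing import Callable, Optional


def dependent_interview_item(
    prior_response: Optional[str],
    confirm: Callable[[str], bool],
    ask_fresh: Callable[[], str],
) -> str:
    """Return the current-cycle answer for one targeted employment item.

    prior_response: unedited answer from the previous cycle, if any.
    confirm: hypothetical callback that shows the prior answer and returns
        True if the sample member says it is still correct as of the
        reference date.
    ask_fresh: hypothetical callback that asks the traditional
        (nondependent) question and returns the new answer.
    """
    # Assumption: with no prior-cycle answer, only the traditional
    # question can be asked.
    if prior_response is None:
        return ask_fresh()
    # Prior answer confirmed: keep it and move to the next question.
    if confirm(prior_response):
        return prior_response
    # Prior answer no longer correct: fall back to the traditional question.
    return ask_fresh()


# Example: the sample member confirms the prior-cycle answer.
answer = dependent_interview_item(
    "College or university",
    confirm=lambda previous: True,
    ask_fresh=lambda: "Private, for-profit company",
)
```

The point of the design is that a confirmed answer costs the respondent one yes/no question instead of the full item, which is where the reduction in burden comes from.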

Changes in reporting procedures or classification.

  • 2021. In 2021, Taxonomy of Geographic Areas (TOGA) codes were added as new variables for all geographic items, though the variables reflecting the traditional SESTAT (Scientists and Engineers Statistical Data System) geographic codes remain as well.
  • 2017. The 2017 survey microdata include both the former SDR field of study aggregations and the 77 new field of study aggregations based on the NCSES Taxonomy of Disciplines (TOD). The TOD has a few minor differences in broader field aggregations compared with the traditional taxonomy used in past data tables.
  • 2015. Data tables reporting at the SED fine field of degree level have been added. Data tables that report on the non-U.S. residing population have been added consistent with the updated sample design that provides full coverage of the non-U.S. residing population.
  • 2010. Due to the inclusion and exclusion of certain module questions in the 2010 questionnaire compared to the 2008 questionnaire, there are some differences in 2010 data table availability compared with 2008.
  • 2003. Data on employed doctorate recipients were further classified to include a new category for science and engineering (S&E)-related occupations. This category includes health-related occupations, S&E managers, S&E precollege teachers, and S&E technicians and technologists.
  • 2002 and prior. Data on employed doctorate recipients were classified into two categories: employment in an S&E occupation, and employment in a non-S&E occupation.

Definitions

Employer location. Survey question A9 includes the location of the principal employer, and data were based primarily on responses to this question. Individuals not reporting place of employment were classified by their last mailing address.

Field of doctorate. The doctoral field is as specified by the respondent in the SED at the time of degree conferral. The more than 200 SED coded fields were subsequently recoded to the 77 field-of-study codes used in the SDR questionnaire. (See table A-1 for a list and cross-classification of the 77 SDR detailed fields of degree based on the TOD with over 200 fine fields of degree reported in the SED sampling frame.)

Full-time and part-time employment. Full-time (working 35 hours or more per week) and part-time (working less than 35 hours per week) employment status is for the principal job only and not for all jobs held in the labor force. For example, an individual could work part time in his or her principal job but full time in the labor force. Full-time and part-time employment status is not comparable to data reported before 2006, when no distinction was made between the principal job and the other jobs held by the individual.
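
A short sketch of this rule, assuming hours are available per job; the function and constant names are hypothetical and are not taken from the SDR data files.

```python
FULL_TIME_THRESHOLD_HOURS = 35  # threshold stated in the definition above


def principal_job_status(principal_job_hours: float) -> str:
    """Classify employment status from hours worked in the principal job only."""
    if principal_job_hours >= FULL_TIME_THRESHOLD_HOURS:
        return "full-time"
    return "part-time"


# Example: 30 hours in the principal job plus 10 hours in a second job.
principal_hours, second_job_hours = 30, 10
status = principal_job_status(principal_hours)   # classified on principal job
total_hours = principal_hours + second_job_hours  # 40 hours across all jobs
```

In this example the individual is classified as part-time even though the combined hours across jobs reach 40, which is the situation the definition warns about.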

Involuntarily out-of-field rate. Involuntarily out-of-field rate is the percentage of employed individuals who reported, for their principal job, working in an area not related to the first doctoral degree at least partially because a job in their doctoral field was not available.
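
As an illustration, the rate can be computed from record-level flags. The record layout, field names, and the optional `weight` field are assumptions for this sketch; the published estimates depend on the survey's actual variables and weights.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class EmployedRespondent:
    # True if the principal job is related to the first doctoral degree.
    job_related_to_doctorate: bool
    # True if "no job available in doctoral field" was at least one reason.
    no_job_available_in_field: bool
    # Hypothetical survey weight; 1.0 gives a simple unweighted count.
    weight: float = 1.0


def involuntarily_out_of_field_rate(
    respondents: Iterable[EmployedRespondent],
) -> float:
    """Percentage of employed individuals who are involuntarily out of field."""
    respondents = list(respondents)
    total = sum(r.weight for r in respondents)
    if total <= 0:
        raise ValueError("need at least one employed respondent")
    out_of_field = sum(
        r.weight
        for r in respondents
        if not r.job_related_to_doctorate and r.no_job_available_in_field
    )
    return 100.0 * out_of_field / total


sample = [
    EmployedRespondent(True, False),
    EmployedRespondent(True, False),
    EmployedRespondent(False, False),  # out of field, but by choice
    EmployedRespondent(False, True),   # involuntarily out of field
]
rate = involuntarily_out_of_field_rate(sample)
```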

Labor-force participation rate. The labor-force participation rate is the ratio (E + U) / P, where E (employed) + U (unemployed; not employed and actively seeking work) = the total labor force, and P = population, defined as all noninstitutionalized SEH doctorate holders who earned their doctorate from a U.S. institution and were less than 76 years of age during the week of 1 February 2021.
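
A worked example of the ratio, using made-up counts rather than SDR estimates; the function name is hypothetical.

```python
def labor_force_participation_rate(
    employed: float, unemployed: float, population: float
) -> float:
    """Return (E + U) / P as a proportion."""
    if population <= 0:
        raise ValueError("population must be positive")
    if employed + unemployed > population:
        raise ValueError("labor force cannot exceed the population")
    return (employed + unemployed) / population


# Illustrative counts only: E = 850, U = 15, P = 1,000.
lfpr = labor_force_participation_rate(850, 15, 1000)
```

With these counts the labor force is 865 of 1,000 people, so the participation rate is 0.865, or 86.5%.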

Occupation data. The occupational classification of the respondent was based on his or her principal job (including job title) held during the reference week—or on his or her last job held, if not employed in the reference week (survey questions A5 and A6 as well as A19 and A20). Also used in the occupational classification was a respondent-selected job code (survey questions A7 and A21). (See table A-2 for a list and classification of occupations reported in the SDR.)

Race and ethnicity. Ethnicity is defined as Hispanic or Latino or not Hispanic or Latino. Values for those selecting a single race include American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. Those persons who report more than one race and who are not of Hispanic or Latino ethnicity also have a separate value. Race and ethnicity data are from the SED and prior rounds of the SDR. The most recently reported race and ethnicity data are given precedence.

Salary. Median annual salaries are reported for the principal job, rounded to the nearest $1,000, and computed for full-time employed scientists and engineers. For individuals employed by educational institutions, no accommodation was made to convert academic year salaries to calendar year salaries. Users are advised that, due to changes in the salary question after 1993, salary data for 1995–2019 are not strictly comparable with 1993 salary data. In 2021, changes to the salary series allowed sample members to identify increases or decreases in their salary due to the coronavirus pandemic. Although the core salary question did not change, additional items were added that may influence how sample members responded to the salary item. Similar changes were implemented in the earnings series. Please see the 2021 SDR Methodology Report for more details regarding these changes.
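
The computation can be sketched as follows. This is an unweighted illustration with made-up values; the record layout, the reuse of the 35-hour full-time threshold, and the round-half-up rule are assumptions, and the published medians may be computed differently (for example, with survey weights).

```python
from statistics import median
from typing import Iterable, Tuple

FULL_TIME_THRESHOLD_HOURS = 35


def median_full_time_salary(records: Iterable[Tuple[float, float]]) -> int:
    """Median principal-job salary of full-time workers, to the nearest $1,000.

    Each record is (annual_salary, principal_job_hours_per_week).
    """
    salaries = [
        salary
        for salary, hours in records
        if hours >= FULL_TIME_THRESHOLD_HOURS
    ]
    if not salaries:
        raise ValueError("no full-time records")
    # Round half up to the nearest 1,000 (assumed rounding rule).
    return int((median(salaries) + 500) // 1000) * 1000


records = [
    (98_000, 40),
    (104_500, 45),
    (121_300, 38),
    (60_000, 20),  # part-time in the principal job, so excluded
]
result = median_full_time_salary(records)
```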

Sector of employment. Employment sector is a derived variable based on responses to questionnaire items A13, A14, and A15. Questionnaire item A13 (type of principal employer) includes a separate response “In a non-U.S. government at any level” as of the 2015 survey. In the data tables, the category 4-year educational institutions includes 4-year colleges or universities, medical schools (including university-affiliated hospitals or medical centers), and university-affiliated research institutes. Other educational institutions includes 2-year colleges, community colleges, technical institutes, precollege institutions, and other educational institutions (which respondents reported verbatim in the survey questionnaire). Users should note that prior to 2008 these other educational institutions that were written as verbatim by respondents were grouped with 4-year educational institutions rather than with 2-year colleges. Private, for-profit includes respondents who were self-employed in an incorporated business. Self-employed includes respondents who were self-employed or were a business owner in a non-incorporated business.
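
The groupings named in this definition can be written as a lookup table. This sketch covers only the categories spelled out above; the string labels are hypothetical and do not correspond to actual response codes for items A13, A14, and A15.

```python
FOUR_YEAR = "4-year educational institutions"
OTHER_EDUCATIONAL = "Other educational institutions"
VERBATIM_OTHER = "other educational institution (verbatim)"

# Hypothetical employer-type labels mapped to data table categories.
SECTOR_BY_EMPLOYER_TYPE = {
    "4-year college or university": FOUR_YEAR,
    "medical school": FOUR_YEAR,
    "university-affiliated research institute": FOUR_YEAR,
    "2-year college": OTHER_EDUCATIONAL,
    "community college": OTHER_EDUCATIONAL,
    "technical institute": OTHER_EDUCATIONAL,
    "precollege institution": OTHER_EDUCATIONAL,
    VERBATIM_OTHER: OTHER_EDUCATIONAL,
    "self-employed, incorporated business": "Private, for-profit",
    "self-employed, non-incorporated business": "Self-employed",
}


def employment_sector(employer_type: str, survey_year: int = 2021) -> str:
    """Return the data table category for an employer type."""
    # Before 2008, verbatim "other educational" responses were grouped
    # with 4-year educational institutions.
    if employer_type == VERBATIM_OTHER and survey_year < 2008:
        return FOUR_YEAR
    return SECTOR_BY_EMPLOYER_TYPE[employer_type]
```

Passing the survey year makes the pre-2008 grouping explicit, which matters when comparing sector counts across cycles.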

Unemployment rate. The unemployment rate (RU) is the ratio U / (E + U), where U = unemployed (not employed and actively seeking work), and E (employed) + U = the total labor force.
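
The sketch below computes the ratio from a list of labor force statuses. The status labels and counts are made up for illustration; the key point is that individuals who are not in the labor force are left out of the denominator.

```python
from collections import Counter
from typing import Iterable


def unemployment_rate(statuses: Iterable[str]) -> float:
    """Return U / (E + U) as a proportion."""
    counts = Counter(statuses)
    employed = counts["employed"]
    unemployed = counts["unemployed"]
    labor_force = employed + unemployed
    if labor_force == 0:
        raise ValueError("no one in the labor force")
    return unemployed / labor_force


# Illustrative data: 97 employed, 3 unemployed, 20 not in the labor force.
statuses = (
    ["employed"] * 97 + ["unemployed"] * 3 + ["not in labor force"] * 20
)
rate = unemployment_rate(statuses)
```

Here the rate is 3 / 100 = 0.03, not 3 / 120, because the 20 individuals outside the labor force are excluded.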

References

Fay RE, Train GF. 1995. Aspects of survey and model-based postcensal estimation of income and poverty characteristics for states and counties. American Statistical Association Proceedings of the Section on Government Statistics 154–59.

Wolter K. 1984. An investigation of some estimators of variance for systematic sampling. Journal of the American Statistical Association 79(388):781–90.

Data

Doctoral scientists and engineers

Employed doctoral scientists and engineers

Occupations of doctoral scientists and engineers

Median annual salaries of full-time employed doctoral scientists and engineers

This report presents data from the 2021 Survey of Doctorate Recipients (SDR). The SDR is a biennial survey that collects longitudinal data on demographic and general employment characteristics of individuals who have received a research doctorate in a science, engineering, or health (SEH) field from a U.S. academic institution. Starting shortly after they receive their doctorate, sampled individuals are eligible for inclusion in the survey until they reach age 76. The SDR sample is augmented each cycle with new samples of the most recent cohorts of SEH doctorate recipients, identified by the Survey of Earned Doctorates, an annual census of research doctorates awarded in the United States. The 2021 questionnaire included new content to capture the effects of the coronavirus pandemic on U.S.-trained SEH doctorate holders.

The National Center for Science and Engineering Statistics within the National Science Foundation is the primary sponsor of the SDR, with additional funding provided by the National Institutes of Health.

The published tables provide information on doctoral scientists and engineers by field of doctorate and occupation; by demographic characteristics, such as sex, race, ethnicity, citizenship, and age; and by employment-related characteristics, such as sector of employment, median annual salary, and labor-force rates.

Data corrections

11 August 2023: In the Survey of Doctorate Recipients 2021 data tables reporting field of doctorate, the incorrect label “Industrial engineers” was corrected to “Metallurgical and materials engineering.”

The following tables have been corrected:

Table 1-1

Table 1-2

Table 2

Table 4-1

Table 4-2

Table 4-3

Table 4-4

Table 5

Table 6

Table 7

Table 8

Table 9

Table 10

Table 11-1

Table 11-2

Table 12-1

Table 12-2

Table 12-3

Table 15-1

Table 15-2

Table 15-3

Table 15-4

Table 17

Table 20

Table 48

Table 49

Table 50

Table 51

Table 52

Table 53

Table 54

Table 57-1

Table 57-2

Table 59

Table 62

Table 75


Acknowledgments

Flora Lan of the National Center for Science and Engineering Statistics (NCSES) developed and coordinated this report under the leadership of Emilda B. Rivers, NCSES Director; Vipin Arora, former NCSES Deputy Director; John Finamore, NCSES Chief Statistician; and Gary Anderson, Acting NCSES Program Director. Wan-Ying Chang (NCSES) reviewed the report.

Under contract with NCSES, the Westat statistical team led by Shelley Brock compiled the tables in this report.

NCSES thanks the doctorate recipients for their generous time and effort in contributing to the information included in this report.

Suggested Citation

National Center for Science and Engineering Statistics (NCSES). 2023. Survey of Doctorate Recipients, 2021. NSF 23-319. Alexandria, VA: National Science Foundation. Available at https://ncses.nsf.gov/pubs/nsf23319.

Analysis

Survey Contact

For additional information about this survey or the methodology, contact

Lynn Milan
Survey Manager
Phone: (703) 292-2275
E-mail: lmilan@nsf.gov
Address: National Center for Science and Engineering Statistics, 2415 Eisenhower Avenue, Suite W14200, Alexandria, VA 22314