Comparison of the National Science Foundation's Scientists and Engineers Statistical Data System (SESTAT) with the Bureau of Labor Statistics' Current Population Survey (CPS)
The Division of Science Resources Statistics (SRS) of the National Science Foundation (NSF) initiated this report in response to requests from data users for information on scientists and engineers in the United States who do not have a bachelor's or higher degree. Before deciding whether to conduct a survey to collect this information, SRS reviewed existing data sources. During the review, the Current Population Survey (CPS) was identified as the most current data source with adequate coverage of the population of interest. CPS is a monthly labor force survey conducted by the U.S. Census Bureau and sponsored by the Bureau of Labor Statistics (BLS). This report provides the basis for understanding how CPS data may be used to meet the needs of data users seeking information on the science and engineering (S&E) workforce without a bachelor's or higher degree. Such data will complement the information on the S&E workforce with education at the bachelor's level and higher that is provided by NSF's Scientists and Engineers Statistical Data System (SESTAT).
SESTAT is a data system that includes the employment, educational, and demographic characteristics of a sample of scientists and engineers in the United States. SESTAT, which NSF maintains to provide data for policy analysis and general research, is usually updated every 2 years. SESTAT's definition of scientists and engineers is restricted to individuals age 75 or younger with a bachelor's or higher degree living in the United States. It includes two groups: (1) individuals with a bachelor's or higher degree in S&E and (2) individuals with a bachelor's or higher non-S&E degree who are working in S&E occupations.
CPS provides an alternative source of information about scientists and engineers. CPS can identify individuals with degrees by degree level; however, it does not collect data on field of degree and therefore cannot distinguish between S&E and non-S&E degrees. Because CPS collects data on occupation, it can identify individuals working in S&E occupations. Thus, CPS and SESTAT can both provide estimates of individuals with at least a bachelor's degree who are working in S&E occupations.
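The coverage logic described above can be sketched in a few lines of Python. This is only an illustration of the definitions as stated in the text, not code used by either survey system. The `Person` record, the degree-level labels, and the function names are hypothetical, and U.S. residency and noninstitutional status are assumed to hold rather than modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative labels for degree levels at the bachelor's level or higher.
BACHELORS_OR_HIGHER = {"bachelor", "master", "doctorate", "professional"}


@dataclass
class Person:
    degree_level: str          # e.g. "none", "associate", "bachelor"
    se_degree: Optional[bool]  # field of degree; CPS does not collect this
    se_occupation: bool        # working in an S&E occupation
    age: int


def has_bachelors_or_higher(p: Person) -> bool:
    return p.degree_level in BACHELORS_OR_HIGHER


def in_sestat_scope(p: Person) -> bool:
    """SESTAT definition as described in the text: age 75 or younger,
    bachelor's or higher, and either an S&E degree or an S&E occupation."""
    if not has_bachelors_or_higher(p) or p.age > 75:
        return False
    return bool(p.se_degree) or p.se_occupation


def cps_identifies_as_se(p: Person) -> bool:
    """CPS can flag S&E workers only through occupation, at any degree level."""
    return p.se_occupation


def in_comparable_group(p: Person) -> bool:
    """Group both systems can estimate: bachelor's or higher, S&E occupation."""
    return in_sestat_scope(p) and cps_identifies_as_se(p)


# Three hypothetical people illustrating the three cases.
engineer = Person("bachelor", True, True, 40)     # covered by both systems
chemist_in_sales = Person("master", True, False, 35)  # SESTAT only
technician = Person("associate", None, True, 29)  # CPS only

for name, p in [("engineer", engineer),
                ("chemist_in_sales", chemist_in_sales),
                ("technician", technician)]:
    print(name, in_sestat_scope(p), cps_identifies_as_se(p),
          in_comparable_group(p))
```

The third case, an S&E worker without a bachelor's degree, is the group that only CPS can describe and that motivates this report.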
Before endorsing the use of CPS data for estimates of the S&E workforce without a bachelor's or higher degree, SRS wanted to investigate the comparability of CPS data to SESTAT data where the coverage in the two survey systems overlaps. Therefore, the first purpose of this report is to compare SESTAT and CPS estimates of the S&E workforce with a bachelor's or higher degree and to try to account for any differences observed. Such differences may be attributed to the different coverage of the two survey systems, conceptual differences in the definitions used, nonresponse, and response effects. The comparisons between SESTAT and CPS estimates are presented in the section "Coverage Issues." The second purpose of this report is to provide estimates of the numbers of individuals without a bachelor's or higher degree who are working in S&E occupations. The results for this group are presented in the section "Comparison of Estimates." As background, the SESTAT and CPS designs are briefly reviewed in the sections "Overview of SESTAT Design" and "Overview of CPS Design."
Overview of SESTAT Design
The SESTAT target population includes individuals living in the United States who have a bachelor's or higher degree and were either educated in S&E or are working in an S&E occupation, with the exception of those individuals who are either institutionalized or age 76 and older. The broad degree and occupation categories considered as S&E include computer and mathematical science, life science, physical science, social science (including psychology), and engineering (see Kannankutty and Wilkinson 1999 for more information about the definition of S&E degree fields and S&E occupations).
The SESTAT data system is derived from three distinct survey components: the National Survey of College Graduates (NSCG), the National Survey of Recent College Graduates (NSRCG), and the Survey of Doctorate Recipients (SDR), which are explained below.
Overview of CPS Design
CPS is a monthly survey of about 50,000 households. It is based on a stratified, multistage area probability sample design and is the primary source of information on the labor force characteristics of the U.S. civilian noninstitutional population. In addition to information about employment status, earnings, hours of work, and other labor force characteristics, CPS collects educational attainment data and a variety of demographic characteristics such as age, sex, race/ethnicity, and marital status. Data are also available by occupation, industry, and class of worker. Since the inception of the survey, various changes have occurred in the design of the CPS sample. The survey is traditionally redesigned after each decennial census. The current sample design, introduced in January 1996, includes about 59,000 households from 754 sample areas. The number of eligible households in any given month is typically about 50,000; of these, about 93% respond to the survey. Data are generally collected for about 120,000 individuals of all ages from the responding households each month.
CPS uses a 4-8-4 rotation scheme in which each sampled household is interviewed for 4 consecutive months, dropped from the sample for the next 8 months, and then brought back into the sample for the following 4 months. A feature of the rotation scheme is that in any given month, about one-eighth of the households are in the sample for the first time and one-eighth are returning after their 8-month resting period. The remaining households have been in the sample for 2 or more consecutive months. Thus, the household sample has roughly a 75% month-to-month overlap. Although accumulating the monthly CPS samples will increase the total sample size, the gains are limited because of the substantial overlap resulting from the 4-8-4 rotation scheme. The number of unique households in the CPS sample in a year is about three times the size of a typical monthly sample.
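The overlap figures quoted above follow directly from the rotation pattern, as the following minimal Python sketch shows. It rests on simplifying assumptions that are not part of the CPS documentation: one new rotation group of equal size enters each month, groups are identified by their entry month, and there is no attrition or nonresponse. The function names are hypothetical.

```python
def groups_in_sample(month: int) -> set:
    """Return the ids of the rotation groups interviewed in a given month.

    A group's id is the month it first enters the sample. Under 4-8-4 it is
    interviewed in months id..id+3, rests for 8 months, and returns in
    months id+12..id+15.
    """
    first_stint = range(month - 3, month + 1)      # 1st to 4th interview
    second_stint = range(month - 15, month - 11)   # 5th to 8th interview
    return set(first_stint) | set(second_stint)


def month_to_month_overlap(month: int) -> float:
    """Share of this month's groups that are also interviewed next month."""
    current = groups_in_sample(month)
    following = groups_in_sample(month + 1)
    return len(current & following) / len(current)


def unique_groups_ratio(start_month: int, n_months: int = 12) -> float:
    """Unique groups seen over n_months, relative to one monthly sample."""
    seen = set()
    for m in range(start_month, start_month + n_months):
        seen |= groups_in_sample(m)
    return len(seen) / len(groups_in_sample(start_month))


groups_per_month = len(groups_in_sample(0))
overlap = month_to_month_overlap(0)
yearly_ratio = unique_groups_ratio(0)

print(groups_per_month)  # 8 rotation groups in any month
print(overlap)           # 0.75
print(yearly_ratio)      # 3.375
```

Under these assumptions, 6 of the 8 groups carry over from one month to the next, and 27 distinct groups appear over 12 months, which is consistent with the statement that a year of CPS data contains about three times as many unique households as a single month.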
A summary of key differences between the SESTAT and CPS designs is presented in the section "Summary and Conclusions."
Organization of the Report
The remainder of this report is organized as follows:
For the purposes of this report, the S&E workforce is defined as people working in SESTAT S&E occupations. SESTAT defines S&E occupations as computer and mathematical scientists, life scientists, physical scientists, social scientists (including psychologists), and engineers.