Current and Alternative Sources of Data on the Science and Engineering Workforce
SESTAT is a database created to enable study of scientists and engineers in the United States. The SESTAT database for the 1990s was designed in response to recommendations of a panel of the National Research Council's Committee on National Statistics. The panel's analyses and recommendations for the data system were presented in the 1989 report Surveying the Nation's Scientists and Engineers: A Data System for the 1990s. The report includes the following statement from the panel:
We strongly urge that the NSF personnel data system for the 1990s strive to provide information that will permit users to apply their own definitions of the science and engineering population to suit their particular research and analysis purposes within a framework that facilitates cross-comparison with other widely used data sources. Specifically, we believe that the system should support analysis of the science and engineering community from each of the two major perspectives… from the perspective of occupational employment or jobs and from the perspective of academic training or careers. (Citro and Kalton 1989:55–56)
The SESTAT target population includes individuals who, as of the survey reference period, had the following characteristics:
The degree fields considered to be S&E include computer and mathematical sciences, life sciences, physical sciences, social sciences (including psychology), and engineering. Occupational categories considered to be S&E include computer and mathematical scientists, life scientists, physical scientists, social scientists (including psychologists), and engineers.
The SESTAT database is created by integrating the NSCG, the NSRCG, and the SDR. The 1990 decennial census long form sample was used as a screening device for obtaining a sample of scientists and engineers for the NSCG. The NSCG was first administered in 1993 to a nationally representative sample of all college degree holders who were identified through the 1990 decennial census. Because information on degree field was not available on the census long form, the sample included individuals in the United States with a bachelor's degree or higher in any field, not just in science or engineering, as of April 1990. The sample members of the 1993 NSCG who had S&E degrees in April 1990 and/or who were working in S&E occupations in 1993 constituted the NSCG S&E panel, which was followed in subsequent rounds of the NSCG in 1995, 1997, and 1999.
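The panel membership rule described above (an S&E degree in April 1990 and/or an S&E occupation in 1993) can be sketched as a simple predicate. This is only an illustration: the class name, field names, and boolean encoding are hypothetical and are not taken from the NSCG data files.

```python
from dataclasses import dataclass


@dataclass
class NscgRespondent:
    # Hypothetical flags; actual NSCG variables are not described in this report.
    has_se_degree_april_1990: bool
    in_se_occupation_1993: bool


def in_se_panel(r: NscgRespondent) -> bool:
    """Either condition alone is enough ("and/or" in the text)."""
    return r.has_se_degree_april_1990 or r.in_se_occupation_1993


respondents = [
    NscgRespondent(True, False),   # S&E degree, non-S&E job: in the panel
    NscgRespondent(False, True),   # non-S&E degree, S&E job: in the panel
    NscgRespondent(False, False),  # neither: not followed after 1993
]
panel = [r for r in respondents if in_se_panel(r)]
print(len(panel), "of", len(respondents), "respondents enter the S&E panel")
```

Because membership is fixed by status in 1990 and 1993, someone in the third group who later moves into an S&E occupation is never picked up by this rule.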
The NSRCG has been administered biennially since the early 1970s to recent U.S. bachelor's and master's degree recipients in S&E disciplines. The NSRCG employs a two-stage sample design. First, a sample of institutions that grant S&E degrees is selected and asked to provide lists of their relevant graduates from the 2 academic years before the survey (the 1993 NSRCG was an exception: it covered graduates from spring 1990 as well as from the 1991 and 1992 academic years). Second, samples of graduates with bachelor's and master's degrees in S&E fields are selected from these lists for inclusion in the NSRCG. The graduates sampled in each round of the NSRCG are eligible to be sampled for, and then become part of, the NSCG in the next round of SESTAT.
The SDR has been sponsored by NSF, with some financial contributions from other federal agencies, since the early 1970s. This survey follows a sample of holders of S&E doctorates earned at U.S. institutions throughout their careers, from the year the doctorate is awarded through age 75. Every 2 years, a sample of new S&E doctorate recipients is added to the SDR from another NSF-sponsored survey, the Survey of Earned Doctorates (SED), which is an annual census of all recipients of research doctorates from U.S. institutions. The overall sample size is maintained by subsampling the older cohorts to make room for the sample of new graduates.
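The idea of holding the overall sample size fixed while adding a new cohort can be sketched as follows. This is a minimal illustration, not the SDR procedure: it assumes simple random subsampling of all existing panel members with equal retention probability, and the function name, arguments, and weight adjustment are hypothetical.

```python
import random


def refresh_panel(existing_ids, new_ids, target_size, seed=0):
    """Return (panel, weight_factor) after adding new_ids at a fixed total size.

    Assumption: existing members are retained by simple random sampling.
    weight_factor is the inverse retention rate by which retained members'
    weights would be multiplied so they still represent the older cohorts.
    """
    existing_ids = list(existing_ids)
    new_ids = list(new_ids)
    n_keep = target_size - len(new_ids)
    if n_keep < 0:
        raise ValueError("new cohort alone exceeds the target sample size")
    n_keep = min(n_keep, len(existing_ids))
    rng = random.Random(seed)
    kept = rng.sample(existing_ids, n_keep)
    weight_factor = len(existing_ids) / n_keep if n_keep > 0 else None
    return kept + new_ids, weight_factor


# 100 continuing members, 20 new doctorate recipients, total held at 100.
panel, factor = refresh_panel(range(100), range(100, 120), target_size=100)
print(len(panel), factor)
```

The cost of this approach is visible in the weight factor: each retained member of the older cohorts stands in for more people, so estimates for those cohorts become less precise over time.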
The current SESTAT target population excludes individuals who do not hold bachelor's or higher degrees but are currently working in science or engineering fields. There is, however, a growing interest in such individuals, particularly in fields such as information technology. It may therefore be desirable to expand the target population to include individuals without bachelor's degrees who are working in at least some science or engineering fields.
In addition to possible extension of the definition of the SESTAT population, a second issue relates to coverage gaps with the current definition. The SESTAT system misses some individuals with S&E degrees and many individuals with non-S&E degrees who are working in S&E occupations at the time of a given SESTAT round. For example, the system does not include foreign citizens who received only non-U.S. degrees and entered the country after the 1990 decennial census, because an adequate sampling frame for such individuals has not been identified. Also, U.S. citizens who did not have at least a bachelor's degree at the time of the 1990 decennial census, but who subsequently obtained an S&E degree from abroad, are missed. In addition, those individuals with only non-science or non-engineering degrees at the time of the census who were working in non-science or non-engineering occupations or not working at the time of the 1993 NSCG, but who subsequently entered a science or engineering occupation, are not covered in SESTAT. Furthermore, SESTAT does not cover graduates receiving their first bachelor's degrees in non-S&E fields after April 1990 who entered S&E occupations.
A third issue is the timeliness of the data, that is, the elapsed time between the reference date of each survey and the date on which the survey data are released. In recent years, this interval has been reduced by about 10%. Still, some SESTAT data users feel that it remains too long. Timeliness is not addressed directly here; however, improvements in frames and/or interagency cooperation may lead to further improvements in timeliness.
Many users of S&E personnel data focus on particular subgroups of the S&E population, e.g., the employed or the unemployed, or women and minorities in specific fields. Because the SESTAT surveys are required to produce estimates for various subgroups of interest at specified precision levels, certain subgroups must be oversampled to meet the specified reliability criteria. The following brief description of selected characteristics of the S&E workforce gives an indication of the sizes of various subgroups of interest.
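The link between subgroup size and oversampling can be made concrete with a standard approximation. Under simple random sampling, and ignoring design effects and finite population corrections (the SESTAT surveys use more complex designs), the coefficient of variation of an estimated proportion p based on n cases is roughly sqrt((1 - p) / (n p)). The sketch below inverts that relationship. The function name, the subgroup shares, and the 5% target CV are illustrative assumptions, not values from this report.

```python
import math


def required_sample_size(p: float, target_cv: float) -> int:
    """Approximate n needed so that sqrt((1 - p) / (n * p)) <= target_cv.

    Assumes simple random sampling with no design effect.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if target_cv <= 0.0:
        raise ValueError("target_cv must be positive")
    return math.ceil((1.0 - p) / (p * target_cv ** 2))


# Hypothetical subgroup shares, from common to rare.
for share in (0.25, 0.05, 0.005):
    n = required_sample_size(share, target_cv=0.05)
    print(f"subgroup share {share:.1%}: about {n} sampled cases needed")
```

The required n grows roughly in inverse proportion to the subgroup share, which is why a proportionally allocated sample yields too few cases for rare subgroups and oversampling is needed.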
The total number of employed scientists and engineers in the United States in 1997 was 10.6 million according to the SESTAT integrated database. Of these, the vast majority (10.1 million) held at least one 4-year degree in a science or engineering field. Only about 30% (3.1 million) of the 10.1 million S&E degree holders in the workforce were employed in S&E occupations. Almost 57% of the individuals employed in S&E jobs reported their highest degree type as a bachelor's degree, whereas 29% listed a master's degree and 14% a doctorate.
The private for-profit sector is by far the largest employer of members of the SESTAT population. In 1997, 73% of scientists and engineers in the workforce whose highest degree was a bachelor's degree, and 60% of those whose highest degree was a master's degree, were employed by a private for-profit company. These individuals were not necessarily employed as scientists or engineers. For those with doctorates, the academic sector was the largest sector of employment (49%).
Although women made up close to half (46%) of the U.S. labor force in 1997, they accounted for only slightly more than one-fifth (23%) of the S&E labor force. With the exception of Asians and Pacific Islanders, minorities made up a much smaller proportion of scientists and engineers in the United States than they did of the total U.S. population. Asians and Pacific Islanders were 10% of scientists and engineers in the United States in 1997, although they were only 4% of the total U.S. population. Blacks made up 12%, Hispanics 11%, and American Indians and Alaskan Natives 1% of the U.S. population in 1997, whereas Blacks and Hispanics each made up only about 3%, and American Indians and Alaskan Natives about 0.5%, of scientists and engineers.
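One way to compare these figures is a representation ratio: a group's share of scientists and engineers divided by its share of the total U.S. population, with values below 1 indicating underrepresentation. The ratio is an illustrative summary introduced here, not a measure used in this report, and because several of the shares are quoted as "about" values, the results are approximate.

```python
# Shares in percent, copied from the 1997 figures quoted in the text.
population_share = {
    "Asians and Pacific Islanders": 4.0,
    "Blacks": 12.0,
    "Hispanics": 11.0,
    "American Indians and Alaskan Natives": 1.0,
}
se_share = {
    "Asians and Pacific Islanders": 10.0,
    "Blacks": 3.0,        # "about 3%" in the text
    "Hispanics": 3.0,     # "about 3%" in the text
    "American Indians and Alaskan Natives": 0.5,  # "about 0.5%" in the text
}


def representation_ratio(group: str) -> float:
    """Share among scientists and engineers relative to population share."""
    return se_share[group] / population_share[group]


for group in population_share:
    print(f"{group}: {representation_ratio(group):.2f}")
```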
A draft version of the material presented in the main body of this report was a key topic of discussion at an expert panel meeting about SESTAT sampling design, which was held at NSF on 5 December 2000. The agenda of the expert panel meeting appears in appendix A, and a list of meeting participants is provided in appendix B. Appendix C contains a brief summary of the meeting and the recommendations made by the panel. Appendix D lists desired and acceptable coefficients of variation (CVs) for S&E workforce estimates. Appendix E contains a discussion of the characteristics of some establishment data collections.
Note that the measured number of scientists and engineers depends on the definition used. At the time of this redesign research, 1997 data were the latest available. More recent data are now available at the NSF/SRS website, http://www.nsf.gov/statistics/, which also presents counts of scientists and engineers under various definitions.