Characteristics of Recent Science and Engineering Graduates: 2006
Appendix A. Technical Notes
These technical notes on the 2006 National Survey of Recent College Graduates (NSRCG) cover sampling and weighting, survey methodology, and sampling and nonsampling errors, and discuss comparisons of the data with previous NSRCG cycles and with Integrated Postsecondary Education Data System (IPEDS) data. For a more detailed discussion of survey methodology, refer to the 2006 NSRCG methodology report (available upon request).
The NSRCG is sponsored by NSF's Division of Science Resources Statistics (SRS). NSRCG is one of three SRS data collections covering personnel and graduates in science, engineering, and health (SEH) fields. The other two surveys are the National Survey of College Graduates (NSCG) and the Survey of Doctorate Recipients (SDR). Together, they constitute NSF's Scientists and Engineers Statistical Data System (SESTAT). These surveys serve as the basis for developing estimates and characteristics of the total population of scientists and engineers in the United States.
The NSRCG has been conducted every 2 to 3 years for the NSF since 1974. The NSRCG data are used to understand the early employment experiences of recent graduates, such as the extent to which they entered the labor force, whether they were able to find employment, the attributes of that employment, and whether they pursued continuing education.
Target Population and Sample Design
The 2006 NSRCG target population consisted of individuals with the following characteristics:
The NSRCG used a two-stage sample design. In the first stage, a stratified nationally representative sample of 300 educational institutions was selected. Each sampled institution was asked to provide lists of graduates for sampling. In the second stage, the graduates with bachelor's or master's degrees in science, engineering, and health fields were identified and included in the sampling frame. Within graduation year (cohort), each eligible graduate was then classified into 1 of 756 sampling strata based on the cross-classification of the following variables:
The sampling rates by stratum were applied within each eligible responding institution and resulted in sampling 27,000 graduates (9,000 from each academic year): 19,550 bachelor's and 7,450 master's degree recipients.
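As a rough illustration of how stratum-specific sampling rates produce a sample from a frame of graduates, the sketch below draws a stratified sample in Python. The strata, rates, and frame are hypothetical; the actual NSRCG used 756 strata with rates set to meet precision targets.

```python
import random

def sample_stratified(frame, rates, seed=2006):
    """Draw a stratified sample: each graduate record carries a 'stratum'
    key, and each stratum has its own sampling rate (hypothetical values)."""
    rng = random.Random(seed)
    sample = []
    for record in frame:
        # A graduate is selected with the probability assigned to its stratum.
        if rng.random() < rates[record["stratum"]]:
            sample.append(record)
    return sample

# Hypothetical frame: two strata sampled at different rates.
frame = [{"id": i, "stratum": "A" if i % 2 else "B"} for i in range(10000)]
rates = {"A": 0.10, "B": 0.30}  # illustrative rates, not the NSRCG values
sample = sample_stratified(frame, rates)
```

Under this design, a graduate's base weight is simply the inverse of his or her stratum's sampling rate.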
Data Collection and Response Rates
The 2006 NSRCG was performed in two stages: a graduate list collection, conducted by Mathematica Policy Research, Inc., under contract to NSF, and a graduate survey collection, conducted by the U.S. Bureau of the Census under interagency agreement with NSF.
For the first stage of graduate list collection, the sampled institutions were asked to provide lists of their graduates for academic years 2002–05. Of the 300 sampled institutions, 295 provided lists of graduates for sampling in the 2006 NSRCG and 5 chose not to provide graduate lists. The graduate list collection had a 98.7% unweighted response rate and a 97.2% weighted response rate.
For the second stage survey of graduates, data collection consisted of two phases: a self-administered mail survey, followed by computer-assisted telephone interviewing (CATI) of mail nonrespondents. Extensive efforts were undertaken to locate the graduates. The overall unweighted graduate response rate was 68.2%; the weighted response rate was 69.0%.
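The distinction between the unweighted and weighted response rates reported above can be sketched as follows: the unweighted rate is the share of sampled cases that responded, while the weighted rate weights each case by its sampling weight. The case data below are hypothetical.

```python
def response_rates(cases):
    """Unweighted and weighted response rates for a list of sampled cases.
    Each case carries a sampling weight and a responded flag (hypothetical
    data); confirmed-ineligible cases would be removed before this step."""
    n_resp = sum(1 for c in cases if c["responded"])
    unweighted = n_resp / len(cases)
    total_wt = sum(c["weight"] for c in cases)
    resp_wt = sum(c["weight"] for c in cases if c["responded"])
    weighted = resp_wt / total_wt
    return unweighted, weighted

# Four hypothetical cases; higher-weight cases pull the weighted rate up.
cases = [
    {"weight": 50.0, "responded": True},
    {"weight": 50.0, "responded": False},
    {"weight": 150.0, "responded": True},
    {"weight": 150.0, "responded": True},
]
unw, w = response_rates(cases)  # 0.75 unweighted, 0.875 weighted
```

The two rates differ whenever response propensity varies across strata with different weights, as it did for the NSRCG (68.2% unweighted versus 69.0% weighted).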
Survey response rates varied somewhat by the characteristics of the graduates. Rates were lowest for graduates identified on the school sampling lists as non-U.S. citizens. It is possible that many of the non-U.S. citizens in the sample who were not located were actually ineligible for the survey because they returned to their home countries after receiving their degrees in the United States. Sample persons were classified as ineligible only if their status could be confirmed.
Data Editing and Coding
Data preparation for the 2006 NSRCG consisted of clerical pre-key editing and keying of mail questionnaires and coding operations. The computer editing of mail questionnaire data was performed on a regular basis during data collection to identify cases with missing critical items (work status, job code, resident status, and birthdate). Telephone callbacks were made to obtain responses to these missing critical items; cases for which the critical items could not be obtained were treated as nonresponse. Overall, about 4% of mail respondents required telephone follow-up for responses to missing critical items.
The coding operation involved special coding of occupation and education codes, other-specify coding, state and country coding, and IPEDS coding. For special coding of occupation, the respondent's occupational data were reviewed along with other work-related data from the questionnaire by specially trained coders, who corrected known respondent self-reporting problems to obtain the best occupation codes. The education code for a newly earned degree was assigned strictly based on the degree-field verbatim.
Imputation of Missing Data
Item nonresponse rates were generally less than 3%. Nonresponse to a few questions deemed somewhat sensitive, such as annual salary, was around 5.4%. Item nonresponse was imputed using logical and hot deck imputation methods.
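The core idea of hot deck imputation mentioned above can be sketched briefly: a missing value is replaced with a reported value from a "donor" respondent in the same imputation class. The field names and class variables below are hypothetical; the production procedure also applied logical imputation and more elaborate deck construction.

```python
import random

def hot_deck_impute(records, field, class_vars, seed=0):
    """Replace missing values of `field` with the value of a randomly
    chosen donor from the same imputation class (defined by class_vars)."""
    rng = random.Random(seed)
    # Group donor values (cases with a reported value) by imputation class.
    donors = {}
    for r in records:
        if r[field] is not None:
            key = tuple(r[v] for v in class_vars)
            donors.setdefault(key, []).append(r[field])
    for r in records:
        if r[field] is None:
            key = tuple(r[v] for v in class_vars)
            if donors.get(key):  # impute only when a donor exists
                r[field] = rng.choice(donors[key])
    return records

# Hypothetical records: the third case has a missing salary to be imputed.
people = [
    {"degree": "BS", "field_of_study": "eng", "salary": 52000},
    {"degree": "BS", "field_of_study": "eng", "salary": 48000},
    {"degree": "BS", "field_of_study": "eng", "salary": None},
]
hot_deck_impute(people, "salary", ["degree", "field_of_study"])
```

Because the donor comes from the same class (here, same degree level and field of study), the imputed value preserves the within-class distribution better than a simple overall mean would.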
Weighting
To enable weighted analysis of the 2006 NSRCG data, a sample weight was calculated for every person in the sample. The weighting procedures adjusted for unequal selection probabilities, for nonresponse at the institution and graduate levels, and for de-duplication of the same graduates selected multiple times on the sampling file. In addition, a ratio adjustment was made at the institution level, using the number of degrees awarded as reported in IPEDS, for specified categories of major and degree level. Because this adjustment was designed to reduce the variability associated with sampling institutions, it was not affected by the differences in target populations between NSRCG and IPEDS at the person level. These differences between NSRCG and IPEDS are discussed in "Comparisons with IPEDS Data," below.
The final adjustment to the graduate weights adjusted for those responding graduates who could have been sampled twice. For example, a person who obtained an eligible bachelor's degree in academic year 2003 could have obtained an eligible master's degree in 2005 and could have been sampled for both degrees. Two types of weights were developed for the 2006 NSRCG: full NSRCG sample weights for use in computing survey estimates and replicate weights for variance estimation using a jackknife replication variance estimation procedure.
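The chain of adjustments described in the two paragraphs above can be sketched as a single weight calculation. All factor values below are hypothetical; they stand in for the stratum selection probability, the institution- and graduate-level nonresponse adjustments, the IPEDS ratio adjustment, and the multiplicity (de-duplication) adjustment.

```python
def final_weight(p_select, resp_rate_inst, resp_rate_grad, ratio_adj, n_selections):
    """Sketch of the weighting steps (hypothetical factor values):
    base weight = inverse selection probability, then nonresponse
    adjustments, an IPEDS ratio adjustment, and division by the number
    of times the graduate could have been sampled."""
    base = 1.0 / p_select
    w = base / resp_rate_inst / resp_rate_grad  # nonresponse adjustments
    w *= ratio_adj                              # ratio adjustment to IPEDS counts
    return w / n_selections                     # multiplicity (de-duplication)

# A graduate who could have been sampled for both an eligible bachelor's
# and an eligible master's degree has n_selections = 2.
w = final_weight(p_select=0.02, resp_rate_inst=0.97, resp_rate_grad=0.69,
                 ratio_adj=1.05, n_selections=2)
```

The multiplicity adjustment ensures that a graduate with two chances of selection does not contribute twice to the population totals.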
Reliability of Estimates
The survey estimates provided in these tables are subject to two sources of error: sampling and nonsampling errors. Sampling errors occur because the estimates are based on a sample of individuals in the population rather than on the entire population and hence are subject to sampling variability. If the interviews had been conducted with a different sample, the responses would not have been identical; some estimates might have been higher, and others might have been lower.
Sampling error is measured by the variance of the survey estimate or by its square root, the standard error. The variances on the survey estimates were computed using a technique known as jackknife replication. As with any replication method, jackknife replication involves constructing a number of subsamples (replicates) from the full sample and computing the statistics of interest for each replicate. The mean square error of the replicate estimates around their corresponding full-sample estimate provides an estimate of the sampling variance of the statistic of interest. The replicate weights file is available for direct calculation of standard errors on survey estimates.
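The replication idea above reduces to a short calculation once the replicate estimates are in hand. The sketch below uses a delete-one jackknife scaling factor of (R − 1)/R for R replicates; the actual factor depends on the replication scheme encoded in the NSRCG replicate weights, and the estimates shown are hypothetical.

```python
def jackknife_variance(full_estimate, replicate_estimates):
    """Jackknife variance sketch: scaled sum of squared deviations of the
    replicate estimates around the full-sample estimate. The (R - 1) / R
    factor shown is for a simple delete-one jackknife with R replicates."""
    r = len(replicate_estimates)
    factor = (r - 1) / r
    return factor * sum((est - full_estimate) ** 2 for est in replicate_estimates)

# Hypothetical full-sample estimate (a proportion) and four replicate estimates.
full = 0.50
reps = [0.48, 0.52, 0.49, 0.51]
var = jackknife_variance(full, reps)
se = var ** 0.5  # the standard error is the square root of the variance
```

In practice each replicate estimate is computed by re-running the full estimation with one set of replicate weights, so the variance reflects every stage of weighting.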
In addition to sampling errors, the survey estimates are subject to nonsampling errors that can arise from survey nonresponse, coverage errors, reporting errors, and data-processing errors. These errors can sometimes bias the data. The 2006 NSRCG included procedures specifically designed to minimize nonsampling errors. In addition, a nonresponse bias study conducted on the 2003 NSRCG data provided some measures of nonsampling errors that are useful in understanding the data from the current survey as well.
Procedures to minimize nonsampling errors were followed throughout the survey development. Extensive questionnaire redesign work incorporated results of efforts to reduce reporting errors using cognitive interviews, expert panel reviews, and mail pretests. This questionnaire design work was done in conjunction with the other two SESTAT surveys.
Comprehensive training and monitoring of data-processing staff and telephone interviewers helped to ensure the consistency and accuracy of the data. Nonresponse was handled in ways designed to minimize the impact on data quality (through weighting adjustments and imputation). In data preparation, a special effort was made in the area of occupational coding. Respondent-chosen codes were verified by specially trained coding staff using a variety of information collected on the survey and applying coding rules developed by NSF for the SESTAT surveys.
Comparing the 2006 NSRCG to Previous Cycles
It is important to exercise caution when making comparisons with previous NSRCG results. During the 1993 cycle, the SESTAT surveys, including the NSRCG, underwent considerable revision in several areas. These areas include survey eligibility, data collection procedures, questionnaire content and wording, and data coding and editing procedures. The changes made for the 1995 through 2006 cycles were less significant but might affect some trend data analysis. Although the 1993 through 2006 survey data are fairly comparable, care must be taken when comparing results from the 1990s surveys to surveys from the 1980s, due to significant changes made in 1993. For a detailed discussion of these changes, refer to the 1993 NSRCG methodology report (available upon request).
In all years except 2006, data were collected on graduates with bachelor's and master's degrees in the two academic years immediately preceding the survey year. In 2006, however, data were collected from graduates in three academic years. Beginning with the 2003 survey, data were collected and reported on graduates with bachelor's and master's degrees in health fields as well as science and engineering fields.
In years prior to 2003, data on employed graduates were presented only in two categories: by employment in science and engineering (S&E) occupations and by employment in non-S&E occupations. Beginning in 2003, a third category of S&E-related occupations was added. S&E-related occupations include health occupations, S&E managers, S&E precollege teachers, S&E technicians and technologists, and other S&E-related occupations, such as architects and actuaries.
Overall estimates from the 2003 and 2006 NSRCG cannot be directly compared to the 2001 or earlier NSRCG results unless respondents with health degrees are excluded from the 2003 and 2006 data in the comparisons.
Comparisons with IPEDS Data
The National Center for Education Statistics (NCES) conducts a set of surveys of the nation's postsecondary institutions, called the Integrated Postsecondary Education Data System (IPEDS). One of these, the IPEDS Completions Survey, reports the number of degrees awarded by all major fields of study, along with estimates by sex and race/ethnicity.
Although both the first stage of the NSRCG and the IPEDS Completions Survey collect similar degree completion data from postsecondary institutions, important differences in the target populations for the two surveys directly affect estimates of the number of graduates. The reason for the different target populations is that the goals of the surveys are not the same. The IPEDS estimates of degrees awarded are intended to measure the output of the educational system. The NSRCG estimates are intended to measure the supply and utilization of a portion of graduates in the years after they completed their degrees. These differing goals result in definitions of the target population that are not completely consistent for the two surveys.
The main differences between the two surveys that affect comparisons of estimates overall and by race/ethnicity are as follows:
NSRCG and IPEDS estimates are consistent, however, when appropriate adjustments for these differences are made. For example, the proportional distributions of graduates by field of study are nearly identical, and the numerical estimates are similar. More information on the comparison of NSRCG and IPEDS estimates is available in A Comparison of Estimates in the NSRCG and IPEDS (available upon request).
The 2006 NSRCG maintained the questionnaire design changes that were implemented in 2003 (for the 2006 survey questionnaire, see appendix C). The questionnaire comprises a large set of core data items that are retained in each survey round to enable trend comparisons. Each survey year, different sets of module questions on special topics of interest are included. The 2003 NSRCG questionnaire had a module on type of academic position, faculty rank and tenure of those working in the education sector, patent and publication activities, satisfaction with and importance of various job attributes, and immigration-related questions for foreign-born graduates. These module questions were dropped in 2006 and new questions were added as follows:
Definitions
Full-time salary: The annual median salary for the full-time employed, defined as those who were not self-employed (either incorporated or not incorporated), who worked at least 35 hours per week on their principal job, and who were not full-time students during the survey reference week.
Labor force: Includes individuals working full or part time as well as those not working but seeking work or on layoff. It is a sum of the employed and the unemployed.
Major field of study: Derived from the field of degree as specified by the respondent and classified into the SESTAT education codes (see appendix D, tables D-1 and D-3).
Occupation: Derived from responses to several questions on the type of work primarily performed by the respondent. The occupational classification into the SESTAT occupation codes was based on respondent's principal job held during the survey reference week of 1 April 2006—or last job held, if not employed in the reference week (see appendix D, table D-2).
Primary work activity: The activity that occupied the most time on the respondent's job. In reporting the data, those who reported applied research, basic research, development, or design work were grouped together in "research and development (R&D)." Those who reported accounting, finance or contracts, employee relations, quality or productivity management, sales and marketing, or managing and supervising were grouped into "management, sales, administration." Those who reported production, operations, maintenance, professional services, or other activities were grouped into "other."
Race/ethnicity: All graduates, both U.S. citizens and non-U.S. citizens, are included in the race/ethnicity data presented in this report. American Indian/Alaska Native, Asian, black, Native Hawaiian/Other Pacific Islander, white, and persons reporting more than one race refer to non-Hispanic individuals only.
Type of employer: The sector of employment in which the respondent was working on his or her primary job held during the survey reference week. Private industry and business includes all private for-profit and private not-for-profit companies, businesses, and organizations, except those reported as educational institutions. It also includes persons reporting that they were self-employed. Educational institutions includes elementary and secondary schools, 2-year and 4-year colleges and universities, medical schools, university-affiliated research organizations, and all other educational institutions. Government includes local, state, and federal government, military, and commissioned corps.
Unemployed: The unemployed are those who were not working during the survey reference week and were seeking work or were on layoff from a job.
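The labor force and unemployment definitions above amount to a simple classification rule per respondent, sketched below with hypothetical flag names drawn from the definitions rather than the actual questionnaire variables.

```python
def labor_force_status(working, seeking_work, on_layoff):
    """Classify a respondent per the definitions above: employed if working
    during the reference week; unemployed if not working but seeking work
    or on layoff; otherwise not in the labor force."""
    if working:
        return "employed"
    if seeking_work or on_layoff:
        return "unemployed"
    return "not in labor force"

# Three hypothetical respondents; the labor force comprises the first two.
statuses = [labor_force_status(*flags) for flags in
            [(True, False, False), (False, True, False), (False, False, False)]]
```

Summing the employed and unemployed counts then gives the labor force total used in the tables.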
Coverage of Tables
The tables in this report present information for two groups of recent graduates. The even-numbered tables are for those who earned bachelor's degrees in SEH fields from U.S. institutions during academic years 2003, 2004, and 2005. The odd-numbered tables are for those who earned SEH master's degrees during the same 3 years. Standard error tables are presented as a separate set and are included in appendix B.
Dajani A, Maples J. 2005. NSF/RCG Nonresponse Bias Analysis: Part I. U.S. Bureau of the Census.