Technical Notes[1]

General

The 1998 occupational employment estimates at the national, state, and Metropolitan Statistical Area (MSA) level were based on data from the 1996, 1997, and 1998 Occupational Employment Statistics (OES) Surveys.[2] The OES survey is a Federal-state cooperative program that provides the states support to collect data for their own surveys so they can produce estimates of specific occupational employment by industry within their MSAs. The Bureau of Labor Statistics (BLS) provided the states with survey procedures, technical guidance, a sample for each MSA, systems for survey estimation, and troubleshooting assistance. State Employment Security Agencies (SESAs) from all fifty states, plus the District of Columbia, Puerto Rico, the Virgin Islands, and Guam participated in this survey. Occupational employment estimates were produced by BLS-Washington using employment data from these participants. State-level estimates can be obtained from the individual SESAs.

Scope of the Survey

The survey covered private establishments in SIC codes 07, 10, 12–17, 20–42, 44–65, 67, 70, 72, 73, 75, 76, 78–84, 86, 87, and 89. The survey also covered private and government establishments in SIC codes 806, 821, 822, 824, and 829. Additionally, the survey covered state and local government establishments (excluding hospitals and education). Furthermore, a census was taken of Federal Government establishments, including postal workers, in the 1998 survey.

The reference date of the survey was the week that included October 12, November 12, or December 12. The pay period including the 12th day of the reference month is standard for Federal agencies collecting employment data. The reference date for any particular establishment in this survey was dependent on its two-digit SIC code. See the table below.


Reference Date    Industries Surveyed (two-digit SIC codes)

October 12        07, 15-17, 41, 46, 50-62, 67, 70, 73, 79, 84
November 12       26-28, 30, 35, 36, 40, 42, 45, 47, 48, 63-65, 75, 76, 78, 80, 81, 83, 86, 87, 89
December 12       10, 12-14, 20-25, 29, 31-34, 37-39, 44, 49, 72, 82, and state and local governments
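The reference-date table can be expressed as a simple lookup. The following is a minimal sketch, not part of the OES processing systems; the names REFERENCE_MONTHS, expand, and reference_month are hypothetical, and state and local governments (which the table assigns to December) are left out because they are not identified here by a two-digit SIC code.

```python
# Reference month by two-digit SIC code, transcribed from the table above.
REFERENCE_MONTHS = {
    "October": "07, 15-17, 41, 46, 50-62, 67, 70, 73, 79, 84",
    "November": "26-28, 30, 35, 36, 40, 42, 45, 47, 48, 63-65, "
                "75, 76, 78, 80, 81, 83, 86, 87, 89",
    "December": "10, 12-14, 20-25, 29, 31-34, 37-39, 44, 49, 72, 82",
}


def expand(spec):
    """Expand a string such as '07, 15-17' into the set {7, 15, 16, 17}."""
    codes = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            codes.update(range(int(low), int(high) + 1))
        else:
            codes.add(int(part))
    return codes


# Invert the table: two-digit SIC code -> reference month.
SIC_TO_MONTH = {
    code: month
    for month, spec in REFERENCE_MONTHS.items()
    for code in expand(spec)
}


def reference_month(sic2):
    """Return the reference month for a two-digit SIC code, or None if not listed."""
    return SIC_TO_MONTH.get(int(sic2))
```

For example, reference_month("36") returns "November", and a code that does not appear in the table returns None.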

Method of Collection

Survey schedules were initially mailed to virtually all sampled establishments. Personal visits, however, were made to some of the larger establishments.

Two additional mailings were sent to nonresponding establishments at approximately six-week intervals. Telephone followups and, in some cases, personal visits were made to nonrespondents considered critical to the survey because of their size.

Sampling Procedures

The sampling frame for this survey was the list of establishments that reported to the state Unemployment Insurance (U.I.) files for the two-digit SICs listed above. Each quarter, the list from each state is compiled into a single file at BLS, called the Universe Database (UDB). For the 1996 survey, the sample frame was the UDB file from the second quarter of 1995; for the 1997 survey, it was from the third quarter of 1996; and for the 1998 survey, it was from the second quarter of 1997. These frames were supplemented with a list supplying establishment information on Railroads (SIC 401).

A census is taken of Federal Government establishments each year. Data representing Federal Government employment and wages are obtained at the end of the survey process from the Federal Government's Office of Personnel Management.

Establishments in the universe were stratified by state, MSA, three-digit SIC, and size of firm (i.e., size class).

U.I. reporting establishments with 1–4 employees were sampled for the first time in 1998. Prior to 1998, establishments with 5–9 employees were assigned larger weights to account for these "size class 1" establishments. Establishments in higher size classes are sampled with virtual certainty across the three-year cycle of the survey: within each MSA/SIC/size class, approximately one third of these units were selected for the 1996 sample, another third for the 1997 sample, and the final third for the 1998 sample.
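The three-year rotation can be illustrated with a short sketch. This is only an illustration under stated assumptions: the text does not say how units were allocated to years within a cell, so the random shuffle below, along with the function name assign_panels and the tuple layout of each unit, is hypothetical.

```python
import random
from collections import defaultdict


def assign_panels(units, years=(1996, 1997, 1998), seed=0):
    """Split the units in each MSA/SIC/size-class cell into roughly equal yearly panels.

    units: iterable of (unit_id, msa, sic, size_class) tuples.
    Returns a dict mapping unit_id -> sample year.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for unit_id, msa, sic, size_class in units:
        cells[(msa, sic, size_class)].append(unit_id)

    assignment = {}
    for unit_ids in cells.values():
        rng.shuffle(unit_ids)  # assumed allocation rule: random order within the cell
        for position, unit_id in enumerate(unit_ids):
            # Deal units out in turn so each year gets about one third.
            assignment[unit_id] = years[position % len(years)]
    return assignment
```

With nine units in a single cell, each of the three years receives three units.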

Response

Of the 380,833 eligible units from the 1996 sample, usable responses were obtained from 276,989, producing a response rate of 72.7 percent based on units. Of the 383,861 eligible units from the 1997 sample, usable responses were obtained from 301,671, producing a response rate of 78.6 percent based on units. Of the 363,267 eligible units from the 1998 sample, usable responses were obtained from 284,159, producing a response rate of 78.2 percent based on units.
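The unit response rates quoted above follow directly from the counts. The short check below reproduces them; the variable and function names are illustrative only.

```python
# (eligible units, usable responses) for each sample year, from the text above.
samples = {
    1996: (380_833, 276_989),
    1997: (383_861, 301_671),
    1998: (363_267, 284_159),
}


def response_rate(eligible, usable):
    """Unit response rate, in percent."""
    return 100.0 * usable / eligible


rates = {year: round(response_rate(*counts), 1) for year, counts in samples.items()}
print(rates)
```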

Estimation

Sample Weights

Each sampled establishment was assigned an original sampling weight, the reciprocal of the establishment's probability of selection (i.e., its design weight) within its sampled year.

Weights were modified for each in-scope establishment in a cell by dividing the establishment's design weight by a factor indicating the number of years for which sample units were selected from that sampling cell. This weight was used in the calculation of the 1998 estimates based on combining data from the 1996, 1997, and 1998 surveys.
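The two weighting steps can be written out as a small sketch. It assumes only what the text states: the design weight is the reciprocal of the selection probability, and the combined weight divides it by the number of years in which units were sampled from the cell. The function names are hypothetical.

```python
def design_weight(selection_probability):
    """Original sampling weight: reciprocal of the probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def combined_weight(selection_probability, years_sampled_in_cell):
    """Weight used when pooling the 1996, 1997, and 1998 samples."""
    if years_sampled_in_cell < 1:
        raise ValueError("cell must have been sampled in at least one year")
    return design_weight(selection_probability) / years_sampled_in_cell
```

For instance, a unit selected with probability 0.25 in a cell sampled in all three years has a design weight of 4 and a combined weight of 4/3.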


Nonresponding establishments are accounted for in the OES survey by an imputation process. The staffing pattern is imputed using a hot-deck "nearest-neighbor" imputation method. This method searches the responding establishments within a defined cell and finds the responding establishment that most closely matches the nonresponding establishment for key classification values (Area/SIC/Size Class). The staffing pattern, or employment distribution, of the responding establishment is used as the staffing pattern of the nonresponding establishment.
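A minimal sketch of nearest-neighbor hot-deck imputation follows. The text does not give the actual matching rule, so the distance function here (prefer the same area, then the same SIC, then the closest size class) is an assumption made for illustration, as are the Establishment class and the function names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Establishment:
    area: str
    sic: str                          # three-digit SIC code
    size_class: int
    staffing: Optional[dict] = None   # occupation code -> employment share; None if no response


def distance(a, b):
    """Hypothetical closeness measure on Area/SIC/Size Class (smaller is closer)."""
    return (a.area != b.area, a.sic != b.sic, abs(a.size_class - b.size_class))


def impute_staffing(nonrespondent, respondents):
    """Copy the staffing pattern of the closest responding establishment in the cell."""
    donor = min(respondents, key=lambda r: distance(nonrespondent, r))
    return dict(donor.staffing)
```

The donor's employment distribution is copied rather than averaged across respondents, which is what distinguishes a hot-deck method from a cell-mean imputation.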

Combining and Benchmarking

Multi-Year Data

In order to reduce the variability of estimates at detailed geographic levels, data from three years have been combined to increase the effective sample size. The 1998 OES estimates are therefore based on three years of combined OES survey data. Each year's sample is weighted to represent the sample as it appeared at the time the sample was selected. In order to combine the data, each unit's weight is modified so that the aggregate sample represents the universe: each unit weight is divided by the number of years in which sample units were selected for that stratum.

A ratio estimator is used to develop estimates of occupational employment. The auxiliary variable used was the 1998 reference-month population value of total employment. In order to balance the state's need for estimates at different levels of geographic and industrial aggregation, the ratio adjustment process was applied as a hierarchical series of ratio adjustment factors, or "benchmark" factors (BMFs).[3]

Estimated Employment

As mentioned above, a ratio estimator is used to develop estimates of occupational employment. The auxiliary variable is the population value of total employment obtained from the refined Unemployment Insurance files for the 1998 reference month. Within each MSA, the estimated employment for an occupation at the reported three-digit SIC level was calculated by multiplying the weighted employment by its ratio factor. The estimated employment for an occupation at the all-industry level was obtained by summing the occupational employment estimates across all industries within an MSA reporting that occupation. Employment data for Federal Government workers in each occupation were added to the survey-derived data.
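The ratio adjustment can be sketched for a single cell. This is a simplification: the survey applies a hierarchical series of benchmark factors, whereas the example below uses one factor for one MSA/SIC cell. The function names and the record layout are hypothetical.

```python
def benchmark_factor(population_employment, weighted_sample_employment):
    """Ratio of the known population total employment to the weighted sample total."""
    return population_employment / weighted_sample_employment


def estimate_occupation(records, population_employment):
    """Ratio-adjusted employment estimate for one occupation in one cell.

    records: list of (weight, total_employment, occupation_employment) tuples,
             one per sampled establishment in the cell.
    """
    weighted_total = sum(w * total for w, total, _ in records)
    weighted_occupation = sum(w * occ for w, _, occ in records)
    return weighted_occupation * benchmark_factor(population_employment, weighted_total)


# Two establishments: weighted total employment is 400, weighted occupation employment is 40.
cell = [(2.0, 100, 10), (4.0, 50, 5)]
print(estimate_occupation(cell, population_employment=500))  # factor 1.25, estimate 50.0
```

Summing such estimates across all industries in an MSA that report the occupation gives the all-industry figure described above.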

Variance of Estimates

Estimates of sampling error are calculated to allow the users to determine if occupational employment estimates are reliable enough for their needs. Only a probability-based sample can be used to calculate estimates of sampling error from the sample itself.

The formula used to estimate occupational employment variances (a common measure of sampling error) is based on the survey's sample design and method of estimation. The OES survey used a subsample replication technique called the jackknife random group to estimate variances of occupational employment. In this technique each sampled establishment is assigned to one of G random groups. Using the data in these groups, G subsamples are formed from the parent sample. Next, G estimates of total employment for an occupation P are calculated, one employment estimate per subsample. Afterwards, the variability of these G employment estimates is calculated. This variability is our variance estimate of occupation P's employment estimate.
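The random-group procedure can be sketched as follows. The text does not give the exact formula, so this example assumes the common delete-a-group form: each subsample omits one random group, the remaining weights are inflated by G/(G-1), and the variance is (G-1)/G times the sum of squared deviations of the G replicate estimates from their mean. The function names and the (weight, employment) layout are hypothetical.

```python
import random


def weighted_total(units):
    """Weighted employment total for a list of (weight, employment) pairs."""
    return sum(w * y for w, y in units)


def jackknife_variance(units, n_groups, seed=0):
    """Delete-a-group jackknife variance of a weighted total (assumed form)."""
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)
    # Assign each sampled establishment to one of G random groups.
    groups = [order[g::n_groups] for g in range(n_groups)]

    scale = n_groups / (n_groups - 1)  # reweight to compensate for the dropped group
    replicates = []
    for dropped in range(n_groups):
        kept = [
            (w * scale, y)
            for g, group in enumerate(groups)
            if g != dropped
            for (w, y) in group
        ]
        replicates.append(weighted_total(kept))

    mean = sum(replicates) / n_groups
    return (n_groups - 1) / n_groups * sum((r - mean) ** 2 for r in replicates)
```

The square root of the returned value is the standard error; dividing that by the estimate gives the relative standard error discussed under Sampling Errors.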

Reliability of the Estimates

Estimates developed from a sample may differ from the results of a census. Two types of error, sampling and nonsampling, can occur in estimates calculated from a sample. Sampling error occurs because our observations are based on a sample, not on the entire population. Nonsampling error occurs because of response and operational errors in the survey. Unlike sampling error, this form of error can also occur in a census.

Sampling Errors

The particular sample used in this survey is one of a large number of possible samples of the same size that could have been selected using the same sample design. Estimates derived from different samples would tend to differ from one another. The variance of a survey estimate is a measure of the variation among the estimates from all possible samples. The standard error of a survey estimate is the square root of its variance; the relative standard error is the ratio of the standard error to the estimate itself.

Using the sample estimate and its standard error allows the user to construct an interval estimate with a prescribed level of confidence that the interval will include the mean value of the estimate from all possible samples.

For example, suppose that an estimated occupational employment total is 5,000 with an associated relative standard error of 2.0 percent. Based on these data, the standard error of the estimate is 100 (2 percent of 5,000). A 68-percent confidence interval for the employment estimate is 5,000 ± 100, or 4,900 to 5,100. A 95-percent confidence interval for the employment estimate is 5,000 ± 200, or 4,800 to 5,200. In each case, approximately that percentage of the intervals constructed in this manner will include the mean of all possible employment estimates as computed from all possible samples. Estimates of sampling errors are available for most occupational employment estimates.
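The arithmetic in this example can be reproduced in a few lines. The sketch uses the same convention as the text: one standard error on either side for a 68-percent interval and two for a 95-percent interval. The function name is illustrative.

```python
def confidence_interval(estimate, rse_percent, n_standard_errors):
    """Interval of estimate +/- n standard errors, given a relative standard error in percent."""
    standard_error = estimate * rse_percent / 100.0
    half_width = n_standard_errors * standard_error
    return estimate - half_width, estimate + half_width


print(confidence_interval(5000, 2.0, 1))  # (4900.0, 5100.0), the 68-percent interval
print(confidence_interval(5000, 2.0, 2))  # (4800.0, 5200.0), the 95-percent interval
```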

Nonsampling Error

This type of error is attributable to several causes such as an inability to obtain information for all establishments in the sample; differences in the respondents' interpretation of the survey question; an inability or unwillingness of the respondents to provide correct information; errors made in recording, coding, or processing the data; and errors made in imputing values for missing data. Explicit measures of the effects of nonsampling error are not available.

Several edit and quality control procedures were used to reduce nonsampling error. For example, completed survey questionnaires were checked for data consistency. Followup mailings were sent out to nonresponding establishments to improve the survey response rate. Response analysis studies were conducted to assess the respondents' comprehension of the questionnaire. See the section below for additional information on the quality control procedures used by the OES survey. The relative standard error indicates the magnitude of the sampling error. It does not measure nonsampling error, including any biases in the data. Particular care should be exercised in the interpretation of small estimates or in small differences between estimates when the sampling error is relatively large or the magnitude of the bias is unknown.

There were 846 occupational categories defined for the 1996, 1997, and 1998 OES surveys. Within each three-digit industry, an average of 153 occupations are explicitly collected. Occupations that occur only rarely within an industry are collected in "All Other" residual categories. Because of the "All Other" residual categories, the "All Industry" total employment for some occupations may be underestimated. That is, BLS expects that the true population value for an occupation, across all industries, is its weighted employment based on its occupation code plus some portion of the weighted residual employment from the "All Other" categories.

The magnitude of this bias is unknown. In general, however, occupational employment within the "All Other" residual categories is not a significant proportion of the total employment of any specific occupation. Employment coded in residual occupations accounts for 10.4 percent of the total occupational employment at the national level. Note that some portion of the employment coded in residual occupations is correctly coded as residual employment. That is, there are cases where an occupation is a relatively new occurrence, and is not yet represented on any questionnaire as a specific occupation. There are also occupations that have declined in employment to the point where they are no longer represented as specific occupations on any questionnaire.

Quality Control Measures

A major concern with a cooperative program like the OES survey is to accommodate state-specific publication needs with limited resources while standardizing survey procedures across all fifty states, the District of Columbia, and the U.S. territories and, in the process, to produce quality estimates. Controlling sources of nonsampling error in this decentralized environment can be difficult. In addition, edit and validation checks are distributed across eight regional offices, which can lead to procedural differences between the regions. Two important quality control measures employed by the OES survey are the Survey Processing and Management (SPAM) System and the Estimates Delivery System (EDS). Both systems were developed to provide a consistent and automated framework for survey processing and to reduce the workload at the state, regional, and national levels.

By standardizing data processing activities such as refining mailing addresses, addressing envelopes and mailers, editing and updating questionnaires, producing management reports, and calculating employment estimates, the SPAM system and the EDS have consequently standardized survey methodology. This has reduced the number of errors on the data files as well as the time needed to review them.

Other quality control measures implemented in the OES survey include


[1] Extensive portions of the material in these Technical Notes have been excerpted or reproduced verbatim from U.S. Department of Labor, Bureau of Labor Statistics: Occupational Employment and Wages, 1998 (Bulletin 2528, June 2000, "Appendix B. Survey Methods and Reliability of the 1998 Occupational Employment Statistics Estimates," pp. 125–131). Readers are encouraged to consult that Appendix for more complete explanations. Until supplanted by later survey cycles, the 1998 Appendix B can be accessed at Thereafter, users should consult the OES Home Page at

[2] Although estimates are provided here only at the national level, the national estimates are calculated from state and MSA data, as described in the text and more fully in Bulletin 2528.

[3] See 1998 Appendix B, cited in note (1) above, for more detail on benchmarking.
