
Technical Notes

The 2018 Science and Engineering Indicators (SEI) State Indicators data tool contains trend data for most indicators. These data are available for download within the data tool and from the State Indicators download page.

1. Standard Errors

The SEI State Indicators data tool contains data compiled from a large number of sources, which can be categorized as follows:

  • Data based on censuses. These are complete population counts; therefore, there is no standard error associated with the estimate. Data or tables where standard errors are not applicable are labeled “na.”
  • Data based on samples. Standard errors for estimates, where available, are provided by the source. The National Assessment of Educational Progress (NAEP) data sets are the only data sets with complete standard errors for the time periods included in the State Indicators data tool. Standard errors are incomplete in other sample-based data sets; for example, standard errors are not available prior to 2007 for the Occupational Employment Statistics (OES) survey. The Business R&D and Innovation Survey (BRDIS) data set has associated standard errors for its values, but some historical standard errors are not available because the estimates were later updated and the standard errors were not.
  • Data based on statistical models. Standard errors cannot be provided for some estimates due to the estimating techniques of the data source (for example, gross domestic product [GDP] data and Census-based population estimates). Data or tables where standard errors are not available are labeled “NA.”
  • Data derived directly from the source data set. For data series where the standard error information for the source data is available, approximation formulas for combining sampling errors were used. Because the source data used to derive these estimates are from different independent samples, there is no covariance term included in the formulas.

Standard error tables are provided for download for all State Indicators data where the standard errors are appropriate and available. In some cases, standard error information was not available for a data series. This is noted on the website.

The following formulas were used to estimate standard errors for derived data series.

Sums and differences

Where available for aggregate estimates, such as the total for the United States, sampling errors were collected for the aggregate estimate as provided by the source.

In a few cases, aggregate estimates were calculated from individual parts of the aggregate, and therefore, sampling errors also had to be calculated based on the individual parts of the aggregate. The same formula was also used for computing the standard error for the difference of two estimates. It was assumed that the covariance between the individual parts was negligible.

This formula was used, where applicable, for such roll-ups as national values or occupation categories (e.g., computer science).

The standard error of the sum or difference of two estimates is the square root of the sum of the variances of the two estimates. The variance of an estimate is the squared standard error of the estimate.
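The rule above can be sketched in a few lines of Python. This is only an illustration under the stated assumption of negligible covariance. The function names are hypothetical, and extending the two-estimate rule to any number of parts (as in a roll-up) is our reading of "the same formula."

```python
import math

def se_sum_or_difference(se_x: float, se_y: float) -> float:
    """Standard error of X + Y or X - Y, assuming zero covariance."""
    # Variance is the squared standard error; variances add for both sums and differences.
    return math.sqrt(se_x ** 2 + se_y ** 2)

def se_of_total(standard_errors) -> float:
    """Standard error of a roll-up built from several independent parts (assumed extension)."""
    return math.sqrt(sum(se ** 2 for se in standard_errors))

print(se_sum_or_difference(3.0, 4.0))  # 5.0
```

Note that the standard error of a difference is never smaller than either input standard error, so differences between estimates are less precise than the estimates themselves.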

Quotient

This formula was used to calculate the standard errors of ratios. It assumes that the numerator (X) and the denominator (Y) are uncorrelated and uses a first-order Taylor series expansion, which is an approximate but widely used and accepted approach.

The standard error of the ratio of two estimates is the product of the ratio (of the two estimates) and the square root of the sum of the relative variance of the numerator and the relative variance of the denominator. The relative variance of an estimate is the variance of the estimate divided by the squared estimate.
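A minimal Python sketch of the ratio rule follows, assuming uncorrelated, nonzero estimates. The function name and the input values are hypothetical and chosen only for illustration.

```python
import math

def se_ratio(x: float, se_x: float, y: float, se_y: float) -> float:
    """Approximate standard error of X / Y for uncorrelated X and Y."""
    ratio = x / y
    # Relative variance = variance / estimate**2 = (SE / estimate)**2
    rel_var_x = (se_x / x) ** 2
    rel_var_y = (se_y / y) ** 2
    # abs() is an added guard so the result is non-negative if an estimate is negative.
    return abs(ratio) * math.sqrt(rel_var_x + rel_var_y)

# Illustrative values only: numerator 50 (SE 5), denominator 200 (SE 10).
print(se_ratio(50.0, 5.0, 200.0, 10.0))
```

Because the formula is a first-order approximation, it is most reliable when the relative standard error of the denominator is small.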

Relative standard error

Errors for all data from the Occupational Employment Statistics (OES) survey and some data from the Business Research and Development and Innovation Survey (BRDIS) were available only as the relative standard error (RSE) or percent relative standard error (PRSE).

The percent relative standard error of an estimate is the standard error of the estimate divided by the estimate expressed as a percentage.

Therefore, to transform the PRSE to standard error, the following equation was used:

The standard error is the product of the estimate and the percent relative standard error, with the PRSE converted from a percentage to a proportion (that is, divided by 100).
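As a small illustration of this conversion, the Python sketch below treats a PRSE of 2.5 as 2.5 percent. The function names and numbers are hypothetical, and the RSE variant assumes the RSE is reported as a proportion rather than a percentage.

```python
def prse_to_se(estimate: float, prse_percent: float) -> float:
    """Convert a percent relative standard error (e.g., 2.5 for 2.5%) to a standard error."""
    return abs(estimate) * prse_percent / 100.0

def rse_to_se(estimate: float, rse: float) -> float:
    """Convert a relative standard error given as a proportion (e.g., 0.025) to a standard error."""
    return abs(estimate) * rse

# Illustrative values only: an estimate of 12,000 with a PRSE of 2.5%.
print(prse_to_se(12000.0, 2.5))  # 300.0
```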

2. Constant Dollar Data

The SEI State Indicators data tool presents data as current dollars. To facilitate comparisons over time, the data tool also has an option for presentation of the information as constant dollars in the table and chart views. The data tool uses constant 2009 dollars based on the gross domestic product (GDP). The specific values are from August 2017 adjusted to a calendar-year basis, as prepared by the Bureau of Economic Analysis. The constant dollar adjustment is available in the State Indicators data tool for all financial indicators, except for Indicator S-9 (Public School Teacher Salaries).

Table S-A provides the GDP price deflators used in the State Indicators data tool. These price indices are for the national GDP and are not adjusted for states. The State Indicators tables that are available for download present information as current dollars only. The data in Table S-A can be used to replicate the constant dollar information in the State Indicators data tool. It may also be applied to the standard error tables, as applicable.
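One way to replicate the adjustment is sketched below in Python. It assumes the usual convention that constant dollars equal current dollars divided by the deflator for the same year; the notes do not spell out the arithmetic, so treat this as an assumption. Only three deflators from Table S-A are copied in, and the function name and dollar amounts are hypothetical.

```python
# A few calendar-year GDP price deflators copied from Table S-A (2009 = 1.00000).
GDP_DEFLATOR = {
    2000: 0.81887,
    2009: 1.00000,
    2017: 1.13421,
}

def to_constant_2009_dollars(current_dollars: float, year: int) -> float:
    """Convert a current-dollar amount to constant 2009 dollars (assumed: divide by the deflator)."""
    return current_dollars / GDP_DEFLATOR[year]

# Illustrative amounts only. A standard error in current dollars can be rescaled
# the same way, because the deflator is treated as a constant.
print(to_constant_2009_dollars(100.0, 2000))
print(to_constant_2009_dollars(100.0, 2017))
```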

3. Statistical Testing

As noted in the overview, indicators based on estimates have associated standard errors, and therefore, small differences in numbers may not be statistically significant.
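These notes do not prescribe a particular test. As one hedged illustration, a reader could compare two independent estimates with a simple z-score, using the standard error of a difference described in section 1. The function names, the 1.96 critical value (two-sided, 95% level, normal approximation), and the input values below are all assumptions made for this sketch.

```python
import math

def difference_z_score(est_a: float, se_a: float, est_b: float, se_b: float) -> float:
    """z-score for the difference of two independent estimates."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # covariance assumed negligible
    return (est_a - est_b) / se_diff

def differs_significantly(est_a: float, se_a: float, est_b: float, se_b: float,
                          critical_value: float = 1.96) -> bool:
    """True if the difference exceeds the assumed critical value in absolute terms."""
    return abs(difference_z_score(est_a, se_a, est_b, se_b)) > critical_value

# Illustrative values only.
print(difference_z_score(60.0, 3.0, 50.0, 4.0))     # 2.0
print(differs_significantly(52.0, 1.5, 50.0, 1.5))  # False
```

When many states are compared at once, a stricter critical value may be appropriate, since some differences will exceed 1.96 by chance alone.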

4. High Science, Engineering, and Technology Employment Industries

To define high science, engineering, and technology (SET) employment industries, this tool uses a modification of the approach employed by the Bureau of Labor Statistics (BLS; Hecker 2005). BLS’s approach is based on the intensity of high SET employment within an industry. High SET employment occupations include scientific, engineering, and technician occupations. These occupations employ workers who possess an in-depth knowledge of the theories and principles of science, engineering, and mathematics, which is generally acquired through postsecondary education in some field of technology. An industry is considered a high SET employment industry if employment in technology-oriented occupations accounts for a proportion of that industry’s total employment that is at least twice the average for all industries (i.e., 9.8% or higher in 2002, the data that Hecker used). Ideally, this method would be used to develop a list of high SET industries for each year in the State Indicators data tool. However, due to the time required to obtain the data for the custom list, the data tool uses the list Hecker published for all years.
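The classification rule can be illustrated with a short Python sketch. The function name and employment counts are hypothetical, and the all-industry share of 4.9% is inferred here as half of the 9.8% threshold quoted above, not taken from Hecker's tables.

```python
def is_high_set_industry(set_employment: float, total_employment: float,
                         all_industry_share: float) -> bool:
    """True if the industry's SET employment share is at least twice the all-industry average."""
    industry_share = set_employment / total_employment
    return industry_share >= 2 * all_industry_share

ALL_INDUSTRY_SHARE_2002 = 0.049  # assumed: half of the 9.8% cutoff

# Illustrative counts only.
print(is_high_set_industry(150, 1000, ALL_INDUSTRY_SHARE_2002))  # True  (15% share)
print(is_high_set_industry(50, 1000, ALL_INDUSTRY_SHARE_2002))   # False (5% share)
```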

Because the category “high SET employment industries” refers only to private-sector businesses, we excluded “Federal Government, excluding Postal Service” from high-technology industries. Each industry is defined by a four-digit code that is based on the North American Industry Classification System (NAICS). The NAICS classifications are periodically revised, thereby affecting the trend data presented in the tables. For data years up through 2008, the 2002 NAICS codes were used to define business establishments. Data for 2009 to 2012 use 2007 NAICS codes, and subsequent years use 2012 NAICS codes. Table S-B displays the lists of high-technology industries used for each year in this tool.

5. National Assessment of Educational Progress Data by Demographic Characteristics

Additional data on the NAEP science and math scores are provided in the Chapter 1 appendix tables for download. These tables show the NAEP data by sex and race/ethnicity.

6. States Included on the Histogram Display

To aid in visualizations, outliers are not displayed on histograms. Here we define an “outlier” as a data point greater than the median plus 3 times the interquartile range of the most recent year of the data series.
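The cutoff described above can be sketched in Python as follows. The quartile convention is an assumption (the notes do not say which one the tool uses; this sketch relies on the default of `statistics.quantiles`), and the function names and values are hypothetical.

```python
import statistics

def histogram_cutoff(latest_year_values) -> float:
    """Median plus 3 times the interquartile range of the most recent year's values."""
    q1, _, q3 = statistics.quantiles(latest_year_values, n=4)  # default quartile method (assumed)
    return statistics.median(latest_year_values) + 3 * (q3 - q1)

def values_to_display(series_values, latest_year_values):
    """Drop points above the cutoff; only high-side outliers are excluded under this definition."""
    cutoff = histogram_cutoff(latest_year_values)
    return [v for v in series_values if v <= cutoff]

# Illustrative values only: one extreme point among otherwise similar values.
latest = [1, 2, 3, 4, 5, 6, 7, 100]
print(values_to_display(latest, latest))
```

Because the cutoff comes from the most recent year only, the same threshold is applied to every year of a series, which keeps the histogram bins comparable across years.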

Reference

Hecker D. 2005. High-technology employment: A NAICS-based update. Monthly Labor Review 128(7):57–72.

Table S-A

Calendar-year price deflators for the State Indicators data tool, 1990–2017

Year  GDP price deflator (chained) 2009 dollars
1990  0.66773
1991  0.68996
1992  0.70569
1993  0.72248
1994  0.73785
1995  0.75324
1996  0.76699
1997  0.78012
1998  0.78859
1999  0.80065
2000  0.81887
2001  0.83754
2002  0.85039
2003  0.86735
2004  0.89120
2005  0.91988
2006  0.94814
2007  0.97337
2008  0.99246
2009  1.00000
2010  1.01221
2011  1.03311
2012  1.05214
2013  1.06913
2014  1.08832
2015  1.10012
2016  1.11416
2017  1.13421

GDP = gross domestic product.

NOTE: The base year (= 1.0000) for the constant dollar calculations continues to be 2009, consistent with the current Bureau of Economic Analysis and Office of Management and Budget convention.

SOURCES: Bureau of Economic Analysis, National Economic Accounts, Gross Domestic Product, accessed 21 May 2018.

Science and Engineering Indicators 2018

Table S-B

NAICS codes that constitute high-SET employment industries

2002 NAICS code  2007 NAICS code  2012 NAICS code  Industry
1131             1131             1131             Timber tract operations
1132             1132             1132             Forest nurseries and gathering of forest products
2111             2111             2111             Oil and gas extraction
2211             2211             2211             Electric power generation, transmission, and distribution
3241             3241             3241             Petroleum and coal products manufacturing
3251             3251             3251             Basic chemical manufacturing
3252             3252             3252             Resin, synthetic rubber, and artificial synthetic fibers and filaments manufacturing
3253             3253             3253             Pesticide, fertilizer, and other agricultural chemical manufacturing
3254             3254             3254             Pharmaceutical and medicine manufacturing
3255             3255             3255             Paint, coating, and adhesive manufacturing
3259             3259             3259             Other chemical product and preparation manufacturing
3332             3332             3332             Industrial machinery manufacturing
3333             3333             3333             Commercial and service industry machinery manufacturing
3336             3336             3336             Engine, turbine, and power transmission equipment manufacturing
3339             3339             3339             Other general purpose machinery manufacturing
3341             3341             3341             Computer and peripheral equipment manufacturing
3342             3342             3342             Communications equipment manufacturing
3343             3343             3343             Audio and video equipment manufacturing
3344             3344             3344             Semiconductor and other electronic component manufacturing
3345             3345             3345             Navigational, measuring, electromedical, and control instruments manufacturing
3346             3346             3346             Manufacturing and reproducing magnetic and optical media
3353             3353             3353             Electrical equipment manufacturing
3364             3364             3364             Aerospace product and parts manufacturing
3369             3369             3369             Other transportation equipment manufacturing
4234             4234             4234             Professional and commercial equipment and supplies, merchant wholesalers
4861             4861             4861             Pipeline transportation of crude oil
4862             4862             4862             Pipeline transportation of natural gas
4869             4869             4869             Other pipeline transportation
5112             5112             5112             Software publishers
5161             na               na               Internet publishing and broadcasting
na               519130           519130           Internet publishing and broadcasting and Web search portals
5171             5171             5171             Wired telecommunications carriers
5172             5172             5172             Wireless telecommunications carriers (except satellite)
5173             na               na               Telecommunications resellers
5174             5174             5174             Satellite telecommunications
5179             5179             5179             Other telecommunications
5181             na               na               Internet service providers and Web search portals
5182             5182             5182             Data processing, hosting, and related services
5211             5211             5211             Monetary authorities, central bank
5232             5232             5232             Securities and commodity exchanges
5413             5413             5413             Architectural, engineering, and related services
5415             5415             5415             Computer systems design and related services
5416             5416             5416             Management, scientific, and technical consulting services
5417             5417             5417             Scientific research and development services
5511             5511             5511             Management of companies and enterprises
5612             5612             5612             Facilities support services
na               561312           561312           Executive search services
8112             8112             8112             Electronic and precision equipment repair and maintenance

na = not applicable.

NAICS = North American Industry Classification System; SET = science, engineering, and technology.

NOTES: Data on high-tech industries for 2008 and earlier years were compiled using the 2002 NAICS codes. Data for 2009 to 2012 were compiled using the 2007 NAICS codes, and subsequent years use 2012 NAICS codes.
