Research and Development in Industry: 2004
This report is the second of two publications containing results from the 2005 Survey of Industrial Research and Development. The first publication, an InfoBrief (NSF 2007c) announcing the availability of survey results, contains analytical information and highlights the increase in expenditures for industrial R&D funded from companies' own resources. This report contains the full set of statistics produced from the survey, including statistics on R&D funding during the calendar year 2005 and on R&D personnel in January 2006. Among the tables are several that include statistics on trends in industrial R&D since 1953, statistics on employment by R&D-performing firms since 1994, and a table classified by state that contains statistics for selected years since 1991. This report also contains (in the technical notes in appendix A) information about the industry coding classification system, company size classifications, survey methodology, comparability of the statistics over time and with other statistical series, survey definitions, history of the survey, and other information designed to convey to the data user what the survey statistics represent and, in some cases more importantly, what they do not represent. Survey questionnaires, instructions, and other documents are reproduced in appendix B.
This report provides national estimates of the expenditures on R&D performed within the United States by industrial firms, whether U.S. or foreign owned. Among the statistics are estimates of total R&D, the portion of the total financed by the federal government, and the portion financed by the companies themselves or by other nonfederal sources, such as state and local governments or other industrial firms under contract or subcontract. Total R&D is also separated by type of cost: wages and fringe benefits of R&D staff, materials and supplies, depreciation, and other costs. Other statistics include R&D financed by domestic firms but performed outside the 50 U.S. states and District of Columbia, R&D performed by organizations outside the firm, R&D performed in collaboration with other organizations, and the funds spent to perform energy-related R&D. Also, this report provides information on R&D-performing firms, including domestic net sales, number of employees, number of R&D-performing scientists and engineers, geographic location where the R&D was performed, and R&D funds spent per R&D-performing scientist and engineer.
The National Science Foundation Act of 1950, as amended, authorizes and directs the National Science Foundation (NSF) "to provide a central clearinghouse for the collection, interpretation, and analysis of data on scientific and engineering resources and to provide a source of information for policy formulation by other agencies of the federal government." The Survey of Industrial Research and Development is the vehicle with which NSF carries out the industrial portion of this mandate, and NSF's Division of Science Resources Statistics has sponsored and managed a survey of industrial R&D since 1953. The 1953–56 surveys were conducted by the Bureau of Labor Statistics (BLS) in the U.S. Department of Labor (NSF 1956, 1960). Since 1957, the Bureau of the Census in the U.S. Department of Commerce has conducted the survey. Data obtained in the earlier BLS surveys are not directly comparable with Census figures because of methodological and other differences. Census conducts the survey under Title 13 of the United States Code, which prohibits publication or release of data or statistics that may reveal information about individual companies. In some tables in this report, the symbol D is used to indicate that estimates are withheld to avoid possible disclosure of information about operations of individual companies.
The Survey of Industrial Research and Development is an annual sample survey intended to include or represent all for-profit R&D-performing companies, either publicly or privately held. Respondents receive detailed definitions to help them determine which expenses to include or exclude from the R&D data that they provide. Nevertheless, the statistics presented in this report are subject to response and concept errors caused by differences in the way respondents interpret the definitions of R&D activities and by variations in company accounting procedures. The survey's primary focus is on U.S. industry as a performer of, rather than as a source of funds for, R&D. Thus, data on federal support of R&D activities performed by industry are collected, and the resulting statistics appear in several tables, whereas only limited statistics are collected on industrial funding of R&D undertaken at universities and colleges and other nonprofit organizations.
The result of collecting and publishing performer-reported statistics is that the federally funded R&D performance totals presented in this report differ from the totals reported by the federal agencies that provide the funds and the statistics published in NSF's Federal Funds for Research and Development report series (http://www.nsf.gov/statistics/fedfunds/). One reason for these differences is that performers of R&D often expend federal funds in a year other than the one in which the federal government provides authorization, obligations, or outlays (see "Comparisons to Other Statistical Series" in appendix A for definitions of these terms). During the past decade, the differences have widened between the federal R&D funding reported by performers and that reported by funding agencies. These differences are documented and analyzed in the latest editions of the National Science Board's Science & Engineering Indicators (NSB 2010) (http://www.nsf.gov/statistics/seind10/) and NSF's National Patterns of R&D Resources (NSF 2008d) (http://www.nsf.gov/statistics/natlpatterns/) report series.
The content of the Survey of Industrial Research and Development has been expanded and refined over the years in response to an increasing need by policymakers for more detailed information on the nation's R&D effort. For example, questions on energy R&D were added in the early 1970s, following that decade's oil shortage crisis. And, more recently, questions that probe companies' collaborative R&D activities and funding of international performance of R&D have been added to keep up with the fast-changing environment of the conduct and organization of industrial R&D. On the other hand, collection of certain data items has been eliminated in an attempt to alleviate some of the burden on respondents. For large firms known to perform R&D, a detailed survey questionnaire (Form RD-1) is used to collect data. To limit the reporting burden on small R&D performers and on firms included in the sample for the first time, an abbreviated survey questionnaire (Form RD-1A), which collects only the most crucial data, is used.
Changes have been made to the survey throughout its history and some of the most recent are detailed in appendix A (see "Comparability of Statistics"). Specific changes are detailed in each of the annual reports resulting from the survey (http://www.nsf.gov/statistics/industry/).
Industry statistics in this report were developed from data collected from individual companies. Because the survey is company based rather than establishment based, all data collected for the various components of each company (plants, divisions, subdivisions, etc.) were tabulated in the company's major industrial classification, which was based on payroll. (See "Frame Creation and Industry Classification" in appendix A for more information about industry classification.) The resulting industry estimates were calculated by summing the data for companies classified within each major industry classification. National totals were then estimated by summing the industry estimates. The North American Industry Classification System (NAICS) was used to determine a company's major industrial classification, and the resulting statistics are published by NAICS code. For years prior to 1999, the Standard Industrial Classification (SIC) system was used. The development and ongoing refinement of NAICS has been a joint effort of statistical agencies in Canada, Mexico, and the United States. The system replaced the Standard Industrial Classification (1980) of Canada, the Mexican Classification of Activities and Products (1994), and SIC (1987) of the United States. (For a detailed comparison of NAICS to SIC of the United States, visit http://www.census.gov/epcd/www/naics.html.) NAICS was designed to provide a production-oriented system under which economic units with similar production processes are classified in the same industry. NAICS was developed with special attention to classifications for new and emerging industries, service industries, and industries that produce advanced technologies. NAICS not only facilitates comparability of information about the economies of the three North American countries but potentially increases comparability with the two-digit level of the United Nations International Standard Industrial Classification (ISIC) system.
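The classification and summation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the company records, payroll figures, NAICS codes, and sample weights are invented, and the function names are hypothetical rather than part of the survey's actual processing system.

```python
from collections import defaultdict

# Hypothetical company records (all values invented for illustration).
# payroll_by_naics: payroll of the company's components, keyed by NAICS code.
# rd: R&D reported by the whole company; weight: assumed sample weight.
companies = [
    {"name": "A", "payroll_by_naics": {"3254": 700, "4242": 300}, "rd": 5000.0, "weight": 1.0},
    {"name": "B", "payroll_by_naics": {"3341": 400, "4234": 600}, "rd": 2000.0, "weight": 1.0},
    {"name": "C", "payroll_by_naics": {"3254": 900}, "rd": 1500.0, "weight": 2.0},
]


def major_classification(payroll_by_naics):
    """Return the NAICS code that accounts for the largest share of payroll."""
    return max(payroll_by_naics, key=payroll_by_naics.get)


def industry_estimates(records):
    """Sum weighted company R&D within each company's major classification."""
    totals = defaultdict(float)
    for rec in records:
        code = major_classification(rec["payroll_by_naics"])
        # All of the company's R&D goes to its single major classification.
        totals[code] += rec["weight"] * rec["rd"]
    return dict(totals)


estimates = industry_estimates(companies)
national_total = sum(estimates.values())  # national total = sum of industry estimates
```

In this sketch, company B has most of its payroll in a wholesale trade code, so all of its R&D is tabulated there even though part of its payroll is in manufacturing. That is the company-based (rather than establishment-based) behavior the paragraph describes.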
For the 2004 and 2005 surveys, some companies' electronically assigned industry codes were manually examined and changed. The result was that most of the R&D previously attributed to NAICS 42 and 55 industries was redistributed. For detailed information, see NSF 2007d and NSF 2009. Due to the reclassification, tables that traditionally provided data by industry for one or more historical years now only show data for the study year.
Availability of Survey Results
Detailed historical statistics for 1953–98 can be obtained from NSF's Industrial Research and Development Information System (IRIS) at http://www.nsf.gov/statistics/iris/, an online interface to the Survey of Industrial Research and Development Historical Database (SIRDHD) (NSF 2001b). The SIRDHD is a collection of more than 2,500 statistical tables containing all of the statistics produced and published from the 1953–98 cycles of the annual Survey of Industrial Research and Development. Statistics for 1991–2005 are available in separate reports at http://www.nsf.gov/statistics/industry/.
Companies were categorized by total number of domestic employees. The survey excludes companies with fewer than five employees to limit the burden on small business enterprises in compliance with the Office of Management and Budget's (OMB) guidelines for federal government data collection activities. The following are the size classes used in this report:
Current and Constant Dollars
Statistics in all tables are reported in current dollars. Constant dollars also are presented in tables 2, 25, 26, and 27. Gross domestic product (GDP) implicit price deflators were used to convert current to constant 2000 dollars.
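As a rough illustration of the conversion, a current-dollar amount can be divided by the ratio of that year's deflator to the base-year deflator. The sketch below assumes a deflator indexed so that the base year (2000) equals 100. The function name and the sample figures are hypothetical and are not taken from the report's tables.

```python
def to_constant_dollars(current_dollars, deflator, base_deflator=100.0):
    """Convert a current-dollar amount to constant base-year dollars.

    deflator is the GDP implicit price deflator for the year of the amount;
    base_deflator is the deflator for the base year (assumed to be 100 for 2000).
    """
    return current_dollars * base_deflator / deflator


# Invented example: 226.0 (current dollars) in a year whose deflator is 113.0
constant_2000 = to_constant_dollars(226.0, 113.0)
```

With these invented inputs the constant-dollar figure is lower than the current-dollar figure, as expected for a year after the base year in which prices have risen.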
Disclosure and Suppression of Statistics
Title 13 of the United States Code and a pledge of confidentiality to respondents prohibit publication or release of data or statistics that may reveal information about individual companies. Therefore, the data in some table cells have been suppressed and replaced with "D." This occurs when a small number of companies account for a large percentage of the estimate in a particular data cell. Although publication of certain cells may be withheld, the estimates in the cells are always included in totals. The tables most often affected by cell suppression are those that contain data on federal support for industrial R&D performance.
The statistics in this report cover only those operations located in the 50 U.S. states and the District of Columbia (DC). Statistics on company-sponsored R&D performed outside the 50 U.S. states and DC are included in tables 14 and 15 but excluded from all other tables.
Beginning with 2001, the methodology to produce statistics by state was modified from previous years to address the recurring problem of large year-to-year variation in many state estimates. This variability was caused by many factors, including the potential inefficiency of the sample at state levels, the rarity of R&D expenditures, and the large weights often associated with companies that report R&D in the survey for the first time. Under the new methodology, a portion of the amount of R&D reported by some companies not selected for the sample with certainty is allocated (or raked) among all the states in which there was industrial activity. The new methodology was also applied retroactively to statistics for 1998–2000. In tables 29–31 statistics for 1998–2005 are flagged with an "e" if more than 50% of the estimate was imputed because of raking. Note that there was no change to the methodology for estimating the number of R&D performers in each state. This estimate continued to be calculated by summing the weights of the companies that actually reported R&D activity in a given state. For a more detailed explanation of the new methodology and the definition of a "certainty" company, see the technical notes.
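The flagging rule for tables 29–31 can be sketched as follows. The report does not spell out here how the raked amount is divided among states, so the proportional allocation below is purely an assumption made for illustration, as are the function names, state figures, and activity measure. Only the "more than 50%" threshold for the "e" flag comes from the text above.

```python
def rake(amount, activity_by_state):
    """Split amount across states in proportion to an activity measure.

    ASSUMPTION: proportional allocation is illustrative only; the survey's
    actual raking procedure is described in the technical notes.
    """
    total_activity = sum(activity_by_state.values())
    return {
        state: amount * activity / total_activity
        for state, activity in activity_by_state.items()
    }


def state_estimates(reported_by_state, raked_by_state):
    """Combine reported and raked amounts; flag 'e' if raking exceeds 50%."""
    results = {}
    for state in sorted(set(reported_by_state) | set(raked_by_state)):
        reported = reported_by_state.get(state, 0.0)
        raked = raked_by_state.get(state, 0.0)
        estimate = reported + raked
        flag = "e" if estimate > 0 and raked / estimate > 0.5 else ""
        results[state] = (estimate, flag)
    return results


# Invented figures: 100.0 of R&D to be raked across two states.
raked = rake(100.0, {"CA": 3, "TX": 1})
table = state_estimates({"CA": 200.0, "TX": 10.0}, raked)
```

Here the hypothetical TX estimate is flagged because most of it comes from the raked amount, while the CA estimate is not.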
The Survey of Industrial Research and Development has been conducted annually since 1953. Statistics for 1953–98 are reported by Standard Industrial Classification (SIC) code, and statistics for 1999–2005 are reported by North American Industry Classification System (NAICS) codes (see below). All of the statistics produced from the survey for 1953–98 are available in the Industrial Research and Development Information System at http://www.nsf.gov/statistics/iris/. An electronic database for post-1998 statistics has not been developed yet; however, annual reports for 1991–2005 are posted at http://www.nsf.gov/statistics/industry/ (NSF 1994b, 1995b, 1996c, 1997b, 1998b, 1999b, 2000b, 2002c, 2003a, 2005b, 2006c, 2007a, 2009). Short reports that announce the availability of survey results, contain analytical information, and highlight the expenditures for industrial R&D funded from companies' own resources and by the federal government also are available at http://www.nsf.gov/statistics/industry/ (NSF 1995a, 1996a, 1997a, 1998a, 1999a, 2000a, 2001a, 2002a, 2003b, 2004, 2005a, 2006a, 2007c).
Prior to the 1999 report, most historical tables classified by industry contained the current survey's statistics plus statistics for 10 previous years. Because of the conversion to NAICS and a change in the way industry codes are assigned during statistical processing (see below), tables that traditionally provided data by industry for one or more historical years now only show data for the study year.
During initial statistical processing, one North American Industry Classification System (NAICS) code was electronically assigned to each company. Multi-establishment companies were assigned single codes based on the most dominant aggregated activity for that firm in terms of total payroll. The 2002 version of NAICS was used for the 2005 survey and statistics for the following industries and industry groupings are published in this report:
Beginning with the 2004 survey and continuing for 2005, some companies' electronically assigned industry codes were manually examined and changed. Beginning in the late 1990s, increasingly large amounts of R&D were attributed to the wholesale trade industries, resulting from the payroll-based methodology used to assign industry classifications and the change from the Standard Industrial Classification (SIC) system to the North American Industry Classification System (NAICS) in 1999. Such classification artifacts were of particular concern for companies traditionally thought of as pharmaceutical or computer-manufacturing firms. As these firms increasingly marketed their own products and more of their payroll involved employees in selling and distribution activities, the potential for the companies to be classified among the wholesale trade industries increased. To maintain the relevance and usefulness of the industrial R&D statistics, NSF evaluated ways to ameliorate the negative effects of the industry classification methodology and change in classification systems. In addition to firms originally assigned NAICS codes among the wholesale trade (NAICS 42) industries, firms in the information services (NAICS 51); professional, scientific, and technical services (NAICS 54); and management of companies and enterprises (NAICS 55) industries using the payroll-based methodology were manually reviewed by NSF and Census. These firms were reclassified based on primary R&D activity, which in most cases corresponded to their primary products or service activities. The result was that most of the R&D previously attributed to NAICS 42 and 55 industries was redistributed. Statistics resulting from the old and new industry classification methods were published in tables A-9 and A-10 in Research and Development in Industry: 2004 (NSF 2009). For detailed information, also see NSF 2007c.
Large Year-to-Year Changes
Large year-to-year changes may occur because of the way industry classifications are assigned during statistical processing. For companies not manually reviewed as described in "Industry Classification," above, a company's industry classification is a function of its primary activity based on payroll, which is not necessarily the primary source of its R&D activity. For these companies, if the largest portion of a company's payroll shifts to an activity other than an R&D-related activity, the classification of its R&D similarly shifts to the new activity. Further, the design of the statistical sample sometimes contributes to large year-to-year changes in industry estimates. Because relatively few companies perform R&D and there is no national register of industrial R&D performers, a large statistical "net" must be cast to capture new R&D performers. When these companies are sampled for the first time, they are often given weights much higher than they would be given if their size and the amount of R&D they perform were known at the time of sampling. After the size of the company and the amount of R&D performed are discovered via the first survey, the weight assigned for subsequent surveys is adjusted. This capture and weighting adjustment process can produce large year-to-year changes in the statistical series twice: first when the company is captured and its data are overstated by the application of a large weight, and again when the weight is reduced. This process affects lower-level statistics (i.e., detailed industry and company size categories) the most because at the aggregate levels (i.e., all industries, manufacturing, nonmanufacturing), large year-to-year increases in some industries or in some company size categories are offset by large decreases in others.
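A small worked example may make the weighting effect concrete. The figures below are invented: a company reports the same amount of R&D in two consecutive years but carries a large weight when first captured and a reduced weight afterward. The function name is hypothetical.

```python
def weighted_contribution(reported_rd, weight):
    """Contribution of one company to an estimate: reported R&D times weight."""
    return reported_rd * weight


# Invented figures (millions of dollars): same reported R&D in both years.
first_year = weighted_contribution(2.0, 50.0)   # large weight when first captured
second_year = weighted_contribution(2.0, 5.0)   # weight reduced after first survey

# The company's actual R&D is unchanged, yet its contribution to the
# detailed industry estimate falls sharply between the two years.
change = second_year - first_year
```

In this sketch the company's contribution drops from 100.0 to 10.0 even though its reported R&D did not change, which is the kind of movement the paragraph above attributes to the capture and weighting adjustment process.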
Nonresponse and Imputation
For various reasons, some firms chose not to return the survey questionnaire (unit nonresponse) or returned it with one or more blank items (item nonresponse). (See "Imputation for Item Nonresponse" in appendix A for more information on the reasons for unit and item nonresponse.) Missing data for major data items were estimated using mathematical algorithms developed from industry comparisons, data from previous cycles of the survey, and other information. Therefore, the statistics in some table cells may be accompanied by the notation "S," which indicates that the imputation rate (the percentage of the statistic not reported by respondents and consequently estimated) exceeds 50% for that item. In such cases, the estimate may be statistically unreliable. (See table A-5 for imputation rates for specific items.)
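The "S" notation rule can be illustrated with a short sketch. It assumes the imputation rate is the imputed portion of a statistic divided by the full statistic (reported plus imputed), expressed as a percentage. The function names and sample amounts are hypothetical.

```python
def imputation_rate(reported, imputed):
    """Percentage of a statistic that was estimated rather than reported."""
    total = reported + imputed
    if total == 0:
        return 0.0
    return 100.0 * imputed / total


def notation(reported, imputed):
    """Return 'S' when the imputation rate exceeds 50%, otherwise ''."""
    return "S" if imputation_rate(reported, imputed) > 50.0 else ""


# Invented amounts: 40 reported by respondents, 60 imputed.
rate = imputation_rate(40.0, 60.0)
mark = notation(40.0, 60.0)
```

Note that a rate of exactly 50% does not trigger the notation in this sketch, because the text says the rate must exceed 50%.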
Percentages were calculated on the basis of thousands of dollars and may differ slightly from those calculated using the rounded figures shown.
The particular sample selected was one of a large number of samples of the same type and size that by chance might have been selected. Statistics resulting from the different samples would differ somewhat from each other. These differences are represented by estimates of sampling error or variance. The smaller the sampling error, the less variable the statistic. The accuracy of the estimate, that is, how close it is to the true value, is also a function of nonsampling error. One cannot use the statistics as exact point estimates because they are based on statistical samples subject to variability within and between years and because of the capture and weighting issues discussed above under "Large Year-to-Year Changes."
Because there is no master file of R&D performers, the levels at which the statistics are published are determined using information from various sources. First, the frame used to construct the statistical sample (see "Frame Creation and Industry Classification" in appendix A) is queried to find out which industries perform R&D. Indicators of R&D performance can be obtained from several Census surveys (e.g., the Company Organization Survey (COS), the Annual Capital Expenditures Survey (ACES), and the Economic Census). Once flagged, those companies are included in the Survey of Industrial Research and Development. Tabulations of levels and types of R&D at the finest industry level are prepared and subjected to disclosure routines, and the industries that remain without inordinately high numbers of suppressions become the industries for which statistics are published.
The basic reporting unit was the company, firm, or enterprise that included all establishments under common ownership or control. All R&D expenditures and all information about scientists and engineers of each company were classified into a single NAICS code and size category (see "Industry Classification" and "Company Size," above).
Because of rounding, detail items may not add to totals. Most money amounts are expressed in millions of dollars and are rounded down if less than $500,000 or up if $500,000 or more. Frequency estimates (e.g., number of companies) are accumulated from decimal weights assigned to company records and are rounded down if less than 0.5 and rounded up if 0.5 or greater (see "Weighting, Maximum Weights, and Probabilities of Selection" in appendix A for information on how company records are weighted). Most employment counts (e.g., number of scientists and engineers) are expressed in thousands and are rounded down if less than 500 or up if 500 or greater.
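The rounding convention can be sketched as follows. Python's built-in round() rounds halves to the nearest even number, so this illustration uses the decimal module with ROUND_HALF_UP to match the "round up at one-half" rule stated above. The function name and sample amounts are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_unit(value, unit=1):
    """Round a non-negative value to whole multiples of unit, halves up."""
    scaled = Decimal(str(value)) / Decimal(str(unit))
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


MILLION = 1_000_000

# Money amounts: below $500,000 rounds down, $500,000 or more rounds up.
low = round_to_unit(1_499_999, MILLION)
high = round_to_unit(1_500_000, MILLION)

# Frequency estimates accumulated from decimal weights.
companies = round_to_unit(12.5)

# Detail items may not add to totals: each detail rounds down, the sum rounds up.
details = [1_400_000, 1_400_000]
rounded_details = [round_to_unit(d, MILLION) for d in details]
rounded_total = round_to_unit(sum(details), MILLION)
```

In the last case the two rounded details sum to 2 while the rounded total is 3, which is why published detail may not add to the published total.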
When a numerical value is accumulated from the statistical file to estimate a money amount, number of companies, number of employees, or number of R&D scientists and engineers, and the accumulated sum is zero, the cell is filled with "0" or "0.0"; when a value is rounded to zero, the cell is filled with "*." When a percentage is calculated from the statistical file and the percentage equals zero, the cell is filled with "0.0"; when it rounds to zero, it is filled with "*."
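The cell-filling convention distinguishes a true zero from a nonzero value that rounds to zero. A minimal sketch, assuming half-up rounding and a hypothetical format_cell helper whose unit and decimals parameters are invented for illustration:

```python
from decimal import Decimal, ROUND_HALF_UP


def format_cell(accumulated, unit=1, decimals=0):
    """Return the text for a table cell.

    accumulated: unrounded value from the statistical file
    unit: publication unit (e.g. 1_000_000 for millions of dollars)
    decimals: number of decimal places shown in the table
    """
    if accumulated == 0:
        # True zero: shown as "0" or "0.0" depending on the table's precision.
        return f"{0:.{decimals}f}"
    quantum = Decimal(10) ** -decimals
    scaled = Decimal(str(accumulated)) / Decimal(str(unit))
    rounded = scaled.quantize(quantum, rounding=ROUND_HALF_UP)
    if rounded == 0:
        # Nonzero value that rounds to zero: shown as an asterisk.
        return "*"
    return str(rounded)


true_zero = format_cell(0, unit=1_000_000)
rounds_to_zero = format_cell(400_000, unit=1_000_000)
zero_percent = format_cell(0, decimals=1)
tiny_percent = format_cell(0.04, decimals=1)
```

Keeping the zero test on the unrounded value is the design point here: once a value has been rounded, the two cases can no longer be told apart.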
The survey collects data on the amount of R&D funded by companies but performed by outside entities, including universities, colleges, and other nonprofit organizations. Resulting statistics are in table 12. More comprehensive data on R&D performed at universities and colleges are collected in NSF's annual Survey of Research and Development Expenditures at Universities and Colleges. More information about this survey is available at http://www.nsf.gov/statistics/rdexpenditures/.
In the Survey of Industrial Research and Development and in the publications presenting statistics resulting from the survey, the terms firm, company, and enterprise are used interchangeably. Industry refers to the 2-, 3-, or 4-digit North American Industry Classification System (NAICS) codes or group of NAICS codes used to publish statistics resulting from the survey.