Research and Development in Industry: 2006–07
This is the final report from the Survey of Industrial Research and Development, which was conducted annually and produced statistics for 1953–2007. Its successor is the Business R&D and Innovation Survey. During the production of this report, the America COMPETES Reauthorization Act of 2010 was signed into law. Section 505 of the bill renames the Division of Science Resources Statistics as the National Center for Science and Engineering Statistics (NCSES). The Center retains its reporting line to the Directorate for Social, Behavioral and Economic Sciences within the National Science Foundation. The new name signals the central role of NCSES in the collection, interpretation, analysis, and dissemination of objective data on the science and engineering enterprise.
This report contains two sets of statistics produced from the survey: statistics on R&D funding for calendar year 2006 and on R&D personnel in January 2007, and statistics on R&D funding for calendar year 2007 and on R&D personnel in January 2008. Among the tables are several that include statistics on trends in industrial R&D since 1953, statistics on employment by R&D-performing firms since 1997, and a table classified by state that contains statistics for selected years since 1997. This report also contains (in the technical notes in appendix A) information about the industry coding classification system, company size classifications, survey methodology, comparability of the statistics over time and with other statistical series, survey definitions, history of the survey, and other information designed to convey to the data user what the survey statistics represent and, in some cases more importantly, what they do not represent. Survey questionnaires, instructions, and other documents are reproduced in appendix B.
This report provides national estimates of the expenditures on R&D performed within the United States by industrial firms, whether U.S. or foreign owned. Among the statistics are estimates of total R&D, the portion of the total financed by the federal government, and the portion financed by the companies themselves or by other nonfederal sources, such as state and local governments or other industrial firms under contract or subcontract. Total R&D is also separated into the types of costs: wages and fringe benefits of R&D staff, materials and supplies, depreciation, and other costs. Other statistics include R&D financed by domestic firms but performed outside the 50 U.S. states and DC, R&D performed by organizations outside the firm, R&D performed in collaboration with other organizations, and the funds spent to perform energy-related R&D. Also, this report provides information on R&D-performing firms, including domestic net sales, number of employees, number of R&D-performing scientists and engineers, geographic location where the R&D was performed, and R&D funds spent per R&D-performing scientist and engineer.
The National Science Foundation Act of 1950, as amended, authorizes and directs the National Science Foundation (NSF) "to provide a central clearinghouse for the collection, interpretation, and analysis of data on scientific and engineering resources and to provide a source of information for policy formulation by other agencies of the Federal government." The Survey of Industrial Research and Development has been the vehicle with which NSF has carried out the industrial portion of this mandate, and NSF's Division of Science Resources Statistics—now the National Center for Science and Engineering Statistics—has sponsored and managed a survey of industrial R&D since 1953. The 1953–56 surveys were conducted by the Bureau of Labor Statistics (BLS) in the U.S. Department of Labor (NSF 1956, 1960). Since 1957, the Bureau of the Census in the U.S. Department of Commerce has conducted the survey. Data obtained in the earlier BLS surveys are not directly comparable with Census figures because of methodological and other differences. Census conducts the survey under Title 13 of the United States Code, which prohibits publication or release of data or statistics that may reveal information about individual companies. In some tables in this report, the symbol D is used to indicate that estimates are withheld to avoid possible disclosure of information about operations of individual companies.
The Survey of Industrial Research and Development was an annual sample survey that was intended to include or represent all for-profit R&D-performing companies, either publicly or privately held. Respondents received detailed definitions to help them determine which expenses to include or exclude from the R&D data they provided. Nevertheless, the statistics presented in this report are subject to response and concept errors caused by differences in the way respondents interpret the definitions of R&D activities and by variations in company accounting procedures. The survey's primary focus was on U.S. industry as a performer of, rather than as a source of funds for, R&D. Thus, data collected on federal support of R&D activities performed by industry appear in several tables, whereas only limited statistics on industrial funding of R&D undertaken at universities and colleges and other nonprofit organizations were collected.
As a result of collecting and publishing performer-reported statistics, the federally funded R&D performance totals presented in this report differ from the totals reported by the federal agencies that provide the funds and the statistics published in NSF's Federal Funds for Research and Development report series (http://www.nsf.gov/statistics/fedfunds/). One reason for these differences is that performers of R&D often expend federal funds in a year other than the one in which the federal government provides authorization, obligations, or outlays (see "Comparisons to Other Statistical Series" in appendix A for definitions of these terms). During the past decades, the differences have widened between the federal R&D funding reported by performers and that reported by funding agencies. These differences are documented and analyzed in the latest editions of the National Science Board's Science and Engineering Indicators (NSB 2010) (http://www.nsf.gov/statistics/seind10/) and NSF's National Patterns of R&D Resources (NSF 2008d) (http://www.nsf.gov/statistics/natlpatterns/) report series.
The content of the Survey of Industrial Research and Development was expanded and refined over the years in response to an increasing need by policymakers for more detailed information on the nation's R&D effort. For example, questions on energy R&D were added in the early 1970s, following that decade's oil shortage crisis. More recently, questions that probe companies' collaborative R&D activities and funding of international performance of R&D were added to keep up with the fast-changing environment of the conduct and organization of industrial R&D. On the other hand, collection of certain data items was eliminated in an attempt to alleviate some of the burden on respondents. For large firms known to perform R&D, a detailed survey questionnaire (Form RD-1) was used to collect data. To limit the reporting burden on small R&D performers and on firms included in the sample for the first time, an abbreviated survey questionnaire (Form RD-1A), which collected only the most crucial data, was used.
Changes made to the survey throughout its history, including some of the most recent, are detailed in appendix A (see "Comparability of Statistics"). Specific changes are detailed in each of the annual reports resulting from the survey (http://www.nsf.gov/statistics/industry/).
Industry statistics in this report were developed from data collected from individual companies. Because the survey was company based rather than establishment based, all data collected for the various components of each company (plants, divisions, subdivisions, etc.) were tabulated in the company's major industrial classification, which was based on payroll. (See "Frame Creation and Industry Classification" in appendix A for more information about industry classification.) The resulting industry estimates were calculated by summing the data for companies classified within each major industry classification. National totals were then estimated by summing the industry estimates. The North American Industry Classification System (NAICS) was used to determine a company's major industrial classification, and the resulting statistics are published by NAICS code. For years prior to 1999, the Standard Industrial Classification (SIC) system was used. The development and ongoing refinement of NAICS has been a joint effort of statistical agencies in Canada, Mexico, and the United States. The system replaced the Standard Industrial Classification (1980) of Canada, the Mexican Classification of Activities and Products (1994), and SIC (1987) of the United States. (For a detailed comparison of NAICS to SIC of the United States, visit http://www.census.gov/eos/www/naics/.)
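The company-based tabulation described above can be illustrated with a short sketch. It is not the Census Bureau's processing code: the company records, payroll figures, and field names are hypothetical, and sample weights are left out to keep the example small.

```python
from collections import defaultdict

# Hypothetical company records: payroll by NAICS code (summed over all of the
# company's establishments) and company-wide R&D, in millions of dollars.
companies = [
    {"name": "A", "payroll_by_naics": {"3254": 80.0, "42": 20.0}, "rd": 50.0},
    {"name": "B", "payroll_by_naics": {"3341": 30.0, "42": 70.0}, "rd": 10.0},
    {"name": "C", "payroll_by_naics": {"3254": 15.0}, "rd": 5.0},
]


def major_industry(company):
    """Return the NAICS code that accounts for the largest share of payroll."""
    payroll = company["payroll_by_naics"]
    return max(payroll, key=payroll.get)


# Company-based tabulation: all of a company's R&D is assigned to its single
# major industry, regardless of which component performed it.
industry_totals = defaultdict(float)
for company in companies:
    industry_totals[major_industry(company)] += company["rd"]

# The national total is the sum of the industry estimates.
national_total = sum(industry_totals.values())
```

In this toy data, all of company B's R&D is tabulated under NAICS 42 because most of its payroll is there, even though part of the company is classified in a manufacturing industry.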
NAICS was designed to provide a production-oriented system under which economic units with similar production processes are classified in the same industry. NAICS was developed with special attention to classifications for new and emerging industries, service industries, and industries that produce advanced technologies. NAICS not only facilitates comparability of information about the economies of the three North American countries but potentially increases comparability with the two-digit level of the United Nations International Standard Industrial Classification (ISIC) system.
Availability of Survey Results
Detailed historical statistics for 1953–98 can be obtained from NSF's Industrial Research and Development Information System (IRIS) at http://www.nsf.gov/statistics/iris/, an online interface to the Survey of Industrial Research and Development Historical Database (SIRDHD) (NSF 2001b). The SIRDHD is a collection of more than 2,500 statistical tables containing all of the statistics produced and published from the 1953–98 cycles of the annual Survey of Industrial Research and Development. Statistics for 1991–2007 are available in the separate reports at http://www.nsf.gov/statistics/industry/.
Company Size
Companies were categorized by total number of domestic employees. The survey excluded companies with fewer than five employees. The following are the size classes used in this report:
Current and Constant Dollars
Statistics in all tables are reported in current dollars. Constant dollars also are presented in tables 32, 55, 56, and 57. Gross domestic product (GDP) implicit price deflators were used to convert current to constant 2000 dollars.
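The conversion from current to constant dollars can be sketched as follows. The function name and all numeric values are hypothetical and chosen for easy arithmetic; they are not actual GDP deflators or survey statistics. The sketch assumes a deflator indexed so that the base year (here 2000) equals 100.

```python
def to_constant_dollars(current, deflator_year, deflator_base=100.0):
    """Convert a current-dollar amount to base-year (constant) dollars.

    deflator_year: price index for the year of the current-dollar amount.
    deflator_base: price index for the base year (100.0 when the index is
    expressed with the base year set to 100).
    """
    return current * deflator_base / deflator_year


# Hypothetical values for illustration only.
current_millions = 250_000.0
deflator = 125.0  # prices 25% above the base year
constant_millions = to_constant_dollars(current_millions, deflator)
```

With these made-up inputs, 250,000 current dollars correspond to 200,000 constant dollars, because the price level is assumed to be 25% above the base year.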
Disclosure and Suppression of Statistics
Title 13 of the United States Code and a pledge of confidentiality to respondents prohibit publication or release of data or statistics that may reveal information about individual companies. Therefore, the data in some table cells have been suppressed and replaced with "D." This occurs when a small number of companies account for a large percentage of the estimate in a particular data cell. Although publication of certain cells may be withheld, the estimates in the cells are always included in totals. The tables most often affected by cell suppression are those that contain data on federal support for industrial R&D performance.
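A minimal sketch of how suppressed cells relate to published totals is given below. The industries, values, and the `suppressed` flags are hypothetical; the sketch does not model how the actual disclosure routines decide which cells are sensitive.

```python
# Hypothetical cell estimates (millions of dollars) and disclosure flags.
cells = {
    "Industry X": (120.0, False),
    "Industry Y": (45.0, True),   # flagged as sensitive by disclosure review
    "Industry Z": (35.0, False),
}


def publish(cells):
    """Return printable cell values plus a total that includes suppressed cells."""
    shown = {
        name: ("D" if suppressed else f"{value:,.0f}")
        for name, (value, suppressed) in cells.items()
    }
    # The total is accumulated from every cell, suppressed or not.
    total = sum(value for value, _ in cells.values())
    shown["Total"] = f"{total:,.0f}"
    return shown


table = publish(cells)
```

In this toy table a single suppressed value could be recovered by subtraction from the total; the sketch makes no attempt to show how the real disclosure routines guard against that.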
The statistics in this report cover only those operations located in the 50 U.S. states and the District of Columbia (DC). Statistics on company-sponsored R&D performed outside the 50 U.S. states and DC are included in tables 14, 15, 44, and 45 but are excluded from all other tables.
Beginning with 2001, the methodology to produce statistics by state was modified from previous years to address the recurring problem of large year-to-year variation in many state estimates. This variability was caused by many factors, including the potential inefficiency of the sample at state levels, the rarity of R&D expenditures, and the large weights often associated with companies that report R&D in the survey for the first time. Under the new methodology, a portion of the amount of R&D reported by some companies not selected for the sample with certainty is allocated (or raked) among all the states in which there was industrial activity. The new methodology was also applied retroactively to statistics for 1998–2000. In tables 25 and 59–61 statistics for 1998–2007 are flagged with an "e" if more than 50% of the estimate was imputed because of raking. Note that there was no change to the methodology for estimating the number of R&D performers in each state. This estimate continued to be calculated by summing the weights of the companies that actually reported R&D activity in a given state. For a more detailed explanation of the new methodology and the definition of a "certainty" company, see the technical notes.
The Survey of Industrial Research and Development was conducted annually from 1953 to 2007. Statistics for 1953–98 are reported by Standard Industrial Classification (SIC) code, and statistics for 1999–2007 are reported by North American Industry Classification System (NAICS) codes (see below). All of the statistics produced from the survey for 1953–98 are available in the Industrial Research and Development Information System at http://www.nsf.gov/statistics/iris/. An electronic database for post-1998 statistics has not been developed yet; however, annual reports for 1991–2007 are posted at http://www.nsf.gov/statistics/industry/ (NSF 1994b, 1995b, 1996c, 1997b, 1998b, 1999b, 2000b, 2002c, 2003a, 2005b, 2006c, 2007a, 2009a, 2010). Short reports that announce the availability of survey results, contain analytical information, and highlight the expenditures for industrial R&D funded from companies' own resources and by the federal government also are available at http://www.nsf.gov/statistics/industry/ (NSF 1995a, 1996a, 1997a, 1998a, 1999a, 2000a, 2001a, 2002a, 2003b, 2004, 2005a, 2006a, 2007c, 2008a, 2009b).
Prior to the 1999 report, most historical tables classified by industry contained the current survey's statistics plus statistics for 10 previous years. Because of the conversion to NAICS and a change in the way industry codes are assigned during statistical processing (see below), tables that traditionally provided data by industry for one or more historical years now only show data for the study year.
Industry Classification
During initial statistical processing, one North American Industry Classification System (NAICS) code was electronically assigned to each company. Multi-establishment companies were assigned single codes based on the most dominant aggregated activity for that firm in terms of total payroll. The 2002 version of NAICS was used for the 2006 and 2007 surveys and statistics for the following industries and industry groupings are published in this report:
Beginning with the 2004 survey and continuing for 2005, some companies' electronically assigned industry codes were manually examined and changed. These revised codes were used in the 2006 and 2007 surveys. Beginning in the late 1990s, increasingly large amounts of R&D were attributed to the wholesale trade industries, resulting from the payroll-based methodology used to assign industry classifications and the change from the Standard Industrial Classification (SIC) system to the North American Industry Classification System (NAICS) in 1999. Such classification artifacts were of particular concern for companies traditionally thought of as pharmaceutical or computer-manufacturing firms. As these firms increasingly marketed their own products and more of their payroll involved employees in selling and distribution activities, the potential for the companies to be classified among the wholesale trade industries increased. To maintain the relevance and usefulness of the industrial R&D statistics, NSF evaluated ways to ameliorate the negative effects of the industry classification methodology and change in classification systems. In addition to firms originally assigned NAICS codes among the wholesale trade (NAICS 42) industries, firms assigned to the scientific R&D services industry (NAICS 5417) and management of companies and enterprises (NAICS 55) industries using the payroll-based methodology were manually reviewed by NSF and Census. These firms were reclassified based on primary R&D activity, which in most cases corresponded to their primary products or service activities. The result was that most of the R&D previously attributed to NAICS 42 and 55 industries was redistributed. Statistics resulting from the old and new industry classification methods were published in tables A-9 and A-10 in Research and Development in Industry: 2004 (NSF 2009a). For detailed information, also see NSF 2007d.
Large Year-to-Year Changes
Large year-to-year changes occurred because of the way industry classifications were assigned during statistical processing. A company's industry classification was a function of its primary activity based on payroll, which was not necessarily the primary source of its R&D activity for those companies not manually reviewed as described in "Industry Classification," above. For the companies not manually reviewed, if the largest portion of a company's payroll shifted to an activity other than an R&D-related activity, the classification of its R&D similarly shifted to the new activity. Further, the design of the statistical sample sometimes contributed to large year-to-year changes in industry estimates. Because relatively few companies performed R&D and there is no national register of industrial R&D performers, a large statistical "net" had to be cast to capture new R&D performers. When these companies were sampled for the first time, they were often given weights much higher than they would have been given if their size and the amount of R&D they performed had been known at the time of sampling. After the size of the company and the amount of R&D performed were discovered via the first survey, the weight assigned for subsequent surveys was adjusted. This capture and weighting adjustment process could produce large year-to-year changes in the statistical series twice: first when the company was captured and its data were overstated by the application of a large weight, and again when the weight was reduced. This process affected lower level statistics (i.e., detailed industry and company size categories) the most because at the aggregate levels (i.e., all industries, manufacturing, nonmanufacturing) large year-to-year increases in some industries or in some company size categories were offset by large decreases in others.
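The effect of the capture and weighting adjustment can be seen in a small worked example. All figures are hypothetical and were chosen only to make the arithmetic easy to follow; actual survey weights are described in appendix A.

```python
# Hypothetical figures: one newly captured company reporting $2 million of
# R&D in each of two consecutive survey years.
reported_rd = 2.0          # millions of dollars, same in both years

weight_first_year = 50.0   # large weight: size and R&D unknown at sampling
weight_second_year = 5.0   # weight adjusted after the first survey

# Weighted contribution of this one company to its industry estimate.
contribution_year1 = weight_first_year * reported_rd
contribution_year2 = weight_second_year * reported_rd

# Year-to-year change caused purely by the weight adjustment.
change = contribution_year2 - contribution_year1
```

Under these assumed weights, the company adds $100 million to the industry estimate in the year it is captured and $10 million the following year, although its reported R&D did not change.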
Nonresponse and Imputation
For various reasons, some firms chose not to return the survey questionnaire (unit nonresponse) or returned it with one or more blank items (item nonresponse). (See "Imputation for Item Nonresponse" in appendix A for more information on the reasons for unit and item nonresponse.) Missing data for major data items were estimated using mathematical algorithms developed from industry comparisons, data from previous cycles of the survey, and other information. As a result, the statistics in some table cells may be accompanied by the notation "S," which indicates that the imputation rate (the percentage of the statistic not reported by respondents and consequently estimated) exceeds 50% for that item. In such cases, the estimate may be statistically unreliable. (See tables A-5 and A-12 for imputation rates for specific items.)
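The "S" notation can be sketched as a simple threshold test on the imputation rate. The function names and amounts below are hypothetical, and the sketch assumes the reported and imputed portions of an estimate are available as separate totals.

```python
def imputation_rate(reported, imputed):
    """Percentage of an estimate that was imputed rather than reported."""
    total = reported + imputed
    return 100.0 * imputed / total if total else 0.0


def format_cell(reported, imputed):
    """Append the 'S' notation when the imputation rate exceeds 50%."""
    value = f"{reported + imputed:,.0f}"
    return f"{value} S" if imputation_rate(reported, imputed) > 50.0 else value


# Hypothetical amounts in millions of dollars.
cell_a = format_cell(reported=80.0, imputed=20.0)   # imputation rate 20%
cell_b = format_cell(reported=30.0, imputed=70.0)   # imputation rate 70%
```

Both cells show the same estimate, but only the second carries the notation because more than half of it was imputed.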
Percentages were calculated using dollars rounded to thousands and may differ slightly from percentages calculated using the figures shown in the tables, which are rounded to millions.
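A short worked example shows why the two calculations can differ. The amounts are hypothetical and are not taken from any table in this report.

```python
# Hypothetical amounts in thousands of dollars.
federal_thousands = 24_449
total_thousands = 262_500

# Percentage computed from the amounts in thousands.
pct_from_thousands = 100 * federal_thousands / total_thousands

# The same amounts rounded to millions (halves round up), as shown in tables.
federal_millions = (federal_thousands + 500) // 1000
total_millions = (total_thousands + 500) // 1000

# Percentage a reader would compute from the published millions.
pct_from_millions = 100 * federal_millions / total_millions
```

With these inputs the percentage is about 9.3% when computed from thousands and about 9.1% when computed from the rounded figures of 24 and 263 million.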
The particular sample selected was one of a large number of samples of the same type and size that by chance might have been selected. Statistics resulting from the different samples would differ somewhat from each other. These differences are represented by estimates of sampling error or variance. The smaller the sampling error, the less variable the statistic. The accuracy of the estimate, that is, how close it is to the true value, is also a function of nonsampling error. For these reasons, the statistics should not be treated as exact point estimates: they are based on statistical samples subject to variability within and between years, and they are affected by the capture and weighting issues discussed above under "Large Year-to-Year Changes."
Because there is no master file of R&D performers, the levels at which the statistics are published are determined using information from various sources. First, the frame used to construct the statistical sample (see "Frame Creation and Industry Classification" in appendix A) was queried to find out which industries perform R&D. Indicators of R&D performance can be obtained from several Census surveys (e.g., the Company Organization Survey (COS), the Annual Capital Expenditures Survey (ACES), and the Economic Census). Once flagged, those companies were included in the Survey of Industrial Research and Development. Tabulations of levels and types of R&D at the finest industry level were prepared and subjected to disclosure routines to ensure that all sensitive data were suppressed, and the industries that remained without inordinately high numbers of suppressions became the industries for which statistics were published.
The basic reporting unit was the company, firm, or enterprise that included all establishments under common ownership or control. All R&D expenditures and all information about scientists and engineers of each company were classified into a single NAICS code and size category (see "Industry Classification" and "Company Size," above).
Because of rounding, detail items may not add to totals. Most money amounts are expressed in millions of dollars and are rounded down if less than $500,000 or up if $500,000 or more. Frequency estimates (e.g., number of companies) are accumulated from decimal weights assigned to company records and are rounded down if less than 0.5 and rounded up if 0.5 or greater (see "Weighting, Maximum Weights, and Probabilities of Selection" in appendix A for information on how company records are weighted). Most employment counts (e.g., number of scientists and engineers) are expressed in thousands and are rounded down if less than 500 or up if 500 or greater.
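The rounding rules above amount to rounding halves up. A minimal sketch follows, with hypothetical helper names. Python's built-in round() is avoided because it rounds halves to the nearest even number (for example, round(2.5) gives 2), which does not match the rule described here.

```python
import math


def round_to_unit(amount, unit):
    """Round a non-negative integer amount to whole units, halves rounding up.

    For example, unit=1_000_000 converts dollars to millions of dollars and
    unit=1_000 converts an employment count to thousands.
    """
    return (amount + unit // 2) // unit


def round_half_up(value):
    """Round a non-negative decimal value (e.g., a sum of weights), halves up."""
    return math.floor(value + 0.5)


# Hypothetical inputs.
millions_low = round_to_unit(1_499_999, 1_000_000)   # just under the half
millions_high = round_to_unit(1_500_000, 1_000_000)  # exactly at the half
thousands = round_to_unit(12_500, 1_000)             # employment count
companies = round_half_up(41.5)                      # accumulated weights
```

Because each published figure is rounded separately, the rounded detail items need not add up to the rounded total.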
When a numerical value is accumulated from the statistical file to estimate a money amount, number of companies, number of employees, or number of R&D scientists and engineers, and the accumulated sum is zero, the cell is filled with "0" or "0.0." When a value is rounded to zero, the cell is filled with "*." When a percentage is calculated from the statistical file and the percentage equals zero, the cell is filled with 0.0; when it rounds to zero, it is filled with "*."
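The distinction between a true zero and a value that rounds to zero can be sketched as follows. The function name and the `unit` parameter are hypothetical, and the sketch emits only "0" for a true zero, although the report uses "0" or "0.0" depending on the table.

```python
import math


def format_count_cell(accumulated, unit=1.0):
    """Format a table cell from an accumulated non-negative value.

    Returns "0" when the accumulated sum is exactly zero, "*" when a nonzero
    sum rounds to zero in the published unit, and the rounded value otherwise.
    """
    if accumulated == 0:
        return "0"
    rounded = math.floor(accumulated / unit + 0.5)  # halves round up
    return "*" if rounded == 0 else f"{rounded:,}"


# Hypothetical dollar amounts published in millions.
true_zero = format_count_cell(0, unit=1_000_000)
rounds_to_zero = format_count_cell(400_000, unit=1_000_000)
published = format_count_cell(2_600_000, unit=1_000_000)
```

A reader can therefore tell a cell with no reported activity ("0") from one with a small amount below the rounding threshold ("*").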
 Information about the new Business R&D and Innovation Survey is available from the National Center for Science and Engineering Statistics website at http://www.nsf.gov/statistics/srvyindustry/about/brdis/.
 The survey collected data on the amount of R&D funded by companies but performed by outside entities, including universities, colleges, and other nonprofit organizations. Resulting statistics are in tables 12 and 42. More comprehensive data on R&D performed at universities and colleges are collected in NSF's annual Survey of Research and Development Expenditures at Universities and Colleges. More information about that survey is available from NSF's National Center for Science and Engineering Statistics website at http://www.nsf.gov/statistics/rdexpenditures/.
 In the Survey of Industrial Research and Development and in the publications presenting statistics resulting from the survey, the terms firm, company, and enterprise are used interchangeably. Industry refers to the 2-, 3-, or 4-digit North American Industry Classification System (NAICS) codes or group of NAICS codes used to publish statistics resulting from the survey.