Advancing Measures of Innovation: Knowledge Flows, Business Metrics, and Measurement Strategies
Two key questions were raised at the outset of the workshop.
Which indicators are most urgent?—Participants mentioned a number of urgently needed indicators but reached no consensus on priorities among them.
Which indicators are most immediately feasible?—Workshop participants emphasized that measuring innovation is difficult but important. The discussion did not produce an explicit assessment of which indicators are most immediately feasible. However, there was a sense that analyzing and linking existing indicators is more immediately feasible than developing indicators that depend on new or revised surveys or datasets.
Several presenters described new or little-known innovation-related data and research. These are grouped into five categories, listed below.
1. Scientific and Technical (S&T) Employment
S&T employment statistics are important because they measure the human resource input to R&D and innovation. The importance of human resources to innovation is increasingly recognized. The Bureau of Labor Statistics (BLS) has employment data broken out by occupation and by industry. As one workshop participant discussed, these data have been used less than they could be in the study of innovation.
As noted by another presenter, much of the knowledge produced by R&D is "wrapped up" in individuals and moves with them. This presenter had worked with the NSF Survey of Earned Doctorates (SED) to code the industrial placement information on the SED, which had never been done before. She noted that human resource data may complement R&D expenditure data in some ways, showing either more or less innovation activity than might be shown by R&D expenditures under different circumstances. She stated that human resource data could be even more useful if they included the following enhancements:
2. International Economic Data
It may not be widely recognized that the international data collected by the Bureau of Economic Analysis (BEA) include innovation-related data. As innovation activity becomes increasingly diffused around the globe, international data become more important to understanding the U.S. position. One speaker highlighted several series of innovation-related statistics collected by BEA, as follows:
Another speaker commented that it is interesting that data exist for international commercial transactions among MNCs but not for domestic transactions. Some private companies reportedly have this kind of data, but the data are of uncertain quality.
3. Federal Trade Commission Database on R&D
The Federal Trade Commission at one time published a line-of-business database that had R&D data broken down by Standard Industrial Classification (SIC) code. These data exist for 1974–77 and have been used in research by some of the workshop participants.
4. Industrial Research Institute R&D Survey
The Industrial Research Institute surveyed R&D at member firms from 1991 to 1999. Data were collected at both the firm level and the line-of-business level. Data were included for some output variables, such as patents, new sales ratio (revenues realized this year from new products introduced in the last 5 years divided by total revenues realized this year), and cost savings realized (cost savings realized this year from process improvements made in the last 5 years divided by gross profits realized this year). There are 27 directly measured metrics. In addition, 16 computed metrics can be derived and 10 more metrics can be obtained through clustering. The results of this survey were reported annually in Research-Technology Management between 1993 and 1999. The data file is maintained and available through the Center for Innovation Management Studies at North Carolina State University.
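The two output ratios described above are simple quotients; the sketch below restates them as functions, using hypothetical argument names (the IRI survey's actual variable names differ).

```python
# Sketch of the two IRI output metrics described above.
# Argument names are illustrative, not the survey's own field names.

def new_sales_ratio(revenue_from_recent_products: float,
                    total_revenue: float) -> float:
    """Revenue this year from products introduced in the last 5 years,
    divided by total revenue this year."""
    return revenue_from_recent_products / total_revenue

def cost_savings_ratio(savings_from_recent_processes: float,
                       gross_profits: float) -> float:
    """Cost savings this year from process improvements made in the
    last 5 years, divided by gross profits this year."""
    return savings_from_recent_processes / gross_profits

# Example: a firm with $120M total revenue, $30M of it from products
# introduced in the last 5 years, has a new sales ratio of 0.25.
print(new_sales_ratio(30.0, 120.0))  # 0.25
```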
5. University-Industry Knowledge Flows
One speaker provided a list of data sets used in research on university-industry knowledge flows, noting that the burgeoning literature on this topic is highly interdisciplinary, uses proprietary databases and a wide variety of performance indicators, makes use of both quantitative and qualitative methods, and performs analyses at numerous levels of aggregation. Data sources include the following:
One speaker further noted that many of the proprietary data sets cover embryonic industries, such as the nanotechnology industry, whereas official statistics tend to be collected after a new phenomenon is well established. He indicated that the Yale and Carnegie Mellon surveys of R&D managers are some of the more creative surveys and suggested that this is because researchers were involved in their design. Those surveys may provide a model for other data-collection efforts.
The workshop participants identified a number of data needs focused on different aspects of and results from the innovation process. These are grouped in 11 categories, listed below.
1. Innovative Activities
These activities include R&D, R&D support, and the steps that need to be taken between R&D and the introduction of a new product or concept into the market or into large-scale use. These steps may include pilot plant and start-up manufacturing.
2. Key Drivers, Inputs, and Institutional Mechanisms
Data needs associated with this topic include the following:
3. Outputs and Outcomes of Innovation
Outputs are the immediate results of innovation, such as new and improved products, processes, services, business models, and business practices. Some outputs, such as publications and patents, are intermediate outputs and may be inputs at later stages of the process. Outcomes refer to the impacts (positive and sometimes negative) of innovation. These include the following:
These outcomes underlie much of the rationale for government support of innovation and technology. Participants also discussed the difficulty of measuring outcomes and attributing them to investments in innovation, for example because of long time lags.
4. Effects of Government Policies on Innovation
Data needs associated with this topic include the following:
5. Relationships, Knowledge Flows, and Networks
Data needs associated with this topic include the following:
6. Accounting for Innovation and Its Relationship to Finances
Data needs associated with this topic include the following:
7. Adoption and Diffusion of Innovations
The adoption and diffusion of innovation need much greater prominence. Diffusion is especially important because it is how most returns to innovation are realized, yet it has not been well studied, particularly at the micro level. Attention should focus not just on technology diffusion but also on the diffusion of new practices, such as broadband Internet or evidence-based medicine. Broad surveys may be too ambitious, but specialized surveys or case studies may be appropriate.
8. Mobility of Individual Scientists and Graduate Students
Midcareer mobility will become a bigger issue for the United States, and it is important to pay attention to the mobility of individual scientists and graduate students. Human resource data illuminate patterns of innovation not emphasized by R&D data. R&D data are generally characterized by the following:
9. Intangibles and Disembodied Knowledge
Discussions made it clear that knowledge should be thought of as both embodied (e.g., new goods) and disembodied (e.g., scientific publications), and both need to be tracked. Intangibles could be tracked in such areas as services and new business practices. It was suggested that NSF identify a few industries or sectors and fund specific studies to develop metrics on this broader notion of innovation.
10. University-Industry Knowledge Flows
Many observers believe that the relationship between universities and industry is an important source of the U.S. advantage in innovation. Data needs on this relationship include the following:
11. Data Needed to Support the R&D Satellite Account
NSF is funding the development of a BEA/NSF R&D Satellite Account consistent with the methodology of the U.S. National Income and Product Accounts. This project has identified key data needs, including capital expenditures and compensation cost details for scientists and engineers and support personnel.
Much of the discussion at the workshop was focused on specific methods or approaches to developing data on innovation. The sense of the workshop was that the diverse strategies are not mutually exclusive and can be productively pursued in parallel or in combination. Furthermore, multiple data sources may be mined and integrated to yield additional indicators.
Several presenters stressed the need to support multiple measures of the same phenomenon whose errors are not correlated. The characteristics of a good proxy measure include a high signal-to-noise ratio, unbiased errors, and a relationship between the proxy and the phenomenon that is linear (or understood) and stable over time and across different settings.
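The value of multiple measures with uncorrelated errors can be illustrated with a small simulation: averaging two noisy proxies of the same phenomenon roughly halves the mean squared error relative to either proxy alone, precisely because their errors do not move together. The numbers below are illustrative, not from any survey.

```python
import random

random.seed(42)

TRUE_VALUE = 10.0  # the underlying phenomenon being measured
N = 10_000

# Two proxies with independent (uncorrelated) measurement errors.
proxy_a = [TRUE_VALUE + random.gauss(0, 1.0) for _ in range(N)]
proxy_b = [TRUE_VALUE + random.gauss(0, 1.0) for _ in range(N)]

# Combining the two proxies averages away much of the noise.
combined = [(a + b) / 2 for a, b in zip(proxy_a, proxy_b)]

def mse(measurements):
    """Mean squared error against the true value."""
    return sum((x - TRUE_VALUE) ** 2 for x in measurements) / len(measurements)

print(mse(proxy_a))   # ~1.0
print(mse(combined))  # ~0.5: halved, because the errors are uncorrelated
```

If the two proxies' errors were perfectly correlated, averaging them would gain nothing, which is why the presenters stressed uncorrelated errors specifically.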
The approaches receiving the most attention were survey-based measures, data linking, nonsurvey-based measures (e.g., administrative data mining), and case studies and qualitative data.
As one speaker observed, the sample survey has been the method most commonly used to collect data for innovation-related indicators. The advantages of surveys are that they can be designed for consistent interpretation of questions, and thus permit comparisons among respondents. On the other hand, structured surveys reduce the flexibility of responses, potentially omitting important details and nuances. Among some survey populations, such as small firms, low response rates can limit the representativeness of response data.
Another speaker noted that the European Community Innovation Survey (CIS) has driven the development of international guidelines for collecting and interpreting innovation data, as defined in the OECD's Oslo Manual. The CIS has been conducted periodically since 1993, and similar innovation surveys are conducted by other countries, such as Australia, Canada, Japan, and the Russian Federation. The CIS is mandatory in some countries and voluntary in others, with the result that response rates differ markedly across countries. Little is known about who has used the data from the CIS surveys, what kinds of research and analysis they have done, or what impact they have had on policy.
Several participants observed that the United States does not have a comprehensive innovation survey similar to CIS. It was pointed out that the information from such a survey is fundamental to addressing questions of the health and vitality of the U.S. R&D system. Innovation is clearly a vital part of the picture. Without credible innovation indicators, it is difficult to demonstrate how investments in R&D lead to social and private benefits. There was strong support among some workshop participants for the recommendation in the 2005 CNSTAT report that NSF should resolve methodological issues related to collecting innovation-related data and initiate a regular and comprehensive program of measurement and research related to innovation.
However, it was also recognized that applying the CIS straightforwardly to the United States may not be appropriate, for such reasons as differences in statistical systems (e.g., centralized vs. decentralized structures) and statistical policy guidance (e.g., issues of respondent burden). NSF/SRS currently conducts a number of surveys that produce data related to innovation, including surveys of R&D expenditures and of human resources in science and technology. Although it does not conduct a separate, nationally representative survey of innovation, SRS has conducted some limited studies and surveys. As part of its industrial R&D recordkeeping study, SRS is asking about the ability of people in industry to answer questions on innovation beyond R&D inputs.
One strategy discussed was including innovation questions in the existing Survey of Industrial R&D (SIRD). An advantage of this strategy is that it would involve incremental modifications to a well-established survey. There was concern, however, that the people who respond to the R&D questions may not be able to respond to non-R&D innovation-related questions. There was some agreement that if the SIRD is broadened to include innovation, it must move beyond one simple survey instrument. Another concern was that the SIRD is a company-level survey, whereas a number of economic surveys are conducted at the establishment level, which makes it difficult to link data.
Another strategy mentioned is to codevelop innovation-related questions in selected economic surveys, along with mining and integrating resulting data. Compared with a stand-alone innovation survey, this has the advantage of obtaining data automatically consistent with relevant supersets—consider, for example, total vs. innovation-related capital expenditures or revenues. Stand-alone innovation surveys may result in innovation data that are not methodologically consistent with related data.
NSF may also form public-private partnerships to proceed with smaller private experimental surveys in areas where consensus on which variables are important has yet to be established, while proceeding with a larger public survey that focuses on variables where consensus exists about their importance. For example, it was suggested that NSF might form partnerships with private-sector institutions that are collecting innovation-related data, such as the Association of University Technology Managers.
One of the key messages from the workshop was that there are a lot of fragmented data. One participant stated, "Patents, universities, people and human capital, internationalization—all are interlinked, but there are separate datasets and research communities for each."
Several presentations discussed the importance of identifying and linking existing data. In addition to needing new data with which to understand innovation, a number of participants recommended linking existing datasets, such as linking the National Bureau of Economic Research (NBER) U.S. patent data to the NSF/U.S. Bureau of the Census (Census) R&D survey. An ongoing NSF/Census/BEA project is linking the BEA data on U.S. direct investment abroad and foreign direct investment in the United States to the NSF/Census R&D data. One participant observed that if a survey is not linked to other surveys, one cannot follow through the R&D/innovation/diffusion/socio-economic benefits cycle. It was suggested that the Census Bureau, which conducts surveys for NSF and other sponsors, might do some arm-wrestling with sponsors and argue for more consistency among surveys in order to facilitate analysis and further data development.
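At its core, linking datasets such as those above is a join on a common identifier, with the hard work lying in constructing reliable identifiers across sources. The sketch below uses entirely hypothetical firm IDs and values; real projects such as linking NBER patent data to the NSF/Census R&D survey require careful name and ID matching.

```python
# Hypothetical illustration of linking two firm-level datasets on a
# shared identifier. All IDs and values are invented for illustration.

rd_survey = {        # firm_id -> R&D expenditure ($M)
    "F001": 12.5,
    "F002": 3.1,
    "F003": 48.0,
}
patent_counts = {    # firm_id -> patents granted
    "F001": 9,
    "F003": 120,
    "F004": 2,       # present in the patent data but not the R&D survey
}

# Inner join: keep only firms that appear in both datasets.
linked = {
    firm: (rd_survey[firm], patent_counts[firm])
    for firm in rd_survey.keys() & patent_counts.keys()
}
print(sorted(linked))  # ['F001', 'F003']
```

Note that the join silently drops firms that appear in only one dataset; in practice, analysts must decide whether such non-matches reflect true absence of activity or merely identifier mismatches.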
The joint NSF/Census/BEA linking-feasibility study, completed in 2005, developed methodology necessary to link BEA data on MNCs with NSF/Census R&D data for all U.S. businesses. The link will facilitate integrated data covering domestic and international dimensions of R&D not available separately from the component surveys. The agencies linked U.S. MNCs parent data for 1999 and U.S. affiliate data for 1997 to all U.S. business data from the SIRD. The project also produced new preliminary data on basic research, applied research, and development for U.S. affiliates of foreign MNCs. This linking project also allowed sample and methodological improvements. Based on these positive results, the Census Bureau, NSF/SRS, and BEA are currently planning to conduct linking activities with more recent data.
It was suggested that Census, BLS, and NSF consider bringing together the BLS occupational data and the SIRD. In addition, data from SIRD, BLS data by standard metropolitan statistical area, and outside surveys of innovation could be linked. Although this might be technically feasible, whether it is appropriate to do so is a policy question. It might be appropriate to go to the Center for Economic Studies, a research unit of the Census Bureau established to encourage and support the analytic needs of researchers. It was also noted that BEA is planning to link MNC data to BLS occupational data because of policy concerns about the effects of offshoring on skills in U.S. firms.
Some speakers called for linking human resource data to productivity measures and linking innovation data to accounting structures.
Nonsurvey-Based Measures: Administrative Databases
As one speaker commented, the sample survey method is facing new challenges because of declining response rates. He suggested that using administrative records may be the direction of the future. This approach relies on gathering information from collections of existing data that were developed for some other purpose.
The speaker further suggested that integrating survey data with administrative data may solve some of the challenges facing innovation indicators. He gave an example of a study to determine the number of uninsured children at the county level. That study linked survey and demographic data sets to get information that was not contained in either set alone.
Data integration tools and algorithms are being improved, and they have significant potential to find new value in existing data. However, there are still gaps in the theory of how to integrate data. Among the principles suggested to guide data integration are the following:
Case Studies and Qualitative Measures
The strengths and weaknesses of case studies and qualitative measures were mentioned by several speakers. The case study method is especially useful in establishing causal paths, such as those between innovation and its socioeconomic impacts or between innovation drivers and innovation, and is often used in studies of innovation within the firm. One speaker noted that, at the micro level, smaller, more detailed studies tend to give more interesting and more informative results on how things work, but that it takes different kinds of methods to answer different kinds of questions. Moreover, as another speaker commented, the results of a case study cannot be generalized beyond the case itself and can be misleading. Thus, multiple case studies of a subject are often necessary but may result in a hodge-podge of different, incomparable kinds of data unless they are carefully designed and coordinated.
Speakers also commented on the need to combine quantitative and qualitative data in the study of innovation. As mentioned previously, one speaker observed that the burgeoning literature on university-industry knowledge flows uses both quantitative and qualitative data and methods, including case studies and event studies. Another speaker described the use of quantitative and qualitative data to identify and map regional nanotechnology assets and to assess a region's strengths and weaknesses in nanotechnology. This speaker observed that the choice of qualitative or quantitative data depends on the questions one is trying to answer. If a person can find data to answer the questions, he or she should use them, recognizing that they may take some transformation and may not be perfect. If data are not available (for example, to learn what people's perception of nanotechnology is), other methods should be used. The speaker pointed out that non-quantitative knowledge, such as knowing where research is going on, who is doing it, and what its nature is, can be extremely helpful in leveraging research and creating research synergies.
As one speaker noted, "Trying to find a perfect innovation metric is like the search for the Holy Grail…what you should look for are multiple metrics with offsetting weaknesses."