Data Archiving Policy
The National Science Foundation is committed to the principle that the various forms of data collected with public funds belong in the public domain. Therefore, the Division of Social and Economic Sciences has formulated a policy to facilitate the process of making data that has been collected with NSF support available to other researchers.
The purpose of this policy is to advance science by encouraging data sharing among researchers. Data sharing strengthens our collective capacity to meet scientific standards of openness by providing opportunities for further analysis, replication, verification and refinement of research findings. These opportunities enhance the development of fields of research and support the potential for cross-directorate activity. In addition, the greater availability of research data will contribute to improved training for graduate and undergraduate students, and make possible significant economies of scale through the secondary analysis of extant data. Finally, researchers have a special obligation to scientific openness and accountability when the research is publicly funded.
SES supports a wide range of disciplines. Therefore, the nature of the data, the way they are collected, analyzed, and stored, and the pace at which this reasonably occurs vary widely. There are different storage facilities and different access requirements for, e.g., large-scale survey data, oral interviews with scientists and other subjects, and data generated by experimental research. Grantees from all fields will develop and submit specific plans to share materials collected with NSF support, except where this is inappropriate or impossible. These plans should cover how and where these materials will be stored at reasonable cost, and how access will be provided to other researchers, generally at their cost.
This policy explicitly recognizes that many complexities arise across the range of data collection supported by SES programs, and that unusual circumstances may require modifications or even full exemptions. For example, human subjects protection requires removing identifiers, which may be prohibitively expensive or render the data meaningless in research that relies heavily on extensive in-depth interviews. Intellectual property rights may be at risk in some forms of data collection. The policy is intended to be flexible enough to accommodate the variety of scientific enterprises that constitute SES programs. No comprehensive set of rules is possible, but the procedures indicated below are designed to provide guidance for broad categories of data collection.
Guidelines for Categories of Data
Quantitative Social and Economic Data Sets
For appropriate data sets, researchers should be prepared to place their data in fully cleaned and documented form in a data archive or library within one year after the expiration of an award. Before an award is made, investigators will be asked to specify in writing where they plan to deposit their data set(s). This may be the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan, but other public archives are also available. The investigator should consult with the program officer about the most appropriate archive for any particular data set.
Qualitative Information
The kinds of qualitative information collected in research projects supported by SES can range from microfilms and other copies of very old documents to oral interviews and videotapes about historical events in science or about contemporary technological controversies. They can also consist of handwritten records of open-ended interviews. Investigators should consider whether and how they can develop special arrangements to keep or store these materials so that others can use them. If it is appropriate for other researchers to have access to them, the investigators should specify a time at which they will be made generally available, in an appropriate form and at a reasonable cost.
Experimental Research
In experimental research, individuals, be they people, animals, or objects, are subjected to preplanned conditions and their responses tabulated in some fashion. Investigators should plan to make these tabulated data available to other investigators requesting them, at a minimum along the lines suggested by Geoffrey Loftus in his editorial in the January, 1993, issue of Memory and Cognition. In addition, complete information on how an experiment was conducted and any unusual stimulus materials should be made available, so that failures to replicate will not turn out to depend on one scientist's incomplete understanding of another's procedure. SES will work with the research community to identify and resolve problems with developing and establishing centralized archives.
Mathematical and Computer Models
Often in the course of conducting research, investigators develop mathematical and computer models, either as an innovative aid in the analysis of data or as a theoretical statement about the processes involved in generating some classes of data. Investigators should plan to make these models available to others wanting to apply them to other data sets or experimental situations. In some cases, the descriptions in published articles are sufficient; more often, it will be necessary for investigators to prepare fully documented and robust versions of these models, typically on disk, so that they can be provided to others. SES will work with the research community to identify and resolve problems with developing and establishing centralized archives for these models.