This document has been archived.


Interagency Program Announcement
NSF/DOE/USDA Joint Program

Arabidopsis thaliana Genome Sequencing Project

National Science Foundation
Department of Energy
U. S. Department of Agriculture

Deadline Date: April 15, 1998

Plant Genome Research Program

The purpose of this program announcement is to solicit proposals to continue the systematic sequencing of the genome of Arabidopsis thaliana. The ultimate goal of this project is to sequence the entire Arabidopsis genome within a reasonable time frame, and no later than the year 2000. This program represents a component of the new NSF Plant Genome Research Program. The overall objective of the Plant Genome Research Program is to support research on the structure, organization and function of plant genomes, and to accelerate the acquisition and utilization of new knowledge and innovative technologies that will aid in developing a more complete elucidation of basic biological processes in plants.


Rapid, automated sequencing technologies, with related computational advances in analysis and informatics, have transformed the nature of biological research. Complete DNA sequences already exist for several bacteria and for brewer's yeast, and within a few years the entire genome of the flowering plant Arabidopsis thaliana will be known. These huge amounts of data challenge the research and development community to understand and utilize this new knowledge effectively. One challenge, among many, is the revelation from complete sequencing of bacteria and yeast that nearly one-third of all the putative genes have no known function. Much about the structure, organization, and function of genomes remains uncharted.

In recognition of this unprecedented scientific opportunity, the National Science and Technology Council, at the request of Congress, established an Interagency Working Group on Plant Genomes (IWGPG) in May 1997. Representatives from the National Science Foundation (NSF), the Department of Energy (DOE), the United States Department of Agriculture (USDA) and the National Institutes of Health (NIH) were charged with developing a long-range, science-based plan for U.S. plant genome research. Subsequently, Congress appropriated funds to the NSF for Fiscal Year (FY) 1998 for "a comprehensive, peer-reviewed plant genome research program." It was further specified that the new NSF program should support research consistent with the IWGPG recommendations. In a preliminary report issued in June 1997, the IWGPG identified the following genome-based goals for advancing plant science with relevance to economically important plants: to support research and technology development in functional genomics; to develop the necessary physical and educational infrastructure to meet the needs of plant genomic research, including informatics tools and publicly available databases for Expressed Sequence Tags [ESTs] for major classes of economically important plants; in coordination with on-going international efforts, to initiate genomic studies of rice as a model plant for crop species in the grass family, such as corn, wheat and sorghum; and to accelerate efforts to complete the sequencing of the genome of the model flowering plant, Arabidopsis.


The purpose of this program announcement is to solicit proposals to continue at an accelerated rate the systematic sequencing of the genome of Arabidopsis thaliana. The ultimate goal of this project is to sequence the entire Arabidopsis genome within a reasonable time frame, and no later than the year 2000. It is anticipated that two to four three-year awards will be made in FY 1998, contingent upon the quality of proposals received and the availability of funds. Individual awards are expected to range from $2.5M to $6M per year.


The Multinational Coordinated Arabidopsis thaliana Genome Research Project was established in 1990 to develop Arabidopsis thaliana as an experimental model system for flowering plants. Three groups were selected to initiate the genome sequencing project in 1996. The goals those groups initially stated for the number of bases to be sequenced have been met or exceeded. Given the rapid advances in the sequencing efforts and mapping technologies, the continued advances in research and genomic tools for both Arabidopsis and other model organisms, and the expanded insights into the biology of the plant, it is now estimated that sequencing of the entire Arabidopsis genome could be completed by the year 2000. This time frame is based on current genome sequencing technology, the rate of sequencing of the plant genome to date and the available resources.

Recognizing the potential of an Arabidopsis genome sequencing effort to contribute to their missions, the Department of Energy (DOE) and the United States Department of Agriculture (USDA) have joined with the National Science Foundation to continue the U.S. Arabidopsis thaliana Genome Sequencing Project. This project is coordinated with other ongoing U.S. genome projects, including the human genome research project supported by the National Institutes of Health (NIH) and the DOE, the microbial genome project supported by the DOE and the plant genome project supported by the USDA, in order to minimize duplication of effort and to maximize efficient use of available resources. U.S. efforts to complete the sequence of the Arabidopsis genome will continue to be coordinated on an international level with other national and transnational programs.


Proposals are solicited from a broad community of scientists at U.S. institutions. Consortia of eligible individuals or organizations may apply, but a single organization must accept overall management responsibility. The eligibility of applicants will be determined according to the guidelines in the "Grant Proposal Guide" (GPG), NSF 98-2, Chapter 1, Section D. The GPG is available on the NSF web site. Proposals from the existing groups for continuation of their sequencing efforts are both expected and encouraged. Involvement of international collaborators is encouraged, although primary support for foreign participants/activities must be secured through their own national programs.


The Principal Investigator (PI) and other senior staff responsible for the project are expected to have expertise and experience in large-scale, high-throughput genomic DNA sequencing. If the application is submitted by a consortium of several groups from one or more institutions, the consortium must make a convincing case that it can function in an effective, efficient, timely and cost-conscious manner.


The program anticipates supporting two to four three-year awards made as cooperative agreements. The exact amount of the award will depend on the advice of reviewers and on the availability of funds. The cost of the awards will be shared by the participating Federal agencies. It is anticipated that up to $10 million per year will be made available for this program beginning in FY 1998.


All participating agencies have agreed to use the NSF general guidelines as described in the GPG, NSF 98-2, and the NSF Proposal Forms Kit, NSF 98-3. Both documents are available on the NSF web site.

Proposals should be prepared following the GPG guidelines and the instructions below. Where the guidelines conflict between the GPG and this program announcement, the latter supersedes the GPG.

Each proposal must contain the following elements in the order indicated:

  1. NSF Cover Page (NSF Form 1207). Clearly indicate that the proposal is submitted for consideration by the Arabidopsis Program in response to this program announcement (NSF 98-52) in the appropriate box.
  1. Project Summary.
  1. Table of Contents (NSF Form 1359).
  1. Project Description. A description of the project must not exceed 25 pages inclusive of tables, diagrams and other visual material. The following points must be addressed in this section and integrated into the general format described in GPG, NSF 98-2.
      1. DNA substrates to be sequenced: Include source of the DNA (clones), map of the chromosomal region involved, the method of preparation and all other pertinent information. The selection strategies proposed must be applicable to efforts to sequence the entire Arabidopsis genome by the year 2000, and be justified on that basis.

      2. Sequence quality and quantity: The level of accuracy to be sought and how that will be measured as well as the numerical goal should be discussed. The numerical goal, defined as the number of bases to be sequenced per unit time, should be linked with a discussion of the finishing process and how that will be defined. Will gaps be closed? Will annotation be completed and what is meant by "complete annotation"? It is expected that the combined international effort will result in a complete sequence of the Arabidopsis genome with an error rate of no more than 1 in 10,000 bases by the year 2000. The submitted proposal should indicate the anticipated contribution to that overall goal.
      3. Genome sequencing technologies and strategies: Technologies/strategies that will be used should be described as well as plans for incorporating new developments and/or improvements in sequencing protocols, strategies and technologies as they become available. What is the goal for the applicant and what is the expected rate of sequencing which will achieve that goal? How will the applicants coordinate their efforts with other groups, both U.S. and international groups, sequencing other regions of the Arabidopsis genome?
      4. Costs of production sequencing in relation to the product proposed: The cost-effectiveness of the sequences generated will be a very important issue. An estimate of the dollars required to produce a specific number of bases (which should include the costs of generating clones, assembly and annotation) should be given. If investigators are proposing a strategy that will yield less than the complete genome sequence, they must provide an overall vision of how this strategy will contribute to the cost-effective completion of the entire Arabidopsis genome by the year 2000 or earlier.
      1. Plans for establishing coordination with other existing or planned Arabidopsis sequencing projects, both nationally and internationally.
      2. Plans for establishing a close linkage to the plant biology research community at large in order to ensure a close collaboration between the sequencing project and the ultimate user community of the sequence information.
      3. Ways to assess progress of the project, including establishing milestones and measuring progress toward them. A common advisory committee will be appointed based upon suggestions from all of the participants, including the agencies, which will serve as a means of advising all participants of problems or solutions which will benefit all of the participants. Details will be included in the Cooperative Agreements and will be agreed to by all participants prior to activation of the Cooperative Agreements.
      4. Available facilities including a statement of institutional commitment for the successful completion of the project.
      1. Data management plan, including: (1) mechanisms to assess validity and accuracy of data obtained which will augment or complement procedures to monitor accuracy which may be mandated by the agencies; (2) mechanisms for annotation of data and release of both raw and finished data into public databases -- creative, cost-effective strategies for annotating sequences are encouraged; and (3) community access to data, mechanisms of data distribution, and interactions with other community databases.
      2. Data release policies including how rapidly sequence data will be publicly released after production. The sponsoring agencies encourage an immediate and continuous release of clearly labeled raw as well as finished genome sequence data as it is obtained by the groups supported by this Program.
      3. A statement signed by an authorized institutional official should be included which clearly describes the institutional policy for sharing information and materials resulting from this work with other researchers in the scientific community.
  1. References cited. Citations must be complete and arranged in alphabetical order by author.
  1. Budget (NSF Form 1030). Provide a budget for each year of support requested as well as a summary budget for all three years. If there are subcontracts, signed separate budget(s) must be included for each subcontractor. Funds for facility construction or renovation may not be requested.
  1. Budget justification. A brief explanation for funds in each budget category should be provided. For major equipment or software materials, a particular model or source and the current or expected price should be specified whenever possible. A brief explanation of the need for each item whose cost exceeds $10,000 should be provided. This section should also include a summary of institutional cost sharing, if any, and other sources of support for the project, such as government, industry, or private foundations. Appropriate documentation of any such commitments should be provided in the appendix. Although cost sharing is not required, any such commitment specified in the proposal will be made a condition of an award, and the dollar amount of the proposed cost sharing must be indicated on line M of the Budget Form 1030.
  1. Facilities and Equipment and Other Resources (NSF Form 1363). Include a brief description of available facilities, including space and relevant equipment available for the project. Where requested equipment or materials duplicate existing items, explain the need for duplication. This section is limited to 2 pages.
  1. Biographical Sketches. For each of the key personnel, provide a curriculum vitae or short biographical sketch. Briefly describe relevant experience and list up to 10 publications (the individual's 5 most important and up to 5 other relevant publications). This section is limited to 2 pages for each individual.
  1. Current and Pending Support (NSF Form 1239). Provide a complete list of current and pending research support for each of the key personnel.
  1. Appendices. Those individuals, institutions, or programs that are participating in the project in a significant way, but have not endorsed the cover page as official co-PIs, must submit letters describing the nature of their collaboration and commitment to the proposed project. They can be included as appendices. General letters of endorsement may not be included.
  1. Additional Information. The following items must be provided:

Attach these items to the copy of the proposal that bears the original signatures, with the Form 1225 and the lists of collaborators on top. These items are for the agency's internal use only and will not be available to reviewers. Do not provide additional copies of these items with the other proposal copies.
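
The Project Description asks applicants to state a numerical goal (bases sequenced per unit time), an estimate of the dollars required to produce a specific number of bases, and the level of accuracy sought against the overall goal of no more than 1 error in 10,000 bases. As a rough illustration only, the arithmetic behind those three figures might be sketched as follows. The class name, field names and every number in the example are hypothetical and are not taken from this announcement or from any funded project.

```python
from dataclasses import dataclass


@dataclass
class SequencingPlan:
    """Hypothetical summary of the figures a proposal is asked to report."""

    finished_bases: int     # bases the group proposes to finish over the award
    award_years: float      # duration of the award, in years
    total_cost_usd: float   # all costs, including clones, assembly and annotation
    expected_errors: int    # errors estimated by the group's own quality checks

    def bases_per_year(self) -> float:
        """Numerical goal: bases sequenced per unit time (here, per year)."""
        return self.finished_bases / self.award_years

    def cost_per_base(self) -> float:
        """Dollars required per finished base."""
        return self.total_cost_usd / self.finished_bases

    def error_rate(self) -> float:
        """Estimated errors per finished base."""
        return self.expected_errors / self.finished_bases

    def meets_accuracy_goal(self, max_error_rate: float = 1 / 10_000) -> bool:
        """True if the estimated error rate is within 1 error in 10,000 bases."""
        return self.error_rate() <= max_error_rate


# Illustrative numbers only (assumptions, not program data).
plan = SequencingPlan(
    finished_bases=30_000_000,
    award_years=3,
    total_cost_usd=9_000_000,
    expected_errors=1_500,
)

print(f"Bases per year: {plan.bases_per_year():,.0f}")
print(f"Cost per base:  ${plan.cost_per_base():.2f}")
print(f"Error rate:     1 in {1 / plan.error_rate():,.0f} bases")
print(f"Meets goal:     {plan.meets_accuracy_goal()}")
```

A proposal would substitute its own figures and its own method of estimating errors; the sketch only shows how the three quantities relate to one another.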


Proposals must be received at NSF by 5:00 p.m. Eastern Time (ET) April 15, 1998.

You are encouraged to use NSF FastLane to prepare and submit your Arabidopsis thaliana Genome Sequencing Project proposal. To access FastLane, go to the NSF Web Site and select "FastLane", or go directly to the FastLane Home page.

For proposals not submitted via FastLane the completed application, including the original proposal and 18 copies should be sent to (see the GPG, Chapter I, Section E):


    1. Progress Report
    2. Sequencing Strategies
    3. Project Management
    4. Information Management
    5. List of current and past collaborators (Note: This list does not count against the 25 page limit of the Project Description.)


Selection of awards will be based on merit review by experts using established peer review systems as described in GPG, Chapter III. A special emphasis panel will be formed to review the applications and site visits may be used as needed. All participating agencies use similar general criteria in the evaluation of proposals submitted to their respective competitive research grants programs. The new NSF review criteria as outlined in the GPG (NSF 98-2) will be interpreted in light of the objective of this solicitation as follows:

1. Performance competence: This criterion addresses the technical soundness of the proposed approach, the capabilities of the proposed personnel, including those of the PI and other senior staff as discussed above, the adequacy of the resources available or proposed, and the likelihood that this project will lead to a successful, timely, cost-effective completion of Arabidopsis genome sequencing by the year 2000.

2. Project management: This criterion addresses the overall quality of the technical and managerial aspects of the proposal, including plans for the release of the data and the sharing of the information and resources resulting from the project to the scientific community as noted below, and for management oversight and long-range planning.

3. Effect of the activity on the scientific infrastructure: This criterion addresses the potential of the proposed activity to contribute to better understanding or improvement of the quality and effectiveness of the Nation's scientific research, education, and human resources capabilities. An important issue is the likelihood of national impact and of widespread, appropriate dissemination and use of results in strengthening the scientific infrastructure of this nation.

4. Scientific collaboration and information sharing: Sequencing of the genome of a model organism is a community activity. As such, a close collaboration among the scientists and organizations involved in sequencing activities and effective dissemination to the users of the information (the scientific community) are important components of this criterion.


The Arabidopsis thaliana Genome Sequencing Project will be administered and managed as an interagency program involving all participating agencies throughout the entire process from the development of the program announcement to the review, selection and administration of the awards. Awards will be administered in accordance with the terms and conditions of NSF GC-1, "Grant General Conditions," (12/97) and NSF CA-1, "Cooperative Agreements General Conditions" (12/95). This information can be obtained from the NSF OnLine Document System. Copies of these documents are available at no cost from the NSF Clearinghouse, P.O. Box 218, Jessup, MD 20794-0218, phone (301) 947-2722, or via e-mail at pubs@nsf.gov. More comprehensive information is contained in the NSF Grant Policy Manual (NSF 95-26), for sale through the Superintendent of Documents, Government Printing Office (GPO), Washington, D.C. 20402. The telephone number at GPO is (202) 783-3238 for subscription information. The NSF Grant Policy Manual can also be accessed online through the NSF OnLine Document System.


Potential applicants are strongly encouraged to contact project officers and discuss their plans. Inquiries regarding the announcement can be directed to any one of the following agency representatives:

The Foundation provides awards for research in the sciences and engineering. The awardee is wholly responsible for the conduct of such research and preparation of the results for publication. The Foundation, therefore, does not assume responsibility for such findings or their interpretation.

The Foundation welcomes proposals on behalf of all qualified scientists and engineers, and strongly encourages women, minorities, and persons with disabilities to compete fully in any of the research and research-related Programs described in this document. In accordance with Federal statutes and regulations and NSF policies, no person on grounds of race, color, age, sex, national origin, or disability shall be excluded from participation in, denied the benefits of, or be subject to discrimination under any program or activity receiving financial assistance from the National Science Foundation.

Facilitation Awards for Scientists and Engineers with Disabilities provides funding for special assistance or equipment to enable persons with disabilities (investigators and other staff, including student research assistants) to work on NSF projects. See the program announcement or contact the program coordinator at (703) 306-1636.

Privacy Act. The information requested on proposal forms is solicited under the authority of the National Science Foundation Act of 1950, as amended. It will be used in connection with the selection of qualified proposals and may be disclosed to qualified reviewers and staff assistants as part of the review process; to applicant institutions/grantees; to provide or obtain data regarding the application review process, award decisions or the administration of awards; to government contractors, experts, volunteers and researchers as necessary to complete assigned work; and to other government agencies in order to coordinate programs. See Systems of Records, NSF 50, "Principal Investigators/Proposal File and Associated Records," 60 Federal Register 4449 (January 23, 1995), and NSF 51, "Reviewer/Proposal File and Associated Records," 59 Federal Register 8031 (February 17, 1994).

Public Burden Statement. Submission of the information is voluntary. Failure to provide full and complete information, however, may reduce the possibility of your receiving an award.

The public reporting burden for this collection of information is estimated to average 120 hours per response, including the time for reviewing instructions. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Gail A. McHenry, Reports Clearance Officer, Information Dissemination Branch, National Science Foundation, 4201 Wilson Boulevard, Suite 245, Arlington, VA 22230.

NSF has TDD (Telephonic Device for the Deaf) capability which enables individuals with hearing impairments to communicate with the Foundation about NSF programs, employment, or general information. To access NSF TDD, dial (703) 306-0090; for FIRS,


The program described in this publication is in the Catalog of Federal Domestic Assistance:

OMB# 3145-0058
P.T. 34
K.W. 1002037

NSF 98-52 (Electronic Dissemination Only)
(Replaces NSF 95-159)