North American Arabidopsis Steering Committee Workshop Proposal For An Arabidopsis Thaliana Genome Project (ATGP)

Executive Summary

Barriers that once impeded the identification of genes with important biological functions are vanishing in the 1990s. High-throughput genomic sequencing now permits the rapid identification of large numbers of genes that were previously inaccessible to traditional genetic analysis. The rate of gene discovery is now limited only by the ability of scientists to map and sequence an organism's genome. Initial sequencing efforts have focused on so-called model organisms with relatively small genomes. The information obtained from these model genomes is being used to understand gene structure and function in related organisms.

Arabidopsis thaliana, a small flowering plant in the crucifer family, has the smallest genome and the highest gene density so far identified in a flowering plant. During the past ten years, Arabidopsis has become established worldwide as the preferred species for molecular-genetic studies in the laboratory. Importantly, because cloned Arabidopsis genes can be used to identify corresponding genes in all other plants, continued progress in identifying Arabidopsis genes should be considered an important strategic component for maintaining U.S. preeminence in plant biology. Genes identified in Arabidopsis will soon lead to the creation of economically important plants that are more resistant to pathogen attack, that reduce the use of environmentally toxic chemicals, that produce foodstuffs with improved nutritional value, or that yield new kinds of compounds of commercial value. Increasing our knowledge of plant genes has almost limitless potential to improve environmental quality, increase energy production, identify new medicinal compounds, and enhance our ability to respond to the steady increase in human population and changing climatic conditions.

This report contains the recommendations of an ad hoc committee representing the community of Arabidopsis researchers and various government agencies that met in Arlington, Virginia, on June 8 and 9, 1994, to discuss the feasibility of commencing a federally funded, large-scale Arabidopsis genome project in the United States. The committee discussed the impact that an Arabidopsis genome project would have on the progress of basic plant research as well as on the strategic interests of the United States as they relate to agriculture, energy and the environment. The committee concluded that a large-scale Arabidopsis thaliana Genome Project (ATGP) should commence as soon as possible. The committee identified essential features that should be considered in any proposals for the initiation of an ATGP. The committee concluded that one or a limited number of linked Arabidopsis Genome Centers should be established and that these Centers will serve as important models for other plant genome projects in the future. Finally, the committee recommended that the United States ATGP be coordinated with a similar effort already underway under the auspices of the European Community. The committee recommended that funds be provided for:

  1. Completion of the Arabidopsis physical/genetic map and the creation of sequence-ready clone collections by 1997.
  2. Pilot sequencing and technology development projects with the goal of completing 10 megabases of Arabidopsis genome sequence by 1999.
  3. Subsequent scale-up of pilot projects and complete sequencing of the 100 megabase Arabidopsis genome by 2004.


The general benefits of genome sequencing are increasingly obvious as rapid progress is made toward the goal of sequencing complete chromosomes in other model organisms, such as yeast and Caenorhabditis elegans (a small nematode). While classical mutagenesis, genetic analysis and conventional cloning strategies have uncovered many genes, rough estimates suggest that no more than 20-25% of an organism's genes can be identified by classical genetic techniques, even in organisms with a small fraction of redundant genes. Plants, including Arabidopsis, generally exhibit a moderate to considerable redundancy of half or more of their genes. For this and other reasons, mutations that interfere with or eliminate expression of many genes are silent. Thus, direct genome sequencing is the only sure way of identifying all of an organism's genes. For plants, it follows that genome sequencing will be required for the identification of most of the economically important genes.

Because genome sequencing projects are still relatively expensive, model organisms have been selected as the initial targets of complete genome sequencing. The evolutionary kinships among organisms justify this approach. Depending on the function of a gene and how well its sequence has been conserved in evolution, at the very least the gene sequences of the model organism can be used to identify corresponding genes in related species. Thus the selection of model organisms for full genome sequencing is a reasonable policy for conserving limited resources while maximizing information yield. Model organisms have been chosen by several criteria, including the breadth of existing genetic information, small genome size, and high gene density. Arabidopsis thaliana was adopted as a model organism by plant geneticists some years ago because of its small genome size and rapid reproductive cycle. At 100 megabases, the Arabidopsis genome is among the smallest known plant genomes. It also has a low repetitive DNA content.

The wisdom of selecting Arabidopsis as a model organism for higher plants is becoming increasingly obvious. Initial sequencing efforts suggest that the Arabidopsis genome has a very high gene density (~1 gene every 5 kb). Because higher plants evolved relatively recently in evolutionary time, they are relatively closely related, which makes it possible to use sequence information obtained from Arabidopsis to identify homologous genes in other plants, including agronomically important species with much larger genome sizes, higher gene redundancy and a substantially greater content of repetitive sequences. Arabidopsis genes, which are often much easier to clone initially than the corresponding genes of plants with larger genomes, have already been used to identify and manipulate genes in agronomically important species. Scientists at DuPont, for example, have used Arabidopsis genes as probes to clone fatty acid desaturase genes from a variety of oilseed species such as soybean and canola. The cloned genes have been modified and reintroduced into the species of origin to alter the composition of the oil for improved health benefits. Arabidopsis has also been the initial experimental organism for the introduction of bacterial genes that permit genetically engineered plants to synthesize a biodegradable thermoplastic, polyhydroxybutyrate. The gene system was subsequently transferred to plants that can be used to produce the plastic on an agricultural scale. It was the ready availability of Arabidopsis mutants, as well as the fact that Arabidopsis can be genetically manipulated, that made this work possible. Additional genes that have been cloned from Arabidopsis and that have potential agronomic value include genes that confer resistance to bacterial and fungal pathogens, that are involved in the synthesis of plant hormones, that affect the nutritional quality of seeds, and that alter the time of flowering.

In addition to its potential agronomic importance, Arabidopsis genome mapping and sequencing work has already benefited and will increasingly benefit the community of Arabidopsis researchers. It presently takes about three person-years on average to clone an Arabidopsis gene identified by a mutation using map-based cloning techniques. The availability of the complete genomic sequence would vastly simplify and reduce the cost of identifying most Arabidopsis genes. Although the short-term cost of sequencing the entire Arabidopsis genome is substantial (current costs are about $1.00/base, implying a total cost approaching $100 million by project completion), there are long-term savings and benefits for the entire plant research community in accelerating research. Moreover, the high gene density of the Arabidopsis genome implies a high ratio of informative to uninformative sequence, maximizing the return on the investment of time and resources. Finally, the information obtained in sequencing the genomes of other model organisms widely used in biological research, such as Escherichia coli, yeast, and C. elegans, has contributed greatly to our understanding of the biology of these organisms and clearly demonstrates the important role that genome projects can play in biological research. Equally significant advances in our understanding of plant biology can be expected from an Arabidopsis genome project.
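The cost figure quoted above follows from simple arithmetic, which the short Python sketch below makes explicit. It uses only numbers stated in this report (a 100-megabase genome, about $1.00 per base, and roughly one gene every 5 kb). The gene count and the per-gene cost are naive extrapolations that assume uniform gene density and a constant cost per base; they are illustrative assumptions, not estimates made by the committee, and the function names are hypothetical.

```python
# Back-of-the-envelope figures derived from numbers quoted in the report.
# Assumptions (not from the report): cost per base stays constant over the
# whole project, and gene density is uniform across the genome.

GENOME_SIZE_BP = 100_000_000   # 100 megabases
COST_PER_BASE_USD = 1.00       # "about $1.00/base"
BP_PER_GENE = 5_000            # "~1 gene every 5 kb"


def total_sequencing_cost(genome_bp, cost_per_base):
    """Total cost in dollars at a flat per-base price."""
    return genome_bp * cost_per_base


def naive_gene_count(genome_bp, bp_per_gene):
    """Gene count if the quoted density held over the entire genome."""
    return genome_bp // bp_per_gene


total_cost = total_sequencing_cost(GENOME_SIZE_BP, COST_PER_BASE_USD)
genes = naive_gene_count(GENOME_SIZE_BP, BP_PER_GENE)
cost_per_gene = total_cost / genes

print(f"Total sequencing cost: ${total_cost:,.0f}")
print(f"Naive gene count:      {genes:,}")
print(f"Cost per gene:         ${cost_per_gene:,.0f}")
```

Under these assumptions the per-gene cost can be set against the roughly three person-years currently needed to clone a single gene by map-based methods. Any fall in the per-base price during the project would lower the total accordingly.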

Workshop Summary

Overview: To assess the feasibility and desirability of a federally funded Arabidopsis genome project, the North American Arabidopsis Steering Committee organized and convened a workshop in Arlington, Virginia, on June 8-9, 1994. The workshop participants included the elected members of the North American Arabidopsis Steering Committee. Representatives from the National Science Foundation, the U.S. Department of Agriculture, the Department of Energy, the NIH-sponsored human genome project, and the European Community were present as observers. Two scientists involved with the human genome project were also present as technical advisors. A list of participants is given below.

The general goal of the workshop was to assess progress toward meeting the goals of mapping and sequencing the Arabidopsis genome and make specific recommendations to the National Science Foundation to direct future US efforts in the Multinational Coordinated Arabidopsis thaliana Genome Research Project. A secondary goal was to outline in general terms the main issues which should be addressed in future proposals concerning the development of new or expanded Arabidopsis sequencing centers.

The workshop commenced with a summary of the recent Arabidopsis genome conference held at Cold Spring Harbor's Banbury Center and a discussion of current funding for Arabidopsis genome research. Mike Bevan and Chris Somerville presented overviews of the EC sequencing program (ESSA, or European Scientists Sequencing Arabidopsis) and the Michigan State University cDNA sequencing project, respectively. Mary Clutter joined the workshop participants for a brief discussion of Arabidopsis genome research funding within NSF. Jen-i Mao and Mark Johnston discussed two different approaches to sequencing, taken at Collaborative Research (the multiplex approach) and by the C. elegans sequencing group at Washington University (sequencing machines). The committee discussed the responses from the Arabidopsis community to a questionnaire on the Arabidopsis genome project. Finally, workshop participants discussed the present status and future of the US Arabidopsis genome project, commencing with a detailed consideration of the rationale for genome mapping and sequencing and commentary on the benefits of even the limited effort to date. The following issues were discussed in depth: Should there be an organized Arabidopsis genome project given the current state of Arabidopsis research? What is the relative priority of complete genome sequencing compared to completion of a physical map, adding more PCR-based mapping markers to the map, or single-pass cDNA sequencing? Who should pay for an Arabidopsis genome project, how should it be organized, how long will it take, and how much will it cost? How will a US-funded ATGP be coordinated with ESSA?

Setting Priorities: Before the workshop, a questionnaire designed to obtain feedback from the Arabidopsis community on the desirability of an Arabidopsis genome project was posted on the Arabidopsis electronic newsgroup. More than 20 responses were obtained, and these were reviewed and discussed during the course of the workshop. Although most respondents supported the concept of an ATGP, several respondents suggested that a high-density genetic map consisting of PCR-based markers be completed before large-scale sequencing is undertaken. Indeed, participants noted that even the relatively small number of DNA markers and the incomplete physical map had already been useful to many investigators, and that there had been extremely heavy and immediate demand for the cDNA clones that were being sequenced at MSU and in France. The workshop participants agreed that a high-density genetic/physical map would be of immediate benefit to the community. On the other hand, because it takes considerable time to get a sequencing organization equipped, trained and functioning efficiently, there was general agreement among workshop participants that it is essential to begin setting genome sequencing goals immediately and to initiate pilot sequencing projects in parallel with other aspects of genome analysis.

Progress in Genome Research: The current efforts in several laboratories to establish links between the genetic and physical maps of the Arabidopsis genome greatly facilitate the map-based cloning of genes. While many mutations and genes have been mapped by the use of restriction fragment length polymorphisms (RFLPs), genetic markers based on the polymerase chain reaction (PCR) are being developed for Arabidopsis. Cleaved amplified polymorphic sequences (CAPS) and simple sequence length polymorphisms (SSLPs) can be used for rapid mapping of plant mutations and as a dense set of sequence tagged sites (STSs) for the construction of a physical map of the Arabidopsis genome using an anchoring strategy. In a collaborative effort, investigators at the John Innes Institute, the University of Pennsylvania and Massachusetts General Hospital are developing an overlapping set of YACs covering the entire genome. Using newly available YAC libraries, total genome coverage in YACs is now estimated to be approximately 60-70%, with even greater coverage on chromosome 4 (about 80%). Furthermore, in preparation for phase one of the European Scientists Sequencing Arabidopsis (ESSA) project, restriction mapping of 500 kb of cosmids from the top of chromosome 4 has been completed and distributed to the participating laboratories. In addition to facilitating the cloning of genes identified solely by phenotype, physical mapping of the genome generates the starting materials for rapid and efficient sequencing and is a key component of a genome project.

Another important component of the ATGP consists of three cDNA sequencing projects that are underway in Europe, Canada and the US. The European goal is to sequence (from both ends) 3000 unique cDNA fragments (expressed sequence tags or ESTs). ESSA scientists are also mapping their ESTs to YAC clones, regardless of whether the YAC clone has been anchored. Canadian scientists are planning to map 600 ESTs. The US project has already entered 2500 ESTs in publicly available databases and is on the verge of entering an additional 4000 (these have been sequenced only in one direction and relatively little effort has been devoted to eliminating redundancy). The exact number of different gene transcripts represented among this collection of ESTs is unknown; hence the fraction of the estimated 15,000-16,000 Arabidopsis genes represented in this collection cannot be determined at present. The workshop participants concluded that mapping cDNAs had merit because it facilitates connecting a mapped mutation to its cognate gene even in the absence of genomic sequence.

Goals for Arabidopsis Genome Research: Workshop participants agreed that a pilot genome sequencing project should begin immediately. More specifically, the North American Arabidopsis Steering Committee (NAASC) recommends that a specific federal program be developed to support Arabidopsis genome sequencing and associated technology development, with the goal of completing the entire genomic sequence by the year 2004. The following steps should be undertaken to achieve this goal:

  1. A call for proposals to conduct pilot Arabidopsis sequencing projects. This should be in the form of RFPs to make it possible to attract proposals from outside the Arabidopsis community.
  2. Establishment of several sequencing centers with the short-term goal to obtain 10 megabases of genomic sequence within 3 years from the start of funding (a similar goal to ESSA). The participation of existing DNA sequencing centers, as well as companies with relevant expertise, is encouraged. The purpose of these pilot projects will be to establish the feasibility of and to develop a detailed strategy to complete the sequencing of the entire Arabidopsis genome. To achieve cost-effectiveness, it is not envisioned that this program will fund a large number of small-scale sequencing projects. Pilot sequencing projects should include substantial mapping components, including the goal of finding and mapping at least 1000 PCR-based markers, to generate the appropriate templates for sequencing.
  3. Significant expansion of the pilot sequencing centers to achieve the goal of completion of the entire sequence by 2004. It is noted that this phase will require a substantial commitment of equipment, supplies, and personnel.
  4. Although specific goals were not set, workshop participants emphasized that a key feature of genome research was the development of methods for the identification of gene function. Some of the more promising methodologies for Arabidopsis include antisense mRNA constructs, co-suppression and transposon tagging.
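
The scale-up called for in steps 2 and 3 can be made concrete with a rough throughput calculation. The sketch below is illustrative only: it assumes that the pilot phase delivers 10 megabases over 3 years, that the pilot ends in 1999 (the date given in the Executive Summary), and that sequencing output is spread evenly over each phase. None of the resulting rates is stated in this report.

```python
# Rough sequencing-rate comparison between the pilot and scale-up phases.
# Assumptions: even output within each phase; pilot phase ends in 1999.

GENOME_MB = 100          # total genome size, megabases
PILOT_MB = 10            # pilot goal, megabases
PILOT_YEARS = 3          # "within 3 years from the start of funding"
PILOT_END_YEAR = 1999    # pilot milestone in the Executive Summary
COMPLETION_YEAR = 2004   # target for the complete sequence

# Average rate needed during the pilot phase (Mb per year).
pilot_rate = PILOT_MB / PILOT_YEARS

# Average rate needed afterwards to finish the remaining sequence on time.
remaining_mb = GENOME_MB - PILOT_MB
scaleup_rate = remaining_mb / (COMPLETION_YEAR - PILOT_END_YEAR)

# How much faster the centers must run after the pilot phase.
scaleup_factor = scaleup_rate / pilot_rate

print(f"Pilot phase:    {pilot_rate:.1f} Mb/year")
print(f"Scale-up phase: {scaleup_rate:.1f} Mb/year")
print(f"Required increase: about {scaleup_factor:.1f}-fold")
```

On these assumptions the combined output of the centers would need to rise several-fold after the pilot phase, which is consistent with the note in step 3 that this phase requires a substantial commitment of equipment, supplies, and personnel.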

Conference Participants

Dr. Fred Ausubel
Department of Molecular Biology
Massachusetts General Hospital
Boston, MA 02114

Dr. Mike Bevan
John Innes Institute
Colney Lane, Norwich
NR4 7UJ, United Kingdom

Dr. Joanne Chory
Plant Biology Laboratory
Salk Institute
PO Box 85800
San Diego, CA 92186-5800

Dr. Joseph Ecker
Department of Biology
University of Pennsylvania
Philadelphia, PA 19104-6018

Dr. Mark Estelle
Department of Biology
Indiana University
Bloomington, IN 47405

Dr. Nina Fedoroff
Department of Embryology
Carnegie Institution of Washington
115 West University Parkway
Baltimore, MD 21210

Dr. Howard Goodman
Department of Molecular Biology
Massachusetts General Hospital
Boston, MA 02114

Dr. Mark Johnston
Department of Genetics
Washington University School of Medicine
4566 Scott Avenue
St. Louis, MO 63110-1031

Dr. Jen-i Mao
Collaborative Research Inc.
1365 Main Street
Waltham, MA 02154

Dr. David Meinke
Department of Botany
Oklahoma State University
Stillwater, OK 74078

Dr. Chris Somerville
Plant Biology Department
Carnegie Institution of Washington
Stanford, CA 94305-4170


Dr. Machi Dilworth
National Science Foundation
4201 Wilson Blvd, Rm #685
Arlington, VA 22230

Dr. Ed Kaleikau
Program Director
Plant Genome Program
National Research Initiative Competitive Grants
AG Box 2241
Washington, DC 20250-2241

Dr. Jerome P. Miksche
Office of Plant Genome Research
Agricultural Research Service
U.S. Department of Agriculture
Bldg. 005, BARC-West,
Beltsville, MD 20705

Dr. Robert Rabson
Division of Energy Biosciences
Office of Basic Energy Sciences
U.S. Department of Energy
ER-17, GTN
Washington, DC 20545

Dr. Robert Strausberg
National Center for Human Genome Research
National Institutes of Health
9000 Rockville Pike
Bethesda, MD 20892

Chromosome I				Chromosome II		Chromosome III
(1)	(2)	(1)	(2)	(1)	(2)	(1)	(2)	(1)	(2)
0.0	RS10	66.1	mi342	0.0	NOR2	0.0	Athb3	112.2	m424
2.4	nga59	68.5	mi133	7.2	ve012	2.5	mi199	113.0	agp29
2.4	PVV4	68.5	GAPB	9.3	mi320	3.5	mi74b	114.8	nga6
2.4	ACC2	70.9	mi72	10.8	m246	3.5	nga32	117.1	nga112
2.4	ve001	70.9	GRF2	13.4	m497A	4.5	nga172
6.2	ATEAT1	72.9	GRF4	14.3	g4553	6.0	GAPC
7.3	agp16	80.2	GRF1	17.9	mi310	6.0	mi172
8.5	O846A	85.3	mi291a	17.9	ve013	9.6	mi355
9.7	SEP4E	86.4	mi441	18.5	mi444	9.6	mi403
9.7	ve002	86.9	mi208	18.5	mi421	10.7	m583
10.3	mi372	87.4	mi106	19.1	g4532	11.4	g4523
10.3	g4715a	89.0	m213	21.8	SEP2A	12.9	nga126
10.3	m488	91.7	spl5	22.9	g4133	13.6	JGB3
10.3	ve003	94.4	mi209	25.9	PR1	14.9	mi467
10.3	agp102	94.6	nga280	28.5	mi398	15.5	mi357
10.3	ve004	94.8	nga128	32.6	m216	16.5	apx2
12.1	PAI1	95.0	mi303	32.6	B33	19.0	mi207
13.9	apx1	96.6	B34	34.3	mi139	19.3	MS2
15.5	mi100	98.2	g4026	35.4	mi148	19.6	ATHCHIB
16.0	m241A	98.7	mi259	38.1	mi238	19.6	spl6
16.2	mi443	98.7	mi230	38.4	pGC1	20.8	nga162
16.6	nga63	98.7	mi408	38.7	m251	24.9	g4708
17.2	ve005	98.7	mi304	38.7	O8O2F	24.9	m228
20.2	ve006	100.0	ATHGENE	45.1	g6842	25.4	mi289
20.9	NCC1	102.7	mi324	45.1	GPA1	25.9	mi339
20.9	pC1	102.7	mi353	48.0	er	30.8	m560B2
21.6	pC2	102.7	MW1	48.0	B68	33.2	m105
23.0	O818	104.5	mi424	54.1	pGC2	33.9	mi142
23.5	EG17G9	108.6	m315	55.1	mi54	34.5	O4O23
24.5	g3786	114.1	mi185	56.1	ve014	36.4	mi268
25.5	ve007	114.1	mi193	56.6	m220	41.8	mi386
28.1	mi348	116.2	g4552	58.9	SEP5B	42.9	mi225
29.1	mi113	116.7	pCITf117	58.9	m283C	43.9	g4711
31.5	pFNR	122.1	Athb13	63.3	PR21	50.1	mi178
33.9	mi203	123.8	Tag1	65.2	ve015	50.1	mi287
33.9	pC3	124.8	mi462	65.2	spl3	51.1	ve020
35.3	g3829	125.8	mi103	65.7	mi277	51.1	um579C
39.2	m235	126.3	PKNAT22	66.0	ve016	52.6	GAPA
42.0	mi265	126.3	m453A	66.3	g17288	57.3	GL1
42.5	mi163	129.4	nga111	66.3	m323	60.4	mi413
45.1	ve008	131.7	mi425	68.0	um579B	63.6	mi79b
45.1	mi111	131.7	PR5	68.0	ve017	64.1	mi358
45.1	mi62	133.2	I6	68.3	SAM3	69.1	ve021
45.1	mi192	134.0	ATHATPAS	68.6	ve018	72.1	g4117
45.1	mi192	134.6	m532	70.4	LTP	72.1	m249
46.1	mi116	135.0	ADH	71.4	m429	77.6	O97B1
46.1	nga248	135.4	petE	71.4	g4514	84.7	I18
56.2	ve009	135.8	ve011	72.9	O5841	86.1	g4564b
56.8	m254A	136.9	agp64	72.9	nga168	89.2	g4014
57.9	m253	140.1	g17311	78.4	m336	93.5	m457
58.5	P39B2T7	141.1	m132A	81.0	ve019	94.6	mi456
59.1	ve010	141.1	mi157	81.6	mi473	99.7	t94
60.8	mi423a	150.4	pAtT32CX	83.7	Athb7	101.8	ve022
62.5	RPS18B			86.4	mi455	104.5	g2778
64.1	mi63			86.4	mi79a	104.5	BGL1
65.1	mi19			88.8	pAtT51	108.1	spl1

Chromosome IV				Chromosome V
(1)	(2)	(1)	(2)	(1)	(2)	(1)	(2)
0.0	BIO217	67.1	m226	0.0	O5629	77.3	mi323
3.7	mi51	67.1	g13683	3.2	um515D	78.4	m247
6.6	mi204	67.1	C18a	3.2	g3715	82.7	g4028
7.2	mi122	67.6	g4539	6.1	ctr1	85.9	DFR
10.4	g3843	68.1	mi112	8.0	mi121	87.0	spl2
13.7	I41G	68.1	mi330	8.0	nga225	88.2	mi194
16.6	mi301	68.6	g3845	9.0	ASA1	88.7	agp6
18.7	ve023	70.8	mi32	10.0	O6569	89.2	mi83
18.7	mi390	73.0	AG	10.5	m217	92.5	mi423b
20.5	GA1	73.0	g19838	11.5	g3837	93.1	mi61
22.4	Gslohp	73.9	pCITd71	13.0	mi97	96.3	ve027
25.3	petC	73.9	g3883	14.6	KG31	99.8	mi271
26.7	m448A	75.5	JGB9	16.5	nga158	100.3	mi226
.7	mi233	79.1	mi422	19.4	nga249	101.6	nga129
27.7	g2616	81.9	mi475	22.2	ca72	106.2	m435
28.2	mi306	83.0	agp66	22.7	nga151	106.3	LFY3
28.7	m506	83.5	SEP2B	22.7	mi174	111.8	m558A
29.5	BIO200	85.5	RPS2	23.3	CHS	112.9	mi70
30.3	mi167	86.5	m600	26.9	mi322	112.9	mi418
30.8	mi87	86.9	PG11	26.9	mi438	112.9	mi74a
30.8	m456A	87.3	mi123	26.9	nga106	112.9	mi184
33.4	nga12	88.4	mi232	28.9	ve026	114.0	mi69
36.1	nga8	88.4	RLK5	29.7	g4560	114.0	SEP5A
39.7	RPS18C	90.6	H1	30.5	TSL	115.3	agp50
40.2	pCITf3	92.8	g8300	32.2	mi138	118.9	m211A
41.7	H2761	93.9	mi431	35.6	mi90	125.4	g2368
41.7	m518A	94.6	O6455	36.1	mi433	126.5	ve028
47.1	BIO206	94.6	g3088	36.1	GslbutAr	132.6	m555
49.4	pCITd23	95.4	pCITd104	37.8	m291	134.9	mi335
52.6	g4108	100.4	pCITd76	44.1	g4715b	138.2	BIO205
55.0	mi465	100.4	pCITd99	47.9	g4556
57.8	mi128	104.4	m214	48.8	nga139
58.9	g6837	108.8	AP2	50.1	mi219
58.9	g2620	114.7	g2486	50.1	AF3
59.9	um713B1	115.8	um596A	52.6	Tn139
59.9	mi279	119.4	ve025	57.9	mi125
60.4	um713B2	124.4	mi369	62.3	um579D
60.4	mi30	124.4	DHS1	64.0	nga76
60.9	g10086	124.4	g3713	64.6	mi291b
60.9	g4564a			67.5	mi137
61.9	m326			68.5	GRF3
61.9	ve024
62.4	mi198
64.5	mi260
NOTES: RFLP map prepared by Clare Lister and Caroline Dean, John Innes Centre, Norwich, UK. Map was generated using recombinant inbred lines of Arabidopsis thaliana from a cross between Landsberg and Columbia ecotypes. MapMaker software provided the linkage analysis of RFLP scores; recombination frequencies were converted to map distances using the Kosambi mapping function. The maps of chromosomes I, II, III, and V were generated entirely using MapMaker. The map of chromosome IV was based on the physical map data from Renate Schmidt and coworkers. The columns under each chromosome name represent (1) the cumulative centimorgan distance along the chromosome and (2) the corresponding locus.
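
The Kosambi mapping function mentioned in the notes converts a recombination frequency r into a map distance d, with d = 25 ln[(1 + 2r) / (1 - 2r)] in centimorgans. The Python sketch below shows this standard formula and its inverse. It is an illustration only, not the MapMaker implementation, and the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Map distance in centimorgans for recombination frequency r.

    Standard Kosambi function: d = 25 * ln((1 + 2r) / (1 - 2r)),
    defined for 0 <= r < 0.5.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse: recombination frequency for a distance in centimorgans.

    r = 0.5 * tanh(2d), with d expressed in Morgans.
    """
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


# Example: a 10% recombination frequency between two markers.
print(f"r = 0.10 -> {kosambi_cm(0.10):.2f} cM")
# Example: adjacent markers 2.4 cM apart on the map.
print(f"2.4 cM  -> r = {kosambi_r(2.4):.4f}")
```

For small r the distance in centimorgans is close to 100 r. The correction grows as r approaches 0.5, which is why distances between widely separated markers are obtained by summing shorter intervals rather than from a single recombination frequency.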

May 1989 First U.S. planning workshop (National Science Foundation, Washington, DC)

July 1989 Second U.S. planning workshop (Cold Spring Harbor, NY)

October 1989 International planning workshop (Bloomington, IN); UK Agricultural and Food Research Council (AFRC) establishes a coordinated program on Arabidopsis biology under the Plant Molecular Biology (PMB) research initiative

December 1989 Seed stock center at University of Nottingham, UK, is established

February 1990 First issue of the AFRC PMB Arabidopsis newsletter is published

April 1990 The Multinational Science Steering Committee meets in Denver, CO, and drafts a long-range plan for the Multinational Coordinated Arabidopsis thaliana Genome Research Project

May 1990 The European Community (EC) launches a transnational Arabidopsis genome research project, a 3-year T-project under Biotechnology Research for Innovation, Development, and Growth in Europe (BRIDGE), aimed at technology development

June 1990 Participants of the Fourth International Conference on Arabidopsis Research in Vienna, Austria, endorse the long-range plan drafted by the steering committee in April; in Washington, DC, the Department of Energy, the National Institutes of Health, the National Science Foundation (NSF), and the Department of Agriculture sign an interagency agreement to collaborate on Arabidopsis genome research

July 1990 A worldwide Arabidopsis electronic bulletin board is established

August 1990 Long-range plan for the Multinational Coordinated Arabidopsis thaliana Genome Research Project is published (NSF 90-80)

October 1990 NSF receives funding to begin Arabidopsis genome research initiative

February 1991 A DNA clone center is established in Köln, Germany, as part of the EC BRIDGE project

April 1991 First annual progress report for the Multinational Coordinated Arabidopsis thaliana Genome Research Project is published (NSF 91-60)

September 1991 French cDNA and mapping project begins; An Arabidopsis thaliana Database (AAtDB) is established at Massachusetts General Hospital in Boston, MA; Arabidopsis Biological Resource Center at Ohio State University and its associated database, Arabidopsis Information Management System (AIMS), are established

January 1992 EC-U.S. workshop on managing Arabidopsis genome data (Boston, MA)

March 1992 U.S. cDNA sequencing project at Michigan State University in East Lansing, MI, is initiated

August 1992 Second annual progress report for the Multinational Coordinated Arabidopsis thaliana Genome Research Project is published (NSF 92-112)

October 1992 Second phase of the AFRC PMB Arabidopsis program begins in UK

November 1992 First set of Arabidopsis expressed sequence tag data is entered into the Database of Expressed Sequence Tags (dbEST), a public database at the U.S. National Center for Biotechnology Information, Bethesda, MD

June 1993 Workshop on database needs for Arabidopsis genome research (Dallas, TX)

August 1993 Fifth International Conference on Arabidopsis Research (Columbus, OH); Germany establishes a special research program, titled "Arabidopsis as a Model for the Genetic Analysis of Plant Development"

November 1993 Spain establishes an Arabidopsis research network

December 1993 European Scientists Sequencing Arabidopsis Project begins in EC; third annual progress report for the Multinational Coordinated Arabidopsis thaliana Genome Research Project is published (NSF 93-173); UK Arabidopsis electronic bulletin board is established

January 1994 ARANED, an Arabidopsis research group, is established in The Netherlands

March 1994 Banbury conference on issues related to large-scale sequencing of the Arabidopsis genome (Cold Spring Harbor, NY)

June 1994 Planning workshop for a coordinated Arabidopsis genome sequencing project (Arlington, VA)

July 1994 "End of BRIDGE - Beginning of ESSA" EC workshop (Cambridge, UK)

November 1994 First issue of Weeds World, an electronic worldwide newsletter for the Multinational Coordinated Arabidopsis thaliana Genome Research Project, is published

January 1995 EC announces plans for the Fourth Framework Program (1996-99), which aims to sequence 10 megabases of the genome and set up a systematic function search program for the Arabidopsis genome

The following people contributed information, reviews, or illustrative material to this report. Members of the Multinational Science Steering Committee are indebted to their many colleagues who offered their time and talent.

Mary Anderson, Nottingham University,
Nottingham, UK

Frederick Ausubel, Massachusetts General
Hospital, Boston, MA, USA

Barbara Baker, U.S. Department of Agriculture,
Plant Gene Expression Center, Albany, CA, USA

Philip Benfey, New York University,
New York, NY, USA

Michael Bevan, John Innes Centre, Norwich, UK

Barbara Buchanan, National Agricultural Library,
Beltsville, MD, USA

Sam Cartinhour, U.S. Department of Agriculture,
National Agricultural Library, Beltsville, MD, USA

J. Michael Cherry, Stanford University,
Stanford, CA, USA

Joanne Chory, Salk Institute, La Jolla, CA, USA

George Coupland, John Innes Centre, Norwich, UK

Jeff Dangl, Max Planck Institute, Cologne, Germany

Keith Davis, Ohio State University,
Columbus, OH, USA

Caroline Dean, John Innes Centre, Norwich, UK

Michel Delseny, University of Perpignan,
Perpignan, France

Joseph Ecker, University of Pennsylvania,
Philadelphia, PA, USA

Mark Estelle, Indiana University,
Bloomington, IN, USA

Ellen V. Kearns, Massachusetts Institute of
Technology, Cambridge, MA, USA

Maarten Koornneef, Agricultural University,
Wageningen, The Netherlands

David Kristofferson, Intelligenetics,
Palo Alto, CA, USA

Peggy Lemaux, University of California,
Berkeley, Berkeley, CA, USA

Bertrand Lemieux, York University,
North York, Ontario, Canada

Hong Ma, Cold Spring Harbor Laboratory,
Cold Spring Harbor, NY, USA

Sheila McCormick, U.S. Department of Agriculture,
Plant Gene Expression Center, Albany, CA, USA

Susan McCarthy, U.S. Department of Agriculture,
National Agricultural Library, Beltsville, MD, USA

Jon Monroe, James Madison University,
Harrisonburg, VA, USA

John Morris, Massachusetts General Hospital,
Boston, MA, USA

Hong Gil Nam, POSTECH, Kyungbuk,
Republic of Korea

Thomas Newman, Michigan State University,
East Lansing, MI, USA

Kiyotaka Okada, National Institute of Basic
Biology, Okazaki, Japan

Georges Pelletier, Ministere de l'Agriculture-INRA,
Versailles, France

Sakti Pramanik, Michigan State University,
East Lansing, MI, USA

Daphne Preuss, University of Chicago,
Chicago, IL, USA

Ernie Retzel, University of Minnesota,
Minneapolis, MN, USA

Chris Somerville, Carnegie Institution of
Washington, Stanford, CA, USA

Randy Scholl, Ohio State University,
Columbus, OH, USA

Brian Staskawicz, University of California,
Berkeley, CA, USA

The following individuals served as consultants during the preparation of this report.

Machi Dilworth, National Science Foundation,
Arlington, VA, USA

Etienne Magnien, Commission of the European
Communities, Brussels, Belgium

Jerome Miksche, U.S. Department of Agriculture,
Beltsville, MD, USA

Jim Tavares, U.S. Department of Energy,
Washington, DC, USA

Anton Vassarotti, Commission of the European
Communities, Brussels, Belgium