1. Genome Analysis

    2. Biological Resource Centers

    3. Informatics

    4. Workshops and Symposia

    Arabidopsis as a Teaching Tool

    Isolation of Genes Spurs Disease Resistance

Since the initial long-range research plan was developed in 1990, knowledge about the Arabidopsis genome has increased to the point where a large-scale, systematic sequencing of the entire 100 megabase (Mb) genome is a realistic goal. This section summarizes progress made during the past year on research on the Arabidopsis genome. Ongoing efforts in building research tools and resources are also described.

1. Genome Analysis

Large-scale DNA Sequencing

Last year's report outlined two projects -- the complementary DNA (cDNA) sequencing projects -- which were designed to systematically sequence expressed genes of Arabidopsis. The cDNA projects are producing partial sequences called expressed sequence tags (ESTs). These ESTs are sequences unique to each cDNA, and serve as markers. During the past year, cDNA sequencing efforts have continued to produce useful information and EST markers, both of which are widely used by the Arabidopsis research community. In addition, a new effort -- the European Scientists Sequencing Arabidopsis (ESSA) Project -- is underway to start a systematic sequencing of the entire Arabidopsis genome.

European Scientists Sequencing Arabidopsis Project: Systematic sequencing of the Arabidopsis genome began last year, when a European Union network, funded by the European Community, was set up to sequence -- on a pilot scale -- 2 megabases (Mb) of chromosome IV and 0.5 Mb of other regions of genomic DNA. The ESSA effort also encompasses partial sequencing of 3,000 novel cDNAs, development of a sequence informatics node, and preparation of sequence-ready libraries. The project started up in September 1993, and most labs began work by early 1994.

The largest contiguous region sequenced so far has been 16 kilobases (kb) surrounding the GAP-A gene on chromosome III. In addition to the GAP-A gene, four novel open reading frames and a retrotransposon have been identified, as well as a peculiar AT-rich tract. The density of genes in this area, together with the availability of a means to identify open reading frames, is promising. A density of one gene every 4 to 5 kb is expected; this is close to what was previously predicted. Based on a genome size of 100 Mb, this suggests a total of 20,000 to 25,000 genes. Data are still arriving on the large regions of chromosome IV, as the participating labs have had to learn new methods of large-scale sequencing. Regions of overlap show a high degree of accuracy in the independently sequenced areas. By early 1995, the first year's quota of 350 kb of chromosome IV should be done. An analysis of this region will provide new information on gene density, clustering of gene families, the composition of intergenic DNA, and the sequences of novel plant genes.
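The gene-count estimate above is simple arithmetic, and a short sketch makes it explicit. The function name is hypothetical, and the calculation assumes a uniform gene density across the whole genome and 1 Mb = 1,000 kb; the report itself gives only the inputs (100 Mb, one gene per 4 to 5 kb) and the resulting range.

```python
def estimate_gene_count(genome_size_mb: float, kb_per_gene: float) -> int:
    """Estimate total genes, assuming genes are spaced evenly along the genome."""
    genome_size_kb = genome_size_mb * 1000  # assumption: 1 Mb = 1,000 kb
    return round(genome_size_kb / kb_per_gene)


GENOME_SIZE_MB = 100  # genome size quoted in the report

# One gene every 5 kb gives the lower bound; one every 4 kb gives the upper bound.
low = estimate_gene_count(GENOME_SIZE_MB, kb_per_gene=5)
high = estimate_gene_count(GENOME_SIZE_MB, kb_per_gene=4)
print(f"Estimated gene count: {low:,} to {high:,}")
```

Because the density was observed in a single 16 kb region, the range is best read as a rough extrapolation rather than a measurement.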

The major limiting step in systematic genome sequencing is the provision of sequence-ready libraries: Present cosmid (i.e., small DNA fragments cloned from the genome) coverage accounts for only 80 percent of the regions to be sequenced in the next 2 years. The increased effort being put into yeast artificial chromosome (YAC -- i.e., cloned DNA corresponding to a large fragment of the genome) coverage means that YACs must be the main source of sequence substrates. Consequently, new methods for deriving random libraries from YACs are under study.

For more information, contact Michael Bevan, John Innes Centre, Colney Lane, Norwich, NR4 7UJ, UK; phone: 44-16-03-52571, ext. 2518/2520; fax: 44-16-03-505725; e-mail:

Large-scale cDNA Sequencing -- The French Program: This is the third year of the French cDNA sequencing program. Seven libraries have been used, representing tissue from etiolated seedlings, cell suspensions, green shoots and leaves, flower buds, immature siliques, dry seeds, and wounded leaves.

The project's major change in 1994 was that its initial support from CNRS (Centre National de la Recherche Scientifique) would no longer be available for EST work. Funding has subsequently been taken over by the ESSA program, although GREG (Groupement de Recherches et d'Etudes sur les Genomes) supports sequencing of full length cDNAs. ESSA only provides for sequencing of new clones. Because of the redundancy within and between libraries and efforts by the American consortium, new genes are found less frequently. As a result, different groups have set up screening procedures to increase chances of finding new genes.

With the availability of the new YAC library, more efforts will focus on mapping cDNAs on the Arabidopsis chromosomes. About 70 cDNAs have already been mapped by the Institut National de Recherche Agronomique (INRA) group in Versailles. All together, the French teams have now released about 4,100 ESTs -- to the European Molecular Biology Laboratory database and the Database of Expressed Sequence Tags (dbEST) -- out of the 6,000 that have been sequenced. Also, most of these clones have been sent to the Ohio State University Arabidopsis Biological Resource Center (ABRC) for distribution. This represents a deposit of 2,500 ESTs for 1994, of which 1,700 correspond to 650 new genes with 5' and 3' tags. So far, ABRC has distributed 460 clones to 170 people, in addition to those directly distributed by French groups. Finally, four teams are participating in the ESSA genomic sequencing effort: They have determined 74 kb around four different loci.

For more information, contact Michel Delseny, URA 565 CNRS, University of Perpignan, Perpignan 66860, France; phone: 33-68-662119; fax: 33-68-668499; e-mail:

Large-scale cDNA Sequencing -- The Michigan State University (MSU) Program: The goal of the MSU Arabidopsis cDNA sequencing project is to produce 36,000 ESTs in order to identify more than 80 percent of the genes expressed by this organism. The cDNA library being used, PRL2, is composed of cDNAs generated from equal quantities of four pools of messenger RNA (mRNA). These four mRNA sources were 7-day-old etiolated seedlings; roots grown in tissue culture; rosettes from plants (staged weekly), half with a 24-hour light cycle and half on a 16-hour light/8-hour dark cycle; and aerial tissue (stems, flowers, and siliques) from the staged plants. The Ziplox vector was used for directional insertion of the oligo-dT primed mRNA. Until normalization of this library is achieved, it is screened to eliminate the most redundant clones.

The project was initiated in 1992 with funding from the U.S. Department of Energy and the State of Michigan; in late 1993, the National Science Foundation (NSF) granted project funding for 3 years. Since February 1994, technical personnel, working with two ABI 373A automated fluorescent sequencers and an ABI Catalyst 800 molecular biology workstation, have produced 7,400 quality sequences, or an average of 925 sequences per month. The average edited sequence is more than 350 bases in length.
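As a rough check on these production figures, the sketch below relates the sequences produced so far to the monthly average and to the project's stated goal of 36,000 ESTs. The function names are hypothetical, and the projection assumes the 925-per-month rate stays constant, which the report does not claim.

```python
SEQUENCES_TO_DATE = 7_400    # quality sequences produced since February 1994
SEQUENCES_PER_MONTH = 925    # average monthly output quoted in the report
EST_GOAL = 36_000            # project goal stated in the report


def months_elapsed(produced: int, rate_per_month: float) -> float:
    """Months of production implied by the total and the average rate."""
    return produced / rate_per_month


def months_remaining(goal: int, produced: int, rate_per_month: float) -> float:
    """Months still needed to reach the goal, assuming a constant rate."""
    return max(goal - produced, 0) / rate_per_month


elapsed = months_elapsed(SEQUENCES_TO_DATE, SEQUENCES_PER_MONTH)
remaining = months_remaining(EST_GOAL, SEQUENCES_TO_DATE, SEQUENCES_PER_MONTH)
print(f"Implied production period: {elapsed:.1f} months")
print(f"Months to reach goal at the same rate: {remaining:.1f}")
```

The projection ignores redundancy screening and any change in staffing or equipment, so it only indicates the order of magnitude of the remaining effort.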

The project's biocomputing group, based in Minneapolis, Minnesota, edits and analyzes the sequences. The group has programs that analyze the quality of the sequence and trim off vector and 3' low-quality regions. The edited sequence is then formatted and deposited in dbEST at the National Center for Biotechnology Information (NCBI). Data from the project, including fully tabulated BLASTX and BLASTN analyses on the clones, are available through the World Wide Web (WWW; see below for contact information).

Comparing several ESTs to previously sequenced clones indicates that over half the cDNA clones are essentially full length, i.e., they encode the translational start site. More than one-third of the ESTs have significant homology to known genes. Of the ESTs that show similarity to known genes, about 10 percent correspond to genes that have not yet been identified in any plant species.

The biological materials from this project include cDNA clones and the PRL2 library. More than 1,500 cDNA clones and 125 aliquots of the PRL2 library have been sent to laboratories worldwide.

For information about the MSU project, contact Thomas Newman, MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI 48824-1312, USA; phone: 517-353-0854; fax: 517-353-9168; e-mail: Project data are available at and; boolean keyword searches can be performed on the data using XMOSAIC. The biological materials from this project can be obtained from ABRC.

Genome Mapping

Maps provide valuable reference points for research in molecular genetics. Recently, great efforts have been made to expand two types of maps for genetic information in Arabidopsis: a genetic map, which plots the estimated arrangement of genes on each chromosome; and a physical map, which determines the actual distances between markers on a chromosome in terms of kilobases. Completion of the high-density physical map and integration of the genetic and physical maps are two goals of the Multinational Coordinated Arabidopsis thaliana Genome Research Project.

Mapping Mutants: Significant advances were made in 1994 in mapping the chromosomal locations of mutant genes. The most recent map of genes identified by mutation, which was compiled by Maarten Koornneef (Wageningen, The Netherlands) and David Meinke (Stillwater, OK), includes more than 280 visible markers distributed over five chromosomes. This represents more than twice the number of mutant genes included on the genetic map just 2 years ago. Embryo-defective mutants, isolated and characterized by David Meinke and colleagues, represent the largest collection of new visible markers added to the genetic map. With recent advances in mapping procedures -- particularly the wide distribution of cleaved amplified polymorphic sequence markers introduced by Frederick Ausubel (Boston, MA) and simple sequence length polymorphism markers developed by Joseph Ecker (Philadelphia, PA) -- it is likely that another 50 to 75 mutant genes of widely different types will be added to the genetic map this year.

Physical Maps: The number of DNA-based markers mapped continued to increase throughout 1994; the total is currently about 379. (See app. B for the latest map.) As new markers became available, they were hybridized to the YAC libraries to increase YAC coverage of each chromosome.

The biggest advances in the linking of YAC contigs were achieved once the YAC library -- prepared in a collaborative effort based in France by the Center for the Study of Human Polymorphisms (CEPH), INRA, and CNRS (a collaboration known as CIC) -- was distributed in May 1994. This library consists of 1,152 clones with an average insert size of 450 kb; it contains very few chimeric YAC clones.
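The library figures above imply a nominal depth of genome coverage, which the sketch below computes. It assumes the 100 Mb genome size quoted earlier in this report, treats every clone as having the average insert size, and ignores chimeric clones and uneven representation; the function name is hypothetical.

```python
def genome_equivalents(n_clones: int, avg_insert_kb: float, genome_size_mb: float) -> float:
    """Nominal library depth: total cloned DNA divided by the genome size."""
    total_insert_kb = n_clones * avg_insert_kb
    genome_size_kb = genome_size_mb * 1000  # assumption: 1 Mb = 1,000 kb
    return total_insert_kb / genome_size_kb


# CIC YAC library figures from the report; 100 Mb is the genome size used in this report.
depth = genome_equivalents(n_clones=1_152, avg_insert_kb=450, genome_size_mb=100)
print(f"Nominal coverage: about {depth:.1f} genome equivalents")
```

Nominal depth of this kind says nothing about whether any particular region is actually represented, which is why contig gaps can remain even in a deep library.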

The library is being used by the Joseph Ecker (Philadelphia, PA) and Howard Goodman (Boston, MA) labs to add to the physical maps of chromosomes I, II, and III; and by the Caroline Dean lab (Norwich, UK), for chromosomes IV and V. The combination of the CIC library, increased marker coverage, and a few successful walking experiments has resulted in chromosome IV being covered by just 10 contigs (R. Schmidt, J. West, and C. Dean, all of Norwich). Chromosome V is currently covered by 35 to 40 contigs (R. Schmidt, K. Love, Z. Lenehan, and C. Dean, all of Norwich), after having used over 100 markers on four YAC libraries. Chromosome II (H. Goodman lab) is covered by about 40 contigs, for a total distance of about 17 Mb. The largest contig on chromosome II -- constructed in part using YAC end probes -- is about 3.2 Mb, covering about 5.7 centimorgans (cM). A similar level of coverage has been achieved for chromosome I (J. Ecker lab).
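The chromosome II figures above relate physical distance to genetic distance, and the ratio can be worked out directly. This is a minimal sketch with a hypothetical function name; it uses only the approximate values given for the largest contig (about 3.2 Mb spanning about 5.7 cM) and says nothing about other regions, where the ratio may differ.

```python
def kb_per_centimorgan(physical_mb: float, genetic_cm: float) -> float:
    """Average physical distance (kb) per unit of genetic distance (cM)."""
    return physical_mb * 1000 / genetic_cm  # assumption: 1 Mb = 1,000 kb


# Largest chromosome II contig as described in the report (both values approximate).
ratio = kb_per_centimorgan(physical_mb=3.2, genetic_cm=5.7)
print(f"Roughly {ratio:.0f} kb per cM across this contig")
```

A ratio of this kind is useful when planning a chromosome walk, since it suggests how much DNA may separate a mutant locus from its nearest mapped marker.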

Cosmid contigs from the Howard Goodman laboratory and EST clones are also being integrated into the chromosome II, IV, and V YAC contigs. Efforts are continuing toward generating a 1.5 Mb cosmid contig (I. Bancroft, K. Love, and C. Dean, all of Norwich; C. Cobbett, Melbourne; and H. Goodman), covering the region of chromosome IV which is being sequenced as part of the European Community's ESSA program. Currently, nine cosmid contigs, covering 730 kb, have been restriction mapped and distributed to participating laboratories. Joining these contigs will require new strategies, since sequences within the gaps are repetitive or underrepresented in the cosmid libraries being screened.

Gene Identification

Significant advances have been made in the cloning and molecular characterization of genes originally identified by mutation, and the isolation of new mutants with informative phenotypes. Two conclusions can be drawn from this work:

Since mid-1993, impressive progress has been made in cloning genes identified by mutation. Published examples include genes involved in hormone perception and response (ABI1, AXR1); flowering (LD, PI, AP2, TSL, MS2, CAL, and SUP); vegetative development (GL2, FEY, PFL, and PAC); basic metabolism (FAD2); resistance to plant pathogens (RPS2); perception of light (HY4); essential cellular functions (EMB30); and transduction of environmental and developmental signals (DET1, FUS6, and COP9). Just 10 years ago, this type of progress would have been considered impossible.

Some of the mutant genes are related in sequence to important regulatory genes known from different organisms. Others represent novel sequences which may provide new insights into eukaryotic cell function. Included in this collection are several genes cloned by chromosome walking, one gene cloned by transposon tagging, and a large number of genes cloned by transferred DNA (T-DNA) insertional mutagenesis. Continued availability of a large collection of T-DNA tagged lines, and ongoing advances with transposon tagging and chromosome walking, should lead to further growth in the number of mutant genes cloned in 1995.

Considerable progress has also been made in expanding existing collections of mutants. For example, Chris Somerville (Stanford, CA) and his colleagues reported the identification of about 40 novel mutants with altered cell wall polysaccharide composition, which were isolated by screening 5,000 mutagenized lines by gas chromatography of sugar derivatives. These mutants should permit a new approach to the difficult problems associated with understanding cell wall biosynthesis and function.

Additional genes have also been identified by screening for mutants with phenotypes similar to well-established mutants. A good example is the identification by Ruth Finkelstein (Santa Barbara, CA) of several additional ABI loci involved in mediating responses to abscisic acid. Another example is the discovery of additional members of the leafy cotyledon class of mutants by Peter McCourt (Toronto, Canada); Helmut Baumlein (Gatersleben, Germany); John Harada (Davis, CA); and David Meinke (Stillwater, OK). These few examples underscore the conclusion that further screening of mutagenized populations is likely to yield additional examples of new genes with functions similar to known genes.

More detailed analysis of existing mutants has also resulted in interesting overlaps between phenotypic classes. For example, studies by John Schiefelbein (Ann Arbor, MI) and colleagues have recently shown that the ttg mutant -- known for many years as being defective only in trichome formation and seed coat pigmentation -- also exhibits interesting defects in the spatial distribution of root hairs. Recent efforts in other laboratories have shown that several fusca mutants, first identified by Andreas Muller (Gatersleben, Germany) based on inappropriate accumulation of anthocyanins during embryogenesis, are identical to several de-etiolated and constitutive photomorphogenic mutants. Joanne Chory (La Jolla, CA) and Xing-Wang Deng (New Haven, CT) showed that these latter mutants exhibit interesting defects in photomorphogenesis following germination.

2. Biological Resource Centers

The Arabidopsis resource centers were established in 1991 to preserve and distribute biological materials supporting the Arabidopsis genome research project. These centers also disseminate genome-related information to the large Arabidopsis research community. The three stock centers -- the Arabidopsis Biological Resource Center (ABRC) at Ohio State University in Columbus, Ohio; the Nottingham Arabidopsis Stock Centre (NASC) at the University of Nottingham, United Kingdom; and the European DNA Resource Center at the Max Planck Institute for Genetic Research in Cologne, Germany -- share these duties. The dramatic expansion of Arabidopsis research over the last 5 years is demonstrated by the number of stocks the centers hold and distribute, and the worldwide dispersion of researchers they service.

The resource centers have seeds and clones that are useful for research, especially exploration of the genome. For example, the large seed collections of Albert Kranz (Frankfurt, Germany) and George Redei (Columbia, MO) -- accumulated through their long careers -- have been incorporated into the stock centers. Together, these comprise 1,000 stocks, including many important mutant and wild-type lines collected from all over the world. The mapping collection and mutants of Maarten Koornneef (Wageningen, The Netherlands) are also available. In addition, new mutants affecting development and metabolism are being generated in many Arabidopsis laboratories, and are being shared through the three centers. About 300 such lines are now distributed, and 70 new donations -- received in response to a recent campaign -- were recently made available.

T-DNA lines and transposable element-transformed lines are useful tools for cloning genes with identifiable phenotypes. More than 5,000 T-DNA lines are being distributed by the centers; 1,600 new lines have been received from Kenneth Feldmann (Tucson, AZ). Moreover, about 200 transposon lines are held. The stock centers also have about 100 promoter trap lines, a new resource for isolating genes identified through their expression pattern. More donations of these useful stocks are expected soon.

Recombinant inbred populations have become very useful for genetic mapping. Two recombinant inbred populations, consisting of a total of 450 lines, are held by the stock centers. Trisomic stocks, lines transformed with specific genes, and representatives of related species are also held.

The DNA resources distributed by the stock centers include about 300 restriction fragment length polymorphism (RFLP) mapping clones, four YAC libraries, 50 individual clones, 6,000 ESTs, cDNA libraries, genomic libraries, and filters generated from the YAC libraries suitable for probing. In response to a recent campaign by the Ohio center for donations, a number of cloned genes, new RFLP clones, and cDNA and genomic libraries were deposited. Also newly received are the Arabidopsis RFLP Marker Set donated by T. Schaeffner (Munich, Germany); a cosmid library containing T-DNA transforming sequences from Kenneth Feldmann; and a hybrid library from John Walker (Columbia, MO). Also, Robert Whittier (Tsukuba, Japan) has agreed to donate his P1 bacteriophage library.

Stocks from the resource centers are distributed worldwide. The number of stocks sent has increased significantly in the last 3 years, going from 15,000 total seed stocks distributed in 1992 by ABRC and NASC combined, to about 45,000 seed stocks distributed in 1994. As for DNA, 1,000 clones and six YAC libraries were sent in 1991; just 2 years later, about 3,100 clones and 166 libraries were sent. Distribution of ESTs was started in late 1993: About 1,300 ESTs have been sent out since. Increasing numbers of mutants and new batches of T-DNA lines will be donated to the centers in the near future. Also, clones representing the complete physical map and new ESTs are expected.

Ordering and Contact Information

ABRC: Randy Scholl, Arabidopsis Biological Resource Center, Ohio State University, Columbus, OH 43210, USA; phone: 614-292-9371; fax: 614-292-0603; e-mail:; Arabidopsis Information Management Systems (AIMS) WWW server URL

NASC: Mary Anderson, Nottingham Arabidopsis Stock Centre, Department of Life Science, Nottingham University, Nottingham NG7 2RD, UK; phone: 44-1159-791216; fax: 44-1159-513251; e-mail:; NASC WWW server URL

NASC stock information is distributed through a hard copy seed list, An Arabidopsis thaliana Database (AAtDB), AIMS, and the AAtDB Research Companion gopher server.

European DNA Resource Center: Jeff Dangl, European DNA Resource Center, Max Planck Institute, Carl von Linne-Weg 10, Cologne D-50829, Germany; phone: 49-221-5062-630; fax: 49-221-5062-613; e-mail:

Note that service at this stock center was discontinued as of December 31, 1994.

3. Informatics


An Arabidopsis thaliana Database: AAtDB continues to be a key resource for sharing genetic map information. Its data are presented in graphic, tabular, and text formats. Information in AAtDB is provided by the Arabidopsis community, either directly from investigators or from publicly available collections and databases.

To use the system: Distributed versions of AAtDB need a UNIX workstation running the X-windows display or a Macintosh workstation. AAtDB is available over the Internet without charge via anonymous file transfer protocol (FTP) from in the AAtDB directory.

The UNIX-based ACeDB software and its C source code files are available via anonymous FTP from A WWW version is available through the server at the National Agricultural Library,

AAtDB Research Companion: The AAtDB Research Companion provides Internet access to AAtDB. The Companion is a computer source ( that serves information to Internet users through WWW, gopher, Telnet, and FTP. A link is also provided to the gopher server, which offers all information in AAtDB in text. In addition, the gopher server includes:

For additional information, contact John Morris, Curator, AAtDB project, Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02113, USA; fax: 617-726-6893; e-mail:

Arabidopsis Information Management System: AIMS is an on-line database system running on a central machine at Michigan State University. It was originally developed to support data management, including stock ordering and inventory, at Ohio State University's ABRC. The database is implemented on top of a commercially available Sybase system. All AIMS graphics features are offered in an object-oriented fashion using X-windows. Its nongraphics features can also be accessed on a microcomputer or VT100-type terminal. A Mosaic interface to AIMS is also available; this provides access to AIMS data and stock ordering.

Features: AIMS manages both data and programs such as MapMaker; it includes a general mechanism to input private data and manage output. Data and features now available or to be added soon include:

Stocks can be ordered through on-line AIMS and the Mosaic AIMS interface. AIMS keeps track of all on-line and e-mail stock orders, which can be accessed.

To use AIMS: Telnet to this Internet address: (Telnet or X-windows) or (Mosaic URL).

For more information, contact Sakti Pramanik, Computer Science Department, Michigan State University, East Lansing, MI 48824, USA; phone: 517-353-3177; fax: 517-432-1061; e-mail:

Database of Expressed Sequence Tags: Maintained at the National Center for Biotechnology Information as part of the National Library of Medicine at the National Institutes of Health, Bethesda, Maryland, dbEST carries detailed descriptions of sequences, including putative homology assignments using the Basic Local Alignment Search Tool (BLAST) set of programs. Also, the database has information on contributors, available genetic map locations, and instructions on where to obtain physical DNA clones. The latest release of dbEST as of this writing is version 2.43, which carries over 67,000 entries. Arabidopsis is the third largest group of entries.

Arabidopsis EST entries from NCBI are also held on the AAtDB Research Companion gopher server at Massachusetts General Hospital, Boston, Massachusetts. The EST report files are WAIS indexed to allow rapid searching using keywords.

For more information about dbEST, contact Mark Boguski, National Center for Biotechnology Information, Building 38A, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; phone: 301-496-1475; fax: 301-480-9241; e-mail:; WWW URL:


BioSci Arabidopsis Genome Electronic Conference: The Arabidopsis newsgroup, part of the BioSci/BIONET electronic newsgroup network, is an international collaboration via computer networks. Although BioSci runs over 60 newsgroups on various biology topics, the Arabidopsis group is the best example of Internet use by a community of international scientists working toward a common goal.

The Arabidopsis newsgroup is distributed worldwide through both USENET news, under the name bionet.genome.arabidopsis, and through e-mail. If USENET news access is not available on a personal computer, e-mail subscriptions can be requested by contacting one of the following addresses, based upon location:

Location                            Subscription address
Europe, Africa, and Central Asia
Americas and the Pacific Rim

As of November 1994, there were 878 e-mail subscribers, up 21 percent from the previous year. A complete archive of all Arabidopsis postings is maintained for anonymous FTP and gopher retrieval on the Internet computer in the directory pub/biosci/arabidopsis. WWW users can connect using the URL gopher:// and look in the Arabidopsis folder.

Arabidopsis postings are also indexed in the general biosci.src WAIS source on the computer WAIS software indexes all text in every BioSci newsgroup posting and allows Internet users to search for any text string and then retrieve messages bearing the specified text. The WAIS indexes can be queried using either the gopher software or a WWW browser such as Mosaic, as instructed above. In either case, the option to pick is listed as "Search Bionet USENET Articles." The WAIS source can also be queried by e-mail. For WAISMAIL instructions, send the word "help" in the body of an e-mail message (leave the subject line blank) to For help using the archives, contact

Weeds World, The International Electronic Arabidopsis Newsletter: The Multinational Science Steering Committee endorsed, at its meeting in Amsterdam in June 1994, the production of an electronic newsletter. Shortly thereafter, Weeds World was launched. It is a popular forum for the exchange of information and is used much like the Worm Breeders' Gazette which serves the Caenorhabditis elegans community. Weeds World is published three times a year and distributed through WWW, indexed and carried on the AAtDB Research Comp the Plant Molecular Biology Reporter, courtesy of the International Society of Plant Molecular Biology. There will be no regular hard copies made of the newsletter.

The newsletter is produced by Mary Anderson (Nottingham, UK); Sam Cartinhour (Beltsville, MD); and Randy Scholl (Columbus, OH). John Morris (Boston, MA) archives and indexes it on the AAtDB Research Companion. The first edition was published in November 1994.

The URL addresses are:





Several recent books are devoted to various aspects of the biology of Arabidopsis:

John Bowman, ed. Arabidopsis: An Atlas of Morphology and Development. New York: Springer-Verlag, 1994. 450 pp. ISBN 0-387-94089.

Csaba Koncz, Nam-Hai Chua, and Jeff Schell, eds. Methods in Arabidopsis Research. Singapore: World Scientific, 1992. 482 pp. ISBN 981-02-904-5 (hardback); 981-02-905-3 (paper).

Elliot Meyerowitz and Chris Somerville, eds. Arabidopsis. Cold Spring Harbor, NY: Cold Spring Harbor Press, 1994. 1,300 pp. ISBN 0-87969-428-9.

4. Workshops and Symposia

The number of scientific gatherings involving Arabidopsis researchers continues to grow. A summary of those held in 1994 follows.