Objective 4. Bioinformatics in Every Plant Scientist's Research Tool Box
The NPGI has produced, and will continue to produce, enormous amounts of plant genome data, which need to be made accessible to a broad community of scientists in a usable form. One measure of the success of the NPGI will be the direct usefulness of its genome information. Significant and broad efforts should be directed toward programs that enable individuals or groups to access, analyze, and compare data. The engineering of information systems, the development of data-mining tools, and the creation of computation-based predictive models for functional analysis will continue to advance the goals of the NPGI.
Develop informatics tools to access and use plant genome databases
High-throughput genomics technologies have led to a flood of sequence information, gene expression array data, and map data. All databases must have capabilities that allow the broadest possible access by the community. Open access will lead to the widest utilization of the data and to the development of innovative and more sophisticated tools, which, in turn, will enable individuals or groups to access and query all available resources, current and future, in the most imaginative ways possible.
The plant community should use existing informatics tools to the largest extent possible. For example, the Generic Model Organism Database (GMOD) project is a joint project between the National Human Genome Research Institute and the Agricultural Research Service that aims to develop generic software modules for the common functions of a model organism genome database. Networking with this and similar efforts will be an efficient and cost-effective way to leverage investments already made.
Build community databases with standards for interoperability
Databases should be developed that incorporate a common set of standards and interfaces so that individuals located anywhere in the world can make full use of all publicly available resources. One way to make databases interoperable is through the development of controlled vocabularies. The plant community (e.g., Arabidopsis and maize) is already actively participating in the Gene Ontology (GO) consortium, which is developing controlled vocabularies for all model genomes. This effort should be encouraged for any new community and datasets.
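As a minimal sketch of how a controlled vocabulary supports interoperability, the Python example below maps the free-text annotations of two databases onto shared term identifiers so that a single query spans both. The database contents, gene identifiers, and label mapping are hypothetical and serve only as illustration; the identifiers GO:0015979 (photosynthesis) and GO:0006950 (response to stress) follow the Gene Ontology format.

```python
from collections import defaultdict

# Hypothetical free-text annotations from two independent databases.
DB_ONE = {"gene_a1": "photosynthesis", "gene_a2": "stress response"}
DB_TWO = {"gene_m1": "Photosynthetic process", "gene_m2": "response to stress"}

# Illustrative controlled vocabulary: every local label maps to one term ID.
LABEL_TO_TERM = {
    "photosynthesis": "GO:0015979",
    "photosynthetic process": "GO:0015979",
    "stress response": "GO:0006950",
    "response to stress": "GO:0006950",
}


def index_by_term(*databases):
    """Group gene identifiers from all databases under shared term IDs."""
    index = defaultdict(list)
    for database in databases:
        for gene, label in database.items():
            term = LABEL_TO_TERM.get(label.strip().lower())
            if term is not None:  # labels outside the vocabulary are skipped
                index[term].append(gene)
    return dict(index)


genes_by_term = index_by_term(DB_ONE, DB_TWO)
print(genes_by_term["GO:0015979"])
```

Once both databases are expressed in the same vocabulary, one lookup by term identifier returns genes from each, regardless of how the original curators worded their annotations.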
Institute an internationally coordinated data repository mechanism