509 E 3RD ST
PLANT GENOME RESEARCH RESOURCE
Program Reference Code(s):
1228, 1329, 9109, 9150, 9178, 9179, BIOT
Program Element Code(s):
PI: Volker P. Brendel (Indiana University)
CoPIs: Karin Dorman (Iowa State University), Shannon Schlueter (University of North Carolina - Charlotte) and Shailesh Lal (Oakland University)
Senior Personnel: Jon Duvick and Yasser El-Manzalawy (Iowa State University)
The premise of this project is that the scale of sequence and other data accumulation in plant genomics necessitates the development of novel, highly automated, scalable, comprehensive, and accurate approaches to genome annotation. The depth of transcript data accumulating for many plant species under numerous experimental conditions provides unprecedented evidence for the evaluation of all aspects of transcription, including precise mapping of transcription start sites as well as dominant and alternative splice sites. This project engages a team of experts in a wide range of fields, including genomics, molecular biology, bioinformatics, statistics, machine learning, high performance computing, and software engineering, to work jointly toward a solution for accurately predicting the expressed protein-coding gene transcriptome from plant genome sequences. Successful completion of the project will result in the deployment of (1) software that implements the novel prediction algorithms, (2) visualization and data access portals, and (3) a cyberinfrastructure environment implementation of the developed tools for distributed computing, sharing of protocols, and analysis provenance recording. In the long run, the project seeks to explore the extent to which genomic biology can transition from a largely descriptive to a highly predictive science driven by quantitative measurements, with algorithms and computation as the domain-adapted language.
The project will generate standardized, accurate protein-coding gene structure annotation for 25 plant genomes from a wide range of the phylogenetic spectrum. Initial emphasis will be on improved annotation of recently sequenced genomes, which will benefit the entire community of researchers working on these important crops. The anticipated algorithms for transcriptome prediction will be essential to the analysis of the thousands of complete plant genome sequences likely to become available within the next few years. Through the development of reliable gold standard annotations and the dissemination of training and test sets for algorithmic development, a larger community of computational data analysts, in particular from the machine learning community, will be engaged. All software developed and data generated in this research are freely available through project Web sites, in particular www.plantgdb.org. The project's plan for integration of research and education will train a new generation of scientists to work on genomics data with the broad range of interdisciplinary approaches represented by the project team.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Lynch BT, Patrick TL, Moreno JJ, Siebert AE, Klusman KM, Shodja DN, Hannah LC, Lal SK. "Differential pre-mRNA Splicing Alters the Transcript Diversity of Helitrons Between the Maize Inbred Lines," G3: Genes, Genomes, Genetics, v.5, 2015, p. 1703.
Standage DS, Brendel VP. "ParsEval: parallel comparison and analysis of gene structure annotations," BMC Bioinformatics, v.13, 2012, p. 187.
Barbaglia AM, Klusman KM, Higgins J, Shaw JR, Hannah LC, Lal SK. "Gene capture by Helitron transposons reshuffles the transcriptome of maize," Genetics, v.190, 2012, p. 965.
Rauch HB, Patrick TL, Klusman KM, Battistuzzi FU, Mei W, Brendel VP, Lal SK. "Discovery and Expression Analysis of Alternative Splicing Events Conserved Among Plant SR Proteins," Molecular Biology and Evolution, v.31, 2014, p. 605.