
First published February 24, 2006 as JAMIA PrePrint; doi:10.1197/jamia.M1848
J Am Med Inform Assoc. 2006;13:289-301. DOI 10.1197/jamia.M1848.
© 2006 American Medical Informatics Association


Application of Information Technology

An XML-based System for Synthesis of Data from Disparate Databases

Tahsin Kurc, PhD, Daniel A. Janies, PhD, Andrew D. Johnson, BS, Stephen Langella, MS, Scott Oster, MS, Shannon Hastings, MS, Farhat Habib, MS, Terry Camerlengo, BS, David Ervin, BS, Umit V. Catalyurek, PhD and Joel H. Saltz, MD, PhD

Affiliations of the authors: Department of Biomedical Informatics, Ohio State University, Columbus, OH.

Correspondence and reprints: Tahsin Kurc, PhD, Biomedical Informatics Department, Ohio State University, 3184 Graves Hall, 333 West 10th Avenue, Columbus, OH 43210; e-mail: <kurc{at}bmi.osu.edu>.

Received for publication: 04/11/05; accepted for publication: 01/29/06.

Diverse data sets have become key building blocks of translational biomedical research. Data types captured and referenced by sophisticated research studies include high-throughput genomic and proteomic data, laboratory data, imaging data, and outcome data. In this paper, the authors present the application of an XML-based data management system to support integration of data from disparate data sources and large data sets. This system facilitates management of XML schemas and on-demand creation and management of XML databases that conform to these schemas. They illustrate the use of this system in an application for genotype–phenotype correlation analyses. This application implements a method of genotype–phenotype correlation based on phylogenetic optimization of large data sets of mouse SNPs and phenotypic data. The application workflow requires the management and integration of genomic information and phenotypic data from external data repositories and from the results of genotype–phenotype correlation analyses. The implementation supports a complex workflow that includes large-scale phylogenetic tree optimizations and application of Maddison's concentrated changes test to large phylogenetic tree data sets. The data management system also allows collaborators to share data in a uniform way and supports complex queries that target data sets.
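
As a rough illustration of the idea of a schema-conformant XML database that can be queried uniformly, the following minimal Python sketch may help. It is not the authors' system: the `XmlStore` class, the `REQUIRED_FIELDS` tuple standing in for an XML schema, the `<strain>` record layout, and the sample strain names and alleles are all hypothetical and made up for this example. It uses only the standard-library `xml.etree.ElementTree` module and its limited XPath support.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an XML schema: the child elements that every
# <strain> record must contain. A real system would use XML Schema (XSD).
REQUIRED_FIELDS = ("name", "snp", "phenotype")


def conforms(record: ET.Element) -> bool:
    """Return True if the record is a <strain> with every required child."""
    return record.tag == "strain" and all(
        record.find(field) is not None for field in REQUIRED_FIELDS
    )


class XmlStore:
    """A toy in-memory XML database that only accepts conforming records."""

    def __init__(self) -> None:
        self.root = ET.Element("strains")

    def insert(self, xml_text: str) -> None:
        """Parse a record and store it, rejecting non-conforming input."""
        record = ET.fromstring(xml_text)
        if not conforms(record):
            raise ValueError("record does not conform to the strain schema")
        self.root.append(record)

    def query(self, path: str) -> list:
        """Run an ElementTree XPath-subset query against stored records."""
        return self.root.findall(path)


# Illustrative, made-up records (not data from the paper).
store = XmlStore()
store.insert(
    "<strain><name>A</name><snp>G</snp><phenotype>high</phenotype></strain>"
)
store.insert(
    "<strain><name>B</name><snp>T</snp><phenotype>low</phenotype></strain>"
)

# Names of strains whose (single, illustrative) SNP element equals 'G'.
g_strains = [s.findtext("name") for s in store.query("./strain[snp='G']")]
```

The design point this sketch tries to capture is that validation happens at insertion time, so every query can assume a uniform record structure regardless of which source a record came from.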







