First published November 23, 2004 as JAMIA PrePrint; doi:10.1197/jamia.M1641
J Am Med Inform Assoc. 2005;12:207-216. DOI 10.1197/jamia.M1641.
© 2005 American Medical Informatics Association


Research Paper

Text Categorization Models for High-Quality Article Retrieval in Internal Medicine

Yindalon Aphinyanaphongs, MS, Ioannis Tsamardinos, PhD, Alexander Statnikov, MS, Douglas Hardin, PhD and Constantin F. Aliferis, MD, PhD

Affiliations of the authors: Departments of Biomedical Informatics (YA, IT, AS, CFA) and Mathematics (DH), Vanderbilt University, Nashville, TN.

Correspondence and reprints: Yindalon Aphinyanaphongs, MS, Department of Biomedical Informatics, 4th Floor, Eskind Biomedical Library, 2209 Garland Avenue, Vanderbilt University, Nashville, TN 37232; e-mail: <ping.pong{at}vanderbilt.edu>.

Received for publication: 06/17/04; accepted for publication: 11/17/04.

Objective Finding the best scientific evidence that applies to a patient problem is becoming increasingly difficult because of the exponential growth of medical publications. The objective of this study was to apply machine learning techniques to automatically identify high-quality, content-specific articles for one time period in internal medicine and to compare their performance with the earlier Boolean-based PubMed clinical query filters of Haynes et al.

Design The selection criteria of the ACP Journal Club for articles in internal medicine were the basis for identifying high-quality articles in the areas of etiology, prognosis, diagnosis, and treatment. Naïve Bayes, a specialized AdaBoost algorithm, and linear and polynomial support vector machines were applied to identify these articles.
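
As an illustration of the simplest of the learners named above, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is not the authors' implementation: the whitespace tokenizer, the class name `NaiveBayesTextClassifier`, and the four toy training documents are assumptions made for the example, and the abstract does not state how articles were represented as features.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (an assumed, deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs: List[str], labels: List[int]) -> "NaiveBayesTextClassifier":
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        # Vocabulary is the union of all words seen in any class.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def log_scores(self, doc: str) -> Dict[int, float]:
        """Return the unnormalized log posterior for each class."""
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / n_docs)  # class prior
            for word in tokenize(doc):
                if word in self.vocab:  # words never seen in training are ignored
                    smoothed = (self.word_counts[c][word] + 1) / (total_words + len(self.vocab))
                    score += math.log(smoothed)
            scores[c] = score
        return scores

    def predict(self, doc: str) -> int:
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical toy data: 1 = high-quality treatment article, 0 = other.
train_docs = [
    "randomized controlled trial of aspirin",
    "double blind randomized trial",
    "case report of a rare rash",
    "editorial on hospital funding",
]
train_labels = [1, 1, 0, 0]

model = NaiveBayesTextClassifier().fit(train_docs, train_labels)
prediction_trial = model.predict("randomized trial of statins")
prediction_report = model.predict("case report of funding")
```

The difference between the two class log scores can also serve as a ranking score, which is what ranking-based evaluation measures require.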

Measurements The machine learning models were compared in each category with each other and with the clinical query filters using the area under the receiver operating characteristic curve, 11-point average recall-precision, and a sensitivity/specificity match method.
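
The two ranking measures named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the area under the ROC curve is computed as the fraction of positive/negative pairs ranked correctly (ties counted as one half), and the 11-point measure is assumed to be the usual average of interpolated precision at recall levels 0.0, 0.1, ..., 1.0. The function names and the toy labels and scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def eleven_point_average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Tuple[float, float]] = []  # (recall, precision) after each rank
    hits = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        hits += label
        points.append((hits / total_positives, hits / rank))
    interpolated = []
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= this level.
        candidates = [precision for recall, precision in points if recall >= level]
        interpolated.append(max(candidates) if candidates else 0.0)
    return sum(interpolated) / 11


# Hypothetical toy ranking: two relevant articles among four.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(toy_labels, toy_scores)
avg_precision = eleven_point_average_precision(toy_labels, toy_scores)
```

Both measures depend only on the ordering of the scores, so they allow learners with differently scaled outputs to be compared directly.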

Results In most categories, the data-induced models have sensitivity, specificity, and precision better than or comparable to those of the clinical query filters. The polynomial support vector machine models perform the best among all learning methods in ranking the articles, as evaluated by area under the receiver operating characteristic curve and 11-point average recall-precision.
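
A Boolean filter yields a single sensitivity/specificity pair, whereas a learned model yields a score that can be thresholded anywhere. The abstract names a sensitivity/specificity match method without describing it, so the sketch below shows only one plausible reading, offered as an assumption: pick the highest score threshold at which the model's sensitivity reaches the filter's, then compare specificities. The function names and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int], predictions: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


def match_sensitivity(
    labels: Sequence[int], scores: Sequence[float], target_sensitivity: float
) -> Tuple[float, float, float]:
    """Find the highest threshold whose sensitivity reaches the target.

    Returns (threshold, sensitivity, specificity) at that threshold.
    """
    for threshold in sorted(set(scores), reverse=True):
        predictions = [1 if s >= threshold else 0 for s in scores]
        sens, spec = sensitivity_specificity(labels, predictions)
        if sens >= target_sensitivity:
            return threshold, sens, spec
    raise ValueError("target sensitivity cannot be reached")


# Hypothetical toy data: six articles, three of them relevant.
labels = [1, 1, 0, 1, 0, 0]
model_scores = [0.95, 0.8, 0.7, 0.6, 0.3, 0.1]
filter_predictions = [1, 1, 1, 1, 1, 0]  # output of an assumed Boolean filter

filter_sens, filter_spec = sensitivity_specificity(labels, filter_predictions)
threshold, model_sens, model_spec = match_sensitivity(labels, model_scores, filter_sens)
```

In this toy case both systems retrieve every relevant article, and the model does so while rejecting two of the three irrelevant ones against the filter's one. The same procedure can be mirrored to fix specificity and compare sensitivity.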

Conclusion This research shows that machine learning methods can automatically build models for retrieving high-quality, content-specific articles in internal medicine for a given time period, using inclusion or citation by the ACP Journal Club as a gold standard, and that these models perform better than the 1994 PubMed clinical query filters.



