Research Paper
a The MITRE Corporation, Bedford, MA
b Center for Biomedical Informatics, Harvard Medical School, Boston, MA
c Department of Computer Science, Brandeis University, Waltham, MA
d Stanford Biomedical Informatics, Palo Alto, CA.
* Correspondence and reprints: John Aberdeen, 202 Burlington Road, Bedford, MA 01730 (Email: aberdeen{at}mitre.org).
Received for publication: 03/13/07; accepted for publication: 06/11/07.
Objective: This paper describes a successful approach to de-identification that was developed to participate in a recent AMIA-sponsored challenge evaluation.
Method: Our approach focused on the rapid adaptation of two existing named entity recognition toolkits, Carafe and LingPipe.
Results: The "out of the box" Carafe system achieved a very good score (phrase F-measure of 0.9664) with only four hours of work to adapt it to the de-identification task. With further tuning, we were able to reduce the token-level error term by over 36% through task-specific feature engineering and the introduction of a lexicon, achieving a phrase F-measure of 0.9736.
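To make the scores above easier to interpret, the following minimal sketch shows how an F-measure is computed from precision and recall, and how a relative error reduction is derived from two scores. The function names are illustrative and not taken from Carafe or LingPipe. Note that the 36% figure reported above is a token-level measurement; the example applies the same arithmetic to the two phrase-level F-measures only, so it yields a different number on a different metric.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta=1.0 gives the balanced F-measure; beta > 1 weights recall
    more heavily, beta < 1 weights precision more heavily.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def relative_error_reduction(score_before, score_after):
    """Fraction of the error (1 - score) removed by an improvement."""
    err_before = 1.0 - score_before
    err_after = 1.0 - score_after
    return (err_before - err_after) / err_before


# Phrase-level F-measures reported in the Results above.
phrase_reduction = relative_error_reduction(0.9664, 0.9736)
print(f"Phrase-level relative error reduction: {phrase_reduction:.1%}")
```

On the phrase-level scores this works out to roughly 21%, which is a reminder that a small change in a high F-measure can correspond to a sizable share of the remaining errors.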
Conclusions: We were able to achieve good performance on the de-identification task by the rapid retargeting of existing toolkits. For the Carafe system, we developed a method for tuning the balance of recall vs. precision, as well as a confidence score that correlated well with the measured F-score.
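The abstract does not describe how the recall vs. precision balance is tuned. As a generic illustration only (not the Carafe method), one common technique is to add a bias to the score of the non-entity label before choosing the best label for each token. The sketch below assumes hypothetical per-token label scores and a hypothetical `"O"` label for non-identifying tokens; a real sequence tagger would decode over whole sequences rather than token by token.

```python
def decode_with_bias(token_scores, other_label="O", bias=0.0):
    """Pick the best label per token after biasing the non-entity label.

    token_scores: list of dicts mapping label -> score (hypothetical input).
    bias < 0 penalizes the non-entity label, so more tokens are tagged as
    identifiers (favoring recall); bias > 0 does the opposite (favoring
    precision).
    """
    labels = []
    for scores in token_scores:
        adjusted = dict(scores)
        adjusted[other_label] = adjusted.get(other_label, 0.0) + bias
        labels.append(max(adjusted, key=adjusted.get))
    return labels


# Hypothetical scores for two tokens; the label names are illustrative.
scores = [
    {"O": 1.0, "PATIENT": 0.8},
    {"O": 0.2, "DOCTOR": 1.5},
]

neutral = decode_with_bias(scores)                 # no bias
recall_leaning = decode_with_bias(scores, bias=-0.5)
precision_leaning = decode_with_bias(scores, bias=2.0)
print(neutral, recall_leaning, precision_leaning)
```

Sweeping the bias over a range of values and measuring precision and recall at each setting on held-out data gives a trade-off curve from which an operating point can be chosen.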