Submitted on December 20, 2007
Accepted on May 30, 2008
Affiliation of the authors: 1 Regenstrief Institute, Indianapolis, IN; 2 Formerly of the Regenstrief Institute, Indianapolis, IN; Currently of the Lister Hill Center, Bethesda, MD
* To whom correspondence should be addressed.
We set out to create a software tool that accurately removes all patient-identifying information from various kinds of clinical data documents, including laboratory and narrative reports. We created the Medical De-identification System (MeDS), a software tool that de-identifies clinical documents, and performed two evaluations. Our first evaluation used 2,400 Health Level Seven (HL7) messages from 10 different HL7 message producers. After modifying the software based on the results of this first evaluation, we performed a second evaluation using 7,190 pathology report HL7 messages. We compared the results of the MeDS de-identification process to a gold standard in which human reviewers searched the messages for identifying strings. For both evaluations, we calculated the number of successful scrubs, missed identifiers, and over-scrubs committed by MeDS, and we evaluated the readability and interpretability of the scrubbed messages. We categorized all missed identifiers into three groups: 1) complete HIPAA-specified identifiers 2) HIPAA-specified identifiers. Approximately 95% of scrubbed messages were both readable and interpretable. We conclude that MeDS successfully de-identifies a wide range of medical documents from numerous sources and creates scrubbed reports that retain their interpretability, thereby maintaining their usefulness for research.
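The counting step of such an evaluation can be illustrated with a minimal sketch. It is not the MeDS implementation or the scoring procedure actually used in the evaluations. It assumes that the human-reviewed gold standard and the tool output are both available as character-offset spans within a message; the names `Span` and `score_message` are hypothetical, as is the rule that a partially scrubbed identifier counts as missed.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A hypothetical character range within one message."""
    start: int  # inclusive offset
    end: int    # exclusive offset


def covers(outer: Span, inner: Span) -> bool:
    """True if `outer` contains all of `inner`."""
    return outer.start <= inner.start and inner.end <= outer.end


def overlaps(a: Span, b: Span) -> bool:
    """True if the two ranges share at least one character."""
    return a.start < b.end and b.start < a.end


def score_message(gold: list[Span], scrubbed: list[Span]) -> dict[str, int]:
    """Count successful scrubs, missed identifiers, and over-scrubs.

    Assumption: a gold-standard identifier is a successful scrub only if a
    single scrubbed span covers it completely; anything less is a miss.
    A scrubbed span that touches no gold-standard identifier is an over-scrub.
    """
    successful = sum(1 for g in gold if any(covers(s, g) for s in scrubbed))
    missed = len(gold) - successful
    over_scrubs = sum(
        1 for s in scrubbed if not any(overlaps(s, g) for g in gold)
    )
    return {"successful": successful, "missed": missed, "over_scrubs": over_scrubs}


# Illustrative offsets only, not data from the study.
gold_spans = [Span(0, 8), Span(20, 30)]
scrubbed_spans = [Span(0, 8), Span(40, 45)]
print(score_message(gold_spans, scrubbed_spans))
```

Summing these per-message counts over a test set would give totals of the kind reported above. A real evaluation would also need to decide how to treat identifier fragments, which this sketch simply counts as misses.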