As the volume of clinical notes written in natural language grows rapidly, physicians need a tool that automatically extracts information about diseases and treatments. The main difficulty in extracting medical information is that physicians use different words to describe the same disease or treatment. To help physicians interpret and share the disease and treatment information in clinical notes, medical terms must be detected and normalized reliably and effectively. In this study, we detect and normalize medical terms using the UMLS Metathesaurus combined with a document retrieval technique. We treat a medical sentence as a query and a UMLS ontology entry as a document, and apply a language modeling-based information retrieval method of the kind currently used in document retrieval. Because the term frequency in the UMLS dictionary is uniform, we use a domain-specific term frequency in place of the traditional term frequency. To retrieve only the relevant terms among the 900,000 UMLS entries, we also propose an adaptive ranking method that dynamically determines the relevant documents for each query without a static cut-off threshold. In our experiments, the approach outperforms previous methods at detecting and normalizing medical terms in Medline clinical trials, and it can be used to normalize real diagnosis lists in physicians' patient charts.
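The retrieval setup described above can be illustrated with a minimal sketch. It is not the method evaluated in the study, whose scoring function and ranking rule are not given here. The sketch assumes a query-likelihood model with Jelinek-Mercer smoothing, a background distribution estimated from a small domain corpus (standing in for the domain-specific term frequency), and a "largest score gap" rule as one possible adaptive cut-off. All function names, the smoothing weight `lam`, the toy corpus, and the entry identifiers `C1` to `C3` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def score_entry(query_tokens, entry_tokens, domain_tf, domain_total, lam=0.5):
    """Log-likelihood of the query under a smoothed model of one entry.

    The entry model is mixed with a background model whose counts come from
    a domain corpus rather than from the dictionary itself.
    """
    entry_tf = Counter(entry_tokens)
    entry_len = len(entry_tokens)
    vocab_size = len(domain_tf) + 1  # one extra slot for unseen tokens
    score = 0.0
    for tok in query_tokens:
        p_entry = entry_tf[tok] / entry_len if entry_len else 0.0
        # Add-one smoothing keeps the background probability above zero.
        p_domain = (domain_tf.get(tok, 0) + 1) / (domain_total + vocab_size)
        score += math.log(lam * p_entry + (1 - lam) * p_domain)
    return score


def adaptive_cutoff(ranked):
    """Keep the entries ranked above the largest gap between adjacent scores.

    This is an assumed heuristic that avoids a fixed threshold; the cut
    point depends on the score distribution of each query.
    """
    if len(ranked) < 2:
        return ranked
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    return ranked[: cut + 1]


def retrieve(sentence, entries, domain_tf, domain_total):
    """Rank all entries for one sentence, then apply the adaptive cut-off."""
    query_tokens = tokenize(sentence)
    ranked = sorted(
        (
            (entry_id, score_entry(query_tokens, tokenize(name), domain_tf, domain_total))
            for entry_id, name in entries.items()
        ),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return adaptive_cutoff(ranked)


# Toy data: hypothetical identifiers, not real UMLS concept IDs.
entries = {
    "C1": "myocardial infarction",
    "C2": "heart attack",
    "C3": "diabetes mellitus",
}
domain_corpus = tokenize(
    "patient had a heart attack patient has diabetes mellitus "
    "patient had a myocardial infarction"
)
domain_tf = Counter(domain_corpus)

result = retrieve("patient had a myocardial infarction", entries, domain_tf, len(domain_corpus))
print([entry_id for entry_id, _ in result])
```

In this toy run only the entry sharing words with the sentence survives the cut-off, because the gap between it and the remaining entries is the largest in the ranking. A realistic system would also need to handle synonyms that share no surface words with the query (such as "heart attack" here), which this sketch does not attempt.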
Kim et al. (Mon,) studied this question.