July 25, 2010

Active learning for biomedical citation screening

Abstract

Active learning (AL) is an increasingly popular strategy for reducing the amount of labeled data required to train classifiers, thereby lowering annotator effort. We describe a real-world, deployed application of AL to the problem of biomedical citation screening for systematic reviews at the Tufts Medical Center's Evidence-based Practice Center. We propose a novel active learning strategy that exploits a priori domain knowledge provided by the expert (specifically, labeled features) and extend this model via a linear programming algorithm for situations where the expert can provide ranked labeled features. Our methods outperform existing AL strategies on three real-world systematic review datasets. We argue that evaluation must be specific to the scenario under consideration. To this end, we propose a new evaluation framework for finite-pool scenarios, wherein the primary aim is to label a fixed set of examples rather than simply to induce a good predictive model. We use a method from medical decision theory to elicit the relative costs of false positives and false negatives from the domain expert, constructing a utility measure of classification performance that integrates the expert's preferences. Our findings suggest that the expert can, and should, provide more information than instance labels alone. In addition to achieving strong empirical results on the citation screening problem, this work outlines many important steps for moving away from simulated active learning and toward deploying AL for real-world applications.

Authors

Byron Wallace

Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento

Kevin Small

Tufts University

Carla E. Brodley

Northeastern University

Institutions

Tufts University

Tufts Medical Center
