March 3, 2026

Efficient Citation Screening by Weak Classifier Ensemble*

Key points

A weak classifier ensemble approach yields significant performance improvements in citation screening.
Experiments on 28 systematic reviews show the method can meet the rigid recall requirement, so it can be used safely in practice.
The ensemble method utilizes multiple large language models to create a balanced pseudo-labelled training set.
Addressing the extreme class imbalance challenge is crucial for enhancing accuracy in systematic review screening.

Abstract

Citation screening in systematic reviews is time-consuming. Machine learning can help semi-automate it but faces obstacles. Each systematic review is a new dataset without initial annotations. Extreme class imbalance, with irrelevant studies far outnumbering relevant ones, makes it difficult to select a good subset of samples to train a classifier. The rigid requirement of (near) total recall of relevant studies demands a careful trade-off between accuracy and recall. This paper pilots a weak classifier ensemble approach to tackle these challenges. The idea of ensembling is employed in two ways. First, multiple cost-effective large language models are applied and their scores averaged to rank candidate studies, creating a balanced pseudo-labelled training set. Second, different sets of pseudo-negative samples are bootstrapped from low-ranked documents, and multiple classifiers are trained and combined to make screening decisions. Experiments on 28 systematic reviews demonstrate significant performance improvements from the weakly supervised classifier ensemble, which also meets the rigid recall requirement needed for safe use in practice.
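
The two uses of ensembling described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the toy documents, the LLM scores, the function names, the `low_rank_fraction` parameter, and the choice of a token-centroid weak classifier are all hypothetical, since the abstract does not specify the models, features, or combination rule.

```python
import random
from collections import Counter

def average_scores(score_lists):
    """Average per-document relevance scores from several LLMs."""
    n = len(score_lists)
    return [sum(col) / n for col in zip(*score_lists)]

def pseudo_label(avg_scores, n_pos, low_rank_fraction=0.5):
    """Top-ranked documents become pseudo-positives; the low-ranked tail
    becomes the pool from which pseudo-negatives are bootstrapped."""
    ranked = sorted(range(len(avg_scores)), key=lambda i: avg_scores[i], reverse=True)
    positives = ranked[:n_pos]
    pool_size = max(n_pos, int(len(ranked) * low_rank_fraction))
    pool = ranked[n_pos:][-pool_size:]  # never overlaps the positives
    return positives, pool

def centroid(docs):
    """Mean token count per document (a deliberately simple feature)."""
    total = Counter()
    for d in docs:
        total.update(d.lower().split())
    return {t: c / len(docs) for t, c in total.items()}

def train_weak(pos_docs, neg_docs):
    """Hypothetical weak classifier: difference of class centroids."""
    pos_c, neg_c = centroid(pos_docs), centroid(neg_docs)
    return {t: pos_c.get(t, 0.0) - neg_c.get(t, 0.0) for t in set(pos_c) | set(neg_c)}

def score(weights, doc):
    return sum(weights.get(t, 0.0) for t in doc.lower().split())

def ensemble_screen(docs, llm_scores, n_pos=2, n_models=5, seed=0):
    """Return, per document, the fraction of weak classifiers voting 'relevant'."""
    rng = random.Random(seed)
    positives, pool = pseudo_label(average_scores(llm_scores), n_pos)
    pos_docs = [docs[i] for i in positives]
    votes = [0.0] * len(docs)
    for _ in range(n_models):
        # Bootstrap (sample with replacement) a balanced pseudo-negative set.
        neg_docs = [docs[i] for i in rng.choices(pool, k=n_pos)]
        weights = train_weak(pos_docs, neg_docs)
        for i, d in enumerate(docs):
            votes[i] += 1.0 if score(weights, d) > 0 else 0.0
    return [v / n_models for v in votes]

# Toy data (invented for illustration only).
DOCS = [
    "randomized trial of statin therapy in adults",
    "statin therapy trial outcomes in elderly adults",
    "statin trial for cardiovascular prevention",
    "soil erosion across mountain regions",
    "mountain soil chemistry survey",
    "survey of soil erosion and river sediment",
    "river sediment chemistry",
]
LLM_SCORES = [
    [0.9, 0.8, 0.6, 0.1, 0.2, 0.1, 0.05],   # scores from hypothetical LLM A
    [0.8, 0.9, 0.5, 0.2, 0.1, 0.15, 0.1],   # scores from hypothetical LLM B
]
votes = ensemble_screen(DOCS, LLM_SCORES)
for doc, v in zip(DOCS, votes):
    print(f"{v:.1f}  {doc}")
```

In this sketch a document would be kept for manual review when its vote fraction exceeds some threshold. Choosing a low threshold favours recall over precision, which is one plausible way to respect the near-total recall requirement; how the paper actually combines its classifiers is not stated in the abstract.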
