What question did this study set out to answer?

This research aims to improve annotation quality in supervised learning by addressing class imbalance and sparsity in crowdsourced data.

June 19, 2026 | Open Access

Learning from Crowds Using a Focal Loss Function: Dealing with Imbalanced Annotations

Key Points

Proposed a correlated chained Gaussian process framework trained on a focal-loss-based variational objective (CCGPFL).
Jointly modeled the latent ground truth and instance-dependent annotator reliability while accounting for correlations among annotators.
Evaluated performance on synthetic, semi-synthetic, and real multi-annotator datasets.
CCGPFL showed competitive performance compared to state-of-the-art baselines in Overall Accuracy (OA).
Achieved superior Area Under the ROC Curve (AUC) in various dataset scenarios.

Abstract

Obtaining high-quality labeled data for supervised learning is costly, motivating the use of crowdsourcing, which distributes the annotation process across multiple workers with varying levels of expertise. A key challenge in crowdsourced data is annotation sparsity, as each worker labels only a limited subset of instances. This sparsity can amplify class imbalance, reduce supervision for minority classes, and bias standard cross-entropy-based models toward the majority classes. To address this problem, we propose a correlated chained Gaussian process framework trained on a focal-loss-based variational objective (CCGPFL). This probabilistic framework jointly models latent ground-truth and instance-dependent annotator reliability while accounting for correlations among annotators. In addition, the focal-weighted objective mitigates the imbalance induced by sparse annotations by assigning greater importance to harder examples during training. Experiments on synthetic, semi-synthetic, and fully real multi-annotator datasets show that CCGPFL achieves competitive and often superior performance relative to state-of-the-art learning-from-crowds baselines in terms of Overall Accuracy (OA) and Area Under the ROC Curve (AUC).
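To illustrate how a focal weighting assigns greater importance to harder examples, here is a minimal sketch of the standard per-example focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t). It is not the paper's variational objective, which is not given in this summary; the function name `focal_loss`, the default `gamma=2.0`, and the toy probabilities are assumptions chosen for illustration only.

```python
import math

def focal_loss(probs, label, gamma=2.0):
    """Standard focal loss for a single example (illustrative sketch only).

    probs: predicted class probabilities, assumed to sum to 1
    label: index of the observed class
    gamma: focusing parameter; gamma = 0 recovers plain cross-entropy
    """
    p_t = max(probs[label], 1e-12)  # guard against log(0)
    # The factor (1 - p_t)^gamma shrinks the loss of confidently correct examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# Hypothetical toy predictions: an easy example (p_t = 0.9) and a hard one (p_t = 0.1).
easy = focal_loss([0.9, 0.1], label=0)
hard = focal_loss([0.1, 0.9], label=0)

# The same examples under plain cross-entropy (gamma = 0), for comparison.
ce_easy = focal_loss([0.9, 0.1], label=0, gamma=0.0)
ce_hard = focal_loss([0.1, 0.9], label=0, gamma=0.0)

print(f"focal:         easy={easy:.4f}  hard={hard:.4f}")
print(f"cross-entropy: easy={ce_easy:.4f}  hard={ce_hard:.4f}")
```

With `gamma > 0`, the easy example's loss is scaled down far more than the hard example's, so poorly fit examples, which under imbalance tend to come from minority classes, contribute a larger share of the training signal than they would under cross-entropy.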
