What question did this study set out to answer?

This study asks how Bayes-optimal decision rules for ordinal classification can be extended from symmetric loss to asymmetric, label-dependent misclassification penalties.

March 8, 2026 · Open Access

Ordinal Classification with Label-Dependent Loss

Key Points

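Extends Bayes-optimal decision rules for ordinal classification from symmetric loss (Delgado, 2025) to asymmetric, label-dependent loss.
Formalizes a unified decision-theoretic framework linking the loss function, the scoring rule, and the decision criterion.
Proves that the proposed scoring rule is regular and proper, and that it coincides with the expected loss up to a change of sign.
Characterizes sufficient structural conditions under which the decision criterion is well defined and Bayes-optimal, either globally or locally in the space of predictive distributions.
Treats the interval-scale case, where class distances enter the loss, the score, and the decision rule.
Illustrates the approach on two real-world datasets covering interval-scale and fully ordinal cost-sensitive scenarios, with different classifier families.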

Abstract

Ordinal classification addresses prediction tasks where class labels have a natural order but are not necessarily equally spaced. While traditional approaches typically assume symmetric misclassification costs, many real-world applications exhibit asymmetric, label-dependent penalties. This paper extends previous work on Bayes-optimal decision rules for ordinal classification under symmetric loss (Delgado, 2025) to this more general, cost-sensitive setting. Within a unified decision-theoretic framework, we formalize the interplay between three fundamental components of classification: the loss function, which encodes misclassification severity; the scoring rule, used to evaluate probabilistic predictions and shown here to satisfy regularity and properness; and the decision criterion, which maps predictive distributions to class labels. We prove that the proposed scoring rule coincides with the expected loss up to a change of sign (a result of independent interest), and we explicitly characterize sufficient structural conditions under which the resulting decision criterion is well defined and Bayes-optimal. Special attention is given to the interval-scale case, where class distances are explicitly incorporated into the loss, the score, and the decision rule. We show that, depending on the structure of the loss function, Bayes optimality may hold either globally or locally in the space of predictive probability distributions. Empirical results on two real-world datasets, covering both interval-scale and fully ordinal cost-sensitive scenarios and different classifier families, illustrate the practical implications of the proposed approach.
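
To make the idea of a decision criterion under label-dependent loss concrete, the following is a minimal sketch of the generic minimum-expected-loss rule, not the paper's own construction. The cost matrix `LOSS`, the probability vector `probs`, and the helper names `expected_loss` and `bayes_decision` are all hypothetical and chosen only for illustration; the abstract does not specify the loss or the scoring rule in closed form.

```python
import numpy as np


def expected_loss(probs, loss):
    """Expected loss of each candidate prediction.

    loss[k, j] is the (assumed) cost of predicting class j when the
    true class is k. Entry j of the result is sum_k probs[k] * loss[k, j].
    """
    probs = np.asarray(probs, dtype=float)
    loss = np.asarray(loss, dtype=float)
    return probs @ loss


def bayes_decision(probs, loss):
    """Index of the class with the smallest expected loss."""
    return int(np.argmin(expected_loss(probs, loss)))


# Hypothetical asymmetric cost matrix for three ordered classes:
# predicting too low (below the diagonal) costs more than predicting too high.
LOSS = np.array([
    [0.0, 1.0, 2.0],
    [3.0, 0.0, 1.0],
    [6.0, 3.0, 0.0],
])

# Hypothetical predictive distribution over the three classes.
probs = np.array([0.5, 0.3, 0.2])

risks = expected_loss(probs, LOSS)
decision = bayes_decision(probs, LOSS)
mode = int(np.argmax(probs))

print("expected losses:", risks)
print("minimum-expected-loss class:", decision)
print("most probable class:", mode)
```

With these illustrative numbers the most probable class and the minimum-expected-loss class differ, which is the kind of situation where an asymmetric, label-dependent loss changes the decision.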
