Biased language is prevalent in today's online social media. To reduce the amount of biased language online, a critical first step is to detect it accurately, ideally automatically. This is a challenging problem, however, as the annotated data necessary for training a biased language classifier is either scarce and costly (e.g., when collected from experts), or noisy and potentially biased itself (e.g., when collected from crowd workers). A biased language classifier built on these annotations may thus be inaccurate, and sometimes unfair (e.g., have systematic accuracy disparities across texts with different political leanings). In this paper, we propose a novel method, CLEARE, for biased language detection, in which we utilize self-supervised contrastive learning to enhance the biased language classifier. Specifically, we learn a robust encoder of the textual data by solving a min-max optimization problem, so that the encoder helps achieve the best classification performance even if the worst data augmentation strategy is selected. Extensive evaluations suggest that CLEARE substantially improves on state-of-the-art biased language detection methods on several benchmark datasets, in terms of both the accuracy and the fairness of the detection.
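As a reading aid, the min-max idea described above can be sketched in generic notation. The symbols here are assumptions introduced for illustration and are not taken from the paper: $f_{\theta}$ denotes the text encoder with parameters $\theta$, $\mathcal{A}$ a candidate set of data augmentation strategies, $\mathcal{D}$ the text corpus, and $\ell$ a generic training loss computed on views of a text $x$ generated under augmentation $a$.

```latex
\min_{\theta} \; \max_{a \in \mathcal{A}} \;
  \mathbb{E}_{x \sim \mathcal{D}}
  \left[ \ell\bigl(f_{\theta};\, x,\, a\bigr) \right]
```

Under this reading, the inner maximization selects the augmentation strategy that is hardest for the current encoder, and the outer minimization trains the encoder against that worst case. The exact objective used by CLEARE may differ from this sketch.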
Li et al. studied this question.