March 1, 2024

Social and Ethical Norms in Annotation Task Design

Abstract

The development of many machine learning (ML) and artificial intelligence (AI) systems depends on human-labeled data. Human-provided labels act as tags or enriching information that help algorithms learn patterns in data, and they are used to train or evaluate a wide range of AI systems. These annotations ultimately shape the behavior of AI systems. Given the scale of ML datasets, which can contain thousands to billions of data points, cost and efficiency play a major role in how data annotations are collected. Yet important tensions arise between meeting scale-related needs and collecting data in a way that reflects real-world nuance and variation. Annotators are typically treated as interchangeable workers who provide a "view from nowhere." We question assumptions of universal ground truth by focusing on the social and ethical aspects that shape annotation task design.
