• DemoFace provides a demographically balanced dataset of pixelated real face images to mitigate biases in face biometric systems.
• The dataset's structured image-text embedding multimodality supports downstream tasks and facilitates the analysis of model biases in computer vision foundation models (CVFMs).
• Using our novel Responsible AI methodology, we developed the dataset and established a set of baselines with state-of-the-art (SOTA) CVFMs to assess their performance and fairness in face biometric tasks.
• Our evaluation, using both new and adapted metrics, revealed inherent bias patterns in several SOTA CVFMs, providing insights to guide future research.

Bias and fairness are critical challenges in data-driven computer vision (CV), and limited demographic diversity in training data worsens them. Face biometrics (face recognition) is a core CV task that is highly affected by these challenges: existing real-face datasets lack comprehensive demographic representation, whereas current synthetic datasets promote stereotypes. CV Foundation Models (CVFMs), which use global features in multimodal data, are currently at the forefront of CV applications, including face biometrics. However, the scarcity of large-scale, demographically balanced multimodal datasets, such as image-text embeddings for model fine-tuning (or training), limits the fairness of state-of-the-art (SOTA) CVFMs in downstream face biometric tasks. To address these issues, we introduce DemoFace, a demographically balanced face dataset comprising 30,240 pixelated real face images of 672 representative individuals evenly distributed across 48 demographic groups, categorized by ethnicity/race, gender, and age. We gathered the images through an API set up to collect from multiple copyright-free public forums. The collected images were then manually filtered, anonymized, and annotated by two independent research groups, and finally lightly pixelated for privacy preservation.
DemoFace's image-text embedding multimodality enables fine-tuning (or training) of CVFMs for fairness-focused face biometric tasks and for evaluating bias patterns. Through two empirical studies, face authentication as classification and textual description as token generation, we established baseline scores across ethnicity/race, gender, and age groups. Evaluated with both new metrics and metrics tailored from existing ones, these baselines revealed inherent bias patterns, emphasizing the need for more equitable AI models. Here is the Repository: Link
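The balance implied by the reported dataset sizes can be checked with simple arithmetic. The sketch below is only an illustration: the three totals come from the text, while the helper name `balanced_split` is hypothetical, and the per-individual and per-group image counts assume that images are spread evenly, which the text states only for individuals across groups.

```python
# Totals reported in the abstract.
TOTAL_IMAGES = 30_240
TOTAL_INDIVIDUALS = 672
DEMOGRAPHIC_GROUPS = 48


def balanced_split(total: int, parts: int) -> int:
    """Return total / parts, raising if the split is not exact.

    Hypothetical helper: an exact split is a necessary condition
    for a perfectly even distribution, not proof of one.
    """
    quotient, remainder = divmod(total, parts)
    if remainder:
        raise ValueError(f"{total} cannot be split evenly into {parts} parts")
    return quotient


# Individuals per demographic group (stated as evenly distributed).
individuals_per_group = balanced_split(TOTAL_INDIVIDUALS, DEMOGRAPHIC_GROUPS)

# Assumption: every individual contributes the same number of images.
images_per_individual = balanced_split(TOTAL_IMAGES, TOTAL_INDIVIDUALS)

# Follows from the two values above under the same assumption.
images_per_group = balanced_split(TOTAL_IMAGES, DEMOGRAPHIC_GROUPS)

print(individuals_per_group, images_per_individual, images_per_group)
# prints: 14 45 630
```

Under these assumptions, each of the 48 groups would contain 14 individuals with 45 images each, or 630 images per group.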
Sufian et al. (Sat,) studied this question.