The increasing reliance on data-driven decision-making across public policy, business, and scientific research has brought unprecedented opportunities for evidence-based strategies and resource allocation. However, this reliance also exposes decision processes to significant risks when the underlying data is biased. Data bias—whether introduced through sampling, measurement, selection, or reporting—can systematically distort the representation of populations, events, or phenomena. Such distortions not only compromise the accuracy of analyses but also have the potential to reinforce existing social inequities, misinform policy, and erode public trust in institutions and technologies. This paper provides a comprehensive examination of the origins and impacts of data bias, drawing on a wide range of literature and real-world case studies from domains such as social policy and finance. It explores the mechanisms by which bias infiltrates data, the consequences for decision outcomes, and the ways in which these outcomes can perpetuate cycles of disadvantage. In response, the paper outlines a framework for identifying, mitigating, and governing data bias, emphasizing the need for transparency, stakeholder engagement, and robust ethical oversight. The findings underscore the urgent necessity for systemic reforms in data collection and analysis to ensure that data-driven decisions are both equitable and effective.
Dhenia et al. studied this question.