Almost 1 in 5 Americans develop skin cancer by the age of 70. It is the most commonly diagnosed cancer in the United States, and most cases are preventable. Artificial intelligence (AI) has shown great potential to accelerate the diagnosis of skin cancer for early treatment. However, the fairness of using AI for skin cancer detection has raised concerns because detection accuracy is lower on darker skin tones. This paper conducts a comprehensive study of the bias problem using the Fitzpatrick dataset and analyzes the effects of different methods for mitigating this bias. In our experiments, we found that bias was not limited to the darkest skin type: one of the light skin types also had relatively low accuracy. Five AI models were evaluated and compared in terms of bias, and four augmentation techniques were applied to mitigate it. We also studied the impact of training parameters (e.g., batch size, data splitting) on bias.
Zhang et al. studied this question.