Abstract
Machine Learning (ML) and Artificial Intelligence (AI) approaches have the potential to support better-informed decisions in chemical hazard identification while reducing animal testing. However, their application in the context of New Approach Methodologies (NAMs) for hazard identification in Chemicals Risk Assessment (CRA) is challenging because of limited knowledge, lack of experience, and uncertainty related to the use of these approaches. Therefore, to facilitate the potential acceptance of ML and AI approaches for regulatory use, better standardization, guidelines for transparent reporting, validation, and frameworks are needed to understand the accessibility, verifiability, and usefulness criteria for their predictions. An extensive literature review on the availability of ML- and AI-based NAMs for chemical hazard identification was conducted, focusing primarily on human health endpoints: specific target organ toxicity (STOT), genotoxicity and carcinogenicity, endocrine disruption, skin sensitization, developmental and reproductive toxicity (DART), and repeated dose or chronic toxicity. Nearly 2300 scientific articles were reviewed. Analysis of 274 publications with ML-QSAR models revealed that 60.9% of the models described in the scientific literature were non-usable, 21.9% were potentially usable, and 17.2% were directly usable, i.e., had available software solutions. By endpoint, skin sensitization is best covered by ML-QSAR models, followed by endocrine disruption, genotoxicity, and carcinogenicity. The most frequently derived ML-QSAR models are tree-based models, such as random forests and their analogues, followed by artificial neural networks and support vector machines, with other model types used to a lesser extent. The literature analysis led to a framework that helps model users identify potentially suitable models for use in a regulatory context.
In addition, the framework could help model developers better understand the expectations of model users in a regulatory context. Developers could use it as a reference when publishing their models, which would ensure greater transparency and alignment with regulatory needs and facilitate future acceptance.
Piir et al. studied this question.