Adding species genetic traits to QSAR models improves the accuracy of cross-species toxicity predictions. Building on this idea, we developed a Bio-QSAR model that treats genetic similarity as a proxy for common physiological responses and integrates chemical and genetic features within a neural network. The main advantage is that the species embedding is continuous, which allows a higher degree of generalization than taxonomy-based categories and improves predictions for new, untested species. We compared our model’s performance with that of recently published Bio-QSAR models. Our model achieved a similar level of predictive accuracy, showing that it is competitive with current state-of-the-art methods. Rather than presenting only the best-performing model, we report the distribution of performance metrics and their average across repeated cross-validation, together with the range of these metrics, to facilitate comparisons with other models. Finally, we analyzed the model outputs to identify potential sampling biases in the datasets, highlighting extremes in species and chemical toxic response. Our results show that this species embedding is a highly effective approach for read-across scenarios in common ecotoxicological datasets with high data sparsity. In its current state, the model can be used to fill gaps in datasets, to improve environmental risk assessment with more data, and to direct the prioritization of new tests for different species or chemicals. Future improvements will allow more accurate predictions for completely new species, whether lab-reared, rare, or indigenous.
• Ecotoxicological datasets are sparse; many chemical effects on species are untested.
• Categorical models are not flexible for categories outside the training set.
• Using a continuous species embedding allows for extrapolation to new species.
• Combining species and chemical similarity features provides a robust model framework.
• Genetic similarity in QSAR models provides good predictions outside the training set.
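The reporting choice described in the abstract (the distribution, average, and range of a metric across repeated cross-validation, rather than a single best run) can be sketched as follows. This is a minimal illustration only. The data are synthetic, the 1-nearest-neighbour predictor is a stand-in for the neural network, and every name (`repeated_kfold_scores`, `summarize`, the feature layout) is hypothetical and not taken from the paper.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def predict_1nn(train_x, train_y, x):
    """Stand-in model: return the target of the nearest training point in the
    joint (chemical + species) feature space. Not the paper's neural network."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[nearest]


def repeated_kfold_scores(xs, ys, k=5, repeats=10, seed=0):
    """Return one RMSE per fold for every repeat (k * repeats scores in total)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        order = list(range(len(xs)))
        rng.shuffle(order)                      # new random partition per repeat
        folds = [order[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            train_x = [xs[i] for i in train_idx]
            train_y = [ys[i] for i in train_idx]
            preds = [predict_1nn(train_x, train_y, xs[i]) for i in test_idx]
            scores.append(rmse([ys[i] for i in test_idx], preds))
    return scores


def summarize(scores):
    """Summarize the whole distribution instead of reporting only the best score."""
    return {
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Synthetic data (assumption): two "chemical" features followed by two
# continuous "species embedding" features per sample, with a noisy target.
data_rng = random.Random(1)
xs = [[data_rng.random() for _ in range(4)] for _ in range(60)]
ys = [sum(x) + data_rng.gauss(0, 0.1) for x in xs]

scores = repeated_kfold_scores(xs, ys, k=5, repeats=3)
summary = summarize(scores)
print(summary)
```

Reporting `mean`, `stdev`, `min`, and `max` together makes it harder for a single favourable split to dominate the comparison with other models.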
Forastiere et al. studied this question.
synapsesocial.com/papers/69c37afeb34aaaeb1a67d112 — DOI: https://doi.org/10.1016/j.ecoenv.2026.120057
Mirko Forastiere
Leiden University
Martina G. Vijver
Leiden University
Leo Posthuma
Radboud University Nijmegen
Ecotoxicology and Environmental Safety
National Institute for Public Health and the Environment