What question did this study set out to answer?

This research aims to improve knowledge extraction from small preclinical datasets using a generative AI method.

March 16, 2026 | Open Access

Self-organizing neural network-based generative AI with embedded error inflation control enhances effective knowledge extraction from preclinical studies with reduced sample size

Key Points

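The study introduces genESOM, a generative AI method based on emergent self-organizing maps, to augment small biomedical datasets.
genESOM controls α-error inflation and integrates error propagation mitigation through dimensionality modulation.
The method was applied to lipid signaling data from a preclinical multiple sclerosis (EAE) model.
Reducing the sample size from 26 to 18 animals eliminated detectable group differences.
Augmenting the reduced dataset with AI-generated cases restored treatment-specific segregation and recovered the original key lipid mediators.
genESOM maintained consistent fidelity without introducing false positives, whereas Gaussian mixture and conditional GAN models failed under comparable constraints.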

Abstract

Small sample sizes in preclinical research limit the extraction of reliable knowledge and hinder translational progress. We propose genESOM, a generative artificial intelligence method based on emergent self‑organizing maps. genESOM is designed to augment small biomedical datasets while controlling α‑error inflation. It separates structure learning from data synthesis and integrates error propagation mitigation through dimensionality modulation, enabling safe and interpretable data augmentation. Using lipid signaling data from a preclinical multiple sclerosis study employing the experimental autoimmune encephalomyelitis (EAE) model (26 female SJL/J mice, three treatment groups, and 62 lipid mediators), we intentionally reduced the sample size from 26 to 18 animals. This reduction abolished detectable group differences by both statistical and machine learning analyses. Augmenting the reduced dataset with AI‑generated cases restored treatment‑specific segregation and recovered the original key lipid mediators. genESOM achieved consistent fidelity without introducing false positives. In contrast, Gaussian mixture and conditional GAN models failed under comparable constraints. These results demonstrate that genESOM provides a robust, error‑controlled framework for enhancing knowledge extraction from limited preclinical samples. While synthetic augmentation cannot substitute for biological replication, it can support exploratory analyses and help reduce the need for additional animal experimentation.
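
To illustrate why control of α-error inflation matters for any augmentation scheme, the following minimal sketch simulates two groups drawn from the same distribution, so that every "significant" result is a false positive. It then inflates each group with jittered near-copies and tests again as if all cases were independent. This is not genESOM and not the authors' code: the generator `naive_augment`, the group size of 6, the 10 copies per case, and the jitter of 0.05 are all hypothetical choices made only for illustration.

```python
import random
import statistics

# Two-sided 5% critical values of Student's t from standard tables,
# keyed by degrees of freedom. Only the defaults used below are covered.
T_CRIT = {10: 2.228, 118: 1.980}


def pooled_t(a, b):
    """Classic two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    diff = statistics.fmean(a) - statistics.fmean(b)
    return diff / (sp2 * (1 / na + 1 / nb)) ** 0.5


def naive_augment(x, copies, jitter, rng):
    """Hypothetical naive generator: jittered near-copies of each real value."""
    return [v + rng.gauss(0.0, jitter) for v in x for _ in range(copies)]


def false_positive_rates(n_sim=1000, n=6, copies=10, jitter=0.05, seed=0):
    """Return (raw, augmented) false-positive rates under a true null."""
    rng = random.Random(seed)
    raw_hits = 0
    aug_hits = 0
    for _ in range(n_sim):
        # Both groups come from the same distribution: no real effect exists.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        raw_hits += abs(pooled_t(a, b)) > T_CRIT[2 * n - 2]

        # Treating synthetic near-copies as independent cases shrinks the
        # standard error artificially, which is what inflates the α error.
        a_aug = naive_augment(a, copies, jitter, rng)
        b_aug = naive_augment(b, copies, jitter, rng)
        aug_hits += abs(pooled_t(a_aug, b_aug)) > T_CRIT[2 * n * copies - 2]
    return raw_hits / n_sim, aug_hits / n_sim


if __name__ == "__main__":
    raw, aug = false_positive_rates()
    print(f"false-positive rate, raw data:        {raw:.3f}")
    print(f"false-positive rate, naive augmented: {aug:.3f}")
```

Under these assumptions the raw rate stays near the nominal 5% level while the naively augmented rate rises well above it. An error-controlled generator has to avoid this failure mode: the synthetic cases must not create group differences that the real data do not support.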
