The ever-expanding biomedical literature calls for efficient and robust text-mining platforms, whose foundational step is a reliable Biomedical Named Entity Recognition (BioNER) system. Existing approaches, such as multi-task and collaborative learning, have attempted to address dataset heterogeneity but often rely on complex architectures with task-specific layers, which limits scalability. A key research gap is a unified model that optimises across multiple datasets without sacrificing performance or introducing architectural complexity. In this study, we propose a novel Loss-Masking Optimisation framework for BioNER models that enables multi-dataset training via a dataset-aware masking strategy. The approach extends the standard BERT-based NER pipeline with a tag-masking array that nullifies the logits of tags absent from the originating dataset, thereby reducing cross-dataset interference. Using this methodology, we trained a single BioNER model across 16 biomedical NER datasets, achieving higher precision and overall F1 scores than conventional multi-dataset training. Some datasets showed performance gains, others stayed near baseline, and a few declined, underscoring the nuanced impact of dataset interactions. To the best of our knowledge, this is among the first studies to apply a dataset-aware loss-masking mechanism to unified multi-dataset BioNER training, offering a scalable alternative to multi-task architectures.
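The tag-masking idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The tag names, the dataset names (`chem_corpus`, `disease_corpus`), and the helpers `tag_mask` and `masked_cross_entropy` are hypothetical. The sketch also assumes that "nullifying" a logit means excluding it from the softmax normaliser, which is equivalent to setting it to negative infinity; the study may define the operation differently.

```python
import math

# Hypothetical unified tag set and per-dataset tag inventories (illustrative only).
TAGS = ["O", "B-Chemical", "I-Chemical", "B-Disease", "I-Disease"]
DATASET_TAGS = {
    "chem_corpus": {"O", "B-Chemical", "I-Chemical"},
    "disease_corpus": {"O", "B-Disease", "I-Disease"},
}


def tag_mask(dataset):
    """Return a 0/1 list marking which unified tags the dataset annotates."""
    allowed = DATASET_TAGS[dataset]
    return [1 if tag in allowed else 0 for tag in TAGS]


def masked_cross_entropy(logits, gold_index, mask):
    """Token-level cross-entropy computed over the unmasked tags only.

    Assumption: a masked logit is dropped from the softmax normaliser,
    which matches setting it to negative infinity.
    """
    if not mask[gold_index]:
        raise ValueError("gold tag is not annotated in this dataset")
    kept = [z for z, m in zip(logits, mask) if m]
    peak = max(kept)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in kept))
    return log_norm - logits[gold_index]


# A token from the chemical corpus labelled "O" while the model leans towards
# "B-Disease", a tag that this corpus never annotates.
logits = [2.0, 0.5, 0.1, 3.0, 0.2]
loss_masked = masked_cross_entropy(logits, 0, tag_mask("chem_corpus"))
loss_unmasked = masked_cross_entropy(logits, 0, [1] * len(TAGS))
```

In this sketch the masked loss is smaller than the unmasked one, because the model is no longer penalised for assigning probability to a tag the source dataset could not have labelled. This is one plausible reading of how masking reduces cross-dataset interference.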
Alphonse et al. (Tue,) studied this question.