What question did this study set out to answer?

This research aims to enhance the early stopping mechanism in training by adapting it to individual data instances, improving computational efficiency and performance.

May 16, 2026

Instance-dependent Early Stopping for Adaptive Data Pruning

Key Points

Proposes Instance-dependent Early Stopping (IES), which judges whether an individual instance has been mastered from the second-order differences of its loss value.
Introduces an enhanced variant, IES+, that targets the overhead remaining in forward propagation.
Extends the evaluation to supervised fine-tuning of large language models.
IES reduces the number of instances receiving backpropagation by 10%-50% while maintaining or even improving performance.
IES+ further optimizes the forward pass to achieve state-of-the-art reductions in wall-clock training time.
In supervised fine-tuning, IES achieves notable computational savings while preserving or improving performance.

Abstract

Early stopping has been widely used to regularize models and can reduce the amount of computation by halting the training process when the performance of the model on a validation set stops improving. However, conventional early stopping applies the same stopping criterion to all instances without considering their individual learning status, which can lead to both potential overfitting and redundant computational costs on instances that are already well learned. To further improve efficiency, we propose Instance-dependent Early Stopping (IES), which adapts the early stopping mechanism from the entire training set to the instance level, based on the core principle that once the model has mastered an instance, the training on it should stop. IES considers an instance mastered if the second-order differences of its loss value remain within a small range around zero. This provides a uniform stopping criterion that is applicable across all instances, unlike a simple loss value threshold, which is affected by sample difficulty. We show that excluding mastered instances from backpropagation can increase gradient norms, thereby accelerating the decrease in the training loss and speeding up the training process. To address the remaining overhead in forward propagation, we introduce an enhanced variant, IES+, designed for aggressive training acceleration. The foundational IES accelerates training, reducing the number of instances receiving backpropagation by 10%-50% while maintaining or even improving performance. For scenarios where speed is the top priority, IES+ further optimizes the forward pass to achieve state-of-the-art reductions in wall-clock time. Furthermore, we extend our evaluation to validate the effectiveness of IES for the supervised fine-tuning of large language models, where it achieves notable computational savings while preserving or improving performance.
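
The mastery criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold `delta`, its default value, and the choice to test only the three most recent loss values are all assumptions made for illustration. The abstract says the differences must "remain" near zero, so an actual implementation may check the condition over a longer window.

```python
def second_order_difference(losses):
    """Second-order difference of the three most recent loss values."""
    l_prev2, l_prev1, l_curr = losses[-3:]
    return l_curr - 2.0 * l_prev1 + l_prev2


def is_mastered(losses, delta=1e-3):
    """Assumed rule: an instance counts as mastered when the absolute
    second-order difference of its loss is below delta (hypothetical value)."""
    if len(losses) < 3:
        return False  # not enough history to compute the difference
    return abs(second_order_difference(losses)) < delta


def select_active(loss_history, delta=1e-3):
    """Ids of instances that would still receive backpropagation."""
    return [i for i, losses in loss_history.items()
            if not is_mastered(losses, delta)]


# Per-instance loss recorded once per epoch (made-up numbers).
loss_history = {
    "easy": [0.90, 0.20, 0.05, 0.05, 0.05],  # loss has flattened out
    "hard": [2.30, 2.10, 1.60, 1.40, 0.90],  # loss is still changing
    "new": [1.20, 1.00],                     # too little history to judge
}
active = select_active(loss_history)
print(active)
```

Because the test looks at how the loss curve bends rather than at its absolute level, the same `delta` can be applied to easy and hard instances alike, which is the property the abstract contrasts with a plain loss-value threshold.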

Cite This Study

Yuan et al. studied this question.

synapsesocial.com/papers/6a080b27a487c87a6a40d44e https://doi.org/10.1109/tpami.2026.3693108