We introduce AIDS (Adaptive Importance-Driven Selection), an on-device data selection mechanism that decides which user interactions are worth learning from before a single gradient step is taken. The core insight is that naive continual fine-tuning of on-device LLMs fails not because of insufficient data but because of indiscriminate data: low-signal samples dominate the training buffer, wasting compute and causing parameter drift that erodes general capabilities. AIDS assigns each incoming sample a composite importance score that combines four signals (novelty under the current model, semantic consistency with established user patterns, temporal recency, and signal-source reliability) and admits only high-scoring samples into a fixed-capacity selective buffer. A drift detector monitors general-capability health every k sessions and triggers rollback or magnitude pruning when degradation is detected. In a 90-day longitudinal study with 50 participants, AIDS reduces user-text perplexity by 34% relative to the base model while keeping the general-capability health score at H ≥ 0.45 throughout, outperforming uniform-replay LoRA on both metrics.

Index Terms: adaptive data selection, importance scoring, continual learning, LoRA, drift detection, on-device personalization, edge AI.
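To make the selection step concrete, the following is a minimal Python sketch of composite scoring and admission into a fixed-capacity buffer. It is an illustration under stated assumptions, not the implementation evaluated in the study: the weighted-sum form, the weight values, the exponential recency decay with a 7-day half-life, the admission threshold, and all names (`Signals`, `importance`, `SelectiveBuffer`) are hypothetical and do not come from the text above.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Per-sample signals, each assumed to be pre-normalized to [0, 1]."""
    novelty: float       # e.g. derived from loss under the current model
    consistency: float   # agreement with established user patterns
    age_days: float      # time since the interaction occurred
    reliability: float   # trust in the signal source


# Hypothetical weights and half-life; the abstract does not specify them.
WEIGHTS = {"novelty": 0.4, "consistency": 0.3, "recency": 0.2, "reliability": 0.1}
HALF_LIFE_DAYS = 7.0


def importance(s: Signals) -> float:
    """Composite score as a weighted sum (an assumed functional form)."""
    recency = 0.5 ** (s.age_days / HALF_LIFE_DAYS)  # exponential decay
    return (WEIGHTS["novelty"] * s.novelty
            + WEIGHTS["consistency"] * s.consistency
            + WEIGHTS["recency"] * recency
            + WEIGHTS["reliability"] * s.reliability)


class SelectiveBuffer:
    """Fixed-capacity buffer that keeps only the highest-scoring samples."""

    def __init__(self, capacity: int, threshold: float):
        self.capacity = capacity
        self.threshold = threshold
        self._heap = []      # min-heap of (score, insertion_id, sample)
        self._next_id = 0    # tie-breaker so samples are never compared

    def offer(self, sample, score: float) -> bool:
        """Return True if the sample was admitted."""
        if score < self.threshold:
            return False
        entry = (score, self._next_id, sample)
        self._next_id += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if score > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # evict the lowest score
            return True
        return False

    def samples(self):
        """Buffered samples, highest score first."""
        return [sample for _, _, sample in sorted(self._heap, reverse=True)]
```

A min-heap keeps eviction of the weakest buffered sample at O(log n), which suits a small on-device buffer. In this sketch a full buffer rejects any sample that does not beat its current minimum, so no gradient step is spent on it.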
Vishwajeet Shashikant Adkine studied this question.