Continual learning systems suffer from catastrophic forgetting: sequential training on new tasks overwrites knowledge acquired from previous ones. This paper presents Novelty-triggered Capacity Growth (NCG), a training procedure that addresses forgetting through two coupled mechanisms. First, three scalar meta-parameters α, β, λ are trained jointly with network weights via gradient ascent on a Lagrangian-style meta-loss, allowing the model to self-regulate its own exploration, complexity penalty, and regularisation strength without any external schedule. Second, a novelty signal derived from hidden-layer activation entropy triggers architectural growth (adding 64 hidden units) precisely when three conditions coincide: the model has adapted to the current task distribution, regularisation pressure exceeds a threshold, and validation accuracy has plateaued. A gated knowledge embedding K(t) accumulates task representations across time, providing a persistent memory substrate independent of parameter drift. On Split-MNIST, NCG reduces catastrophic forgetting by 21% relative to a parameter-matched static baseline (p = 0.012, Welch t-test, n = 10 seeds). On Split-CIFAR-10, the reduction is 64% (p < 0.0001), and NCG outperforms EWC on that harder benchmark. Code: https://github.com/rsd-darshan/NCG
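The three-condition growth trigger can be illustrated with a minimal Python sketch. It is not the authors' implementation (the linked repository is the reference for that), and it rests on several assumptions: the histogram-based entropy estimate, the reading of "adapted" as the novelty signal falling below a threshold, the use of λ as the regularisation pressure, the patience-based plateau test, and every threshold value are hypothetical. The names `activation_entropy`, `GrowthTrigger`, and `maybe_grow` are introduced here for illustration only. The one value taken from the abstract is the growth step of 64 hidden units.

```python
import math
from dataclasses import dataclass
from typing import Sequence

GROWTH_UNITS = 64  # from the abstract: each growth event adds 64 hidden units


def activation_entropy(activations: Sequence[float], bins: int = 10) -> float:
    """Shannon entropy (in nats) of a histogram over hidden-layer activations.

    Assumption: the abstract does not say how entropy is estimated;
    an equal-width histogram is used here purely for illustration.
    """
    lo, hi = min(activations), max(activations)
    if hi == lo:
        return 0.0  # all activations identical: zero entropy
    counts = [0] * bins
    for a in activations:
        idx = min(int((a - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(activations)
    return -sum((c / n) * math.log(c / n) for c in counts if c)


@dataclass
class GrowthTrigger:
    # All thresholds are hypothetical placeholders, not values from the paper.
    novelty_max: float = 0.5   # "adapted": novelty signal has dropped below this
    lambda_min: float = 1.0    # regularisation pressure must exceed this
    plateau_eps: float = 1e-3  # smallest accuracy gain that counts as progress
    patience: int = 3          # recent evaluations inspected for a plateau

    def plateaued(self, val_acc: Sequence[float]) -> bool:
        """True if the last `patience` evaluations did not beat the earlier best."""
        if len(val_acc) <= self.patience:
            return False
        best_before = max(val_acc[:-self.patience])
        best_recent = max(val_acc[-self.patience:])
        return best_recent - best_before < self.plateau_eps

    def should_grow(self, novelty: float, lam: float,
                    val_acc: Sequence[float]) -> bool:
        """Grow only when all three conditions hold at the same time."""
        adapted = novelty < self.novelty_max
        pressured = lam > self.lambda_min
        return adapted and pressured and self.plateaued(val_acc)


def maybe_grow(hidden_units: int, trigger: GrowthTrigger, novelty: float,
               lam: float, val_acc: Sequence[float]) -> int:
    """Return the hidden-layer width after applying the growth rule once."""
    if trigger.should_grow(novelty, lam, val_acc):
        return hidden_units + GROWTH_UNITS
    return hidden_units
```

In this sketch a caller would compute a novelty value from `activation_entropy` (for example, as the absolute change from a running reference entropy, which is again an assumption) and call `maybe_grow` after each validation pass. Requiring the conjunction of the three conditions keeps the network from growing while it is still adapting or while accuracy is still improving.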
Darshan Poudel (Sat,) studied this question.