We present cl-metrics, a stateless, architecture-agnostic Python library that computes standard Continual Learning (CL) and Class-Incremental Learning (CIL) evaluation metrics from a raw per-task accuracy matrix, with no dependency on any training framework. The library fills a well-documented gap: the dominant CIL frameworks (Avalanche, PyCIL, FACIL, Sequoia) embed metric computation inside their training loops, so computing standard metrics from a pre-generated accuracy matrix requires framework-specific wrapper code. cl-metrics is the equivalent of scikit-learn's sklearn.metrics module for CIL evaluation.

Metrics implemented (canonical formulations): Average Accuracy (AA), Backward Transfer (BWT), Forward Transfer (FWT), Intransigence, Plasticity Index, Stability Index, Forgetting Measure.

Energy-aware extensions for spiking neural networks (SNNs), the first standardised suite: Spike Rate Proxy (SRP), Spike-Rate Normalised Average Accuracy (SR-AA), Energy-Adjusted Backward Transfer (EA-BWT), Energy-to-Error Ratio (EER).

All metrics are validated against the Maya Research Series (Swaminathan, 2026a–2026g), a seven-paper neuromorphic SNN CIL benchmark on Split-CIFAR-10 and Split-CIFAR-100. The library ships with 21 unit tests, all passing in under 0.15 seconds. Documentation includes a fully bilingual FAQ (English and Mandarin Chinese) at https://venky2099.github.io/cl-metrics/faq.html, reflecting the international research community that has engaged with this work.

Software DOI: 10.5281/zenodo.19388144
GitHub: https://github.com/venky2099/cl-metrics
Venkatesh Swaminathan studied this question.
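To make the idea of computing metrics from a raw per-task accuracy matrix concrete, the following is a minimal sketch of three of the listed metrics (AA, BWT, Forgetting Measure) in plain Python. It is not the cl-metrics API: the function names, the example matrix, and the convention that acc[i][j] holds the accuracy on task j after training on task i are all assumptions made for illustration, and the formulas follow widely used definitions that may differ in detail from the library's own.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def _num_tasks(acc: Matrix) -> int:
    """Validate that acc is a square T x T matrix with T >= 2 and return T."""
    n = len(acc)
    if n < 2 or any(len(row) != n for row in acc):
        raise ValueError("expected a square T x T accuracy matrix with T >= 2")
    return n


def average_accuracy(acc: Matrix) -> float:
    """Mean accuracy over all tasks after training on the final task."""
    n = _num_tasks(acc)
    return sum(acc[n - 1]) / n


def backward_transfer(acc: Matrix) -> float:
    """Mean change in accuracy on each earlier task between the moment it
    was learned (diagonal entry) and the end of training (last row)."""
    n = _num_tasks(acc)
    return sum(acc[n - 1][j] - acc[j][j] for j in range(n - 1)) / (n - 1)


def forgetting(acc: Matrix) -> float:
    """Mean gap between the best accuracy ever reached on each earlier task
    (before the final training step) and its final accuracy."""
    n = _num_tasks(acc)
    total = 0.0
    for j in range(n - 1):
        best = max(acc[i][j] for i in range(j, n - 1))
        total += best - acc[n - 1][j]
    return total / (n - 1)


# Hypothetical 3-task accuracy matrix (illustrative values, not real results).
# Row i = state after training on task i; column j = accuracy on task j.
acc = [
    [0.90, 0.10, 0.10],
    [0.80, 0.85, 0.15],
    [0.70, 0.75, 0.80],
]

print(round(average_accuracy(acc), 3))   # 0.75
print(round(backward_transfer(acc), 3))  # -0.15
print(round(forgetting(acc), 3))         # 0.15
```

Because each function takes only the matrix and returns a number, the sketch is stateless in the same sense the abstract describes: nothing depends on a model, a dataloader, or a training loop.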