What question did this study set out to answer?

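This work addresses a limitation of existing continual and class-incremental learning frameworks: they compute evaluation metrics inside their training loops, so standard metrics cannot be computed from a pre-generated accuracy matrix without framework-specific wrapper code.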

April 4, 2026 | Open Access

cl-metrics: A Stateless Python Library for Continual Learning Evaluation with SNN Energy-Aware Extensions

Key Points

Developed a stateless, architecture-agnostic Python library for metric computation.
Implemented metrics such as average accuracy, backward transfer, and energy-aware extensions.
Validated against a neuromorphic SNN benchmark with a raw per-task accuracy matrix.
Included unit tests for reliability and documentation in English and Mandarin.
Successfully computes standard metrics without dependency on training frameworks.
Metrics validated against Split-CIFAR-10 and Split-CIFAR-100 benchmarks.
All 21 unit tests pass in under 0.15 seconds.
Documentation is bilingual, enhancing accessibility for the global research community.

Abstract

We present cl-metrics, a stateless, architecture-agnostic Python library that computes standard Continual Learning (CL) and Class-Incremental Learning (CIL) evaluation metrics from a raw per-task accuracy matrix, with no dependency on any training framework. The library fills a well-documented gap: dominant CIL frameworks (Avalanche, PyCIL, FACIL, Sequoia) embed metric computation inside their training loops, making it impossible to compute standard metrics from a pre-generated accuracy matrix without writing framework-specific wrapper code. cl-metrics is the CIL-evaluation equivalent of scikit-learn's metrics module (sklearn.metrics).

Metrics implemented (canonical formulations): Average Accuracy (AA), Backward Transfer (BWT), Forward Transfer (FWT), Intransigence, Plasticity Index, Stability Index, Forgetting Measure.

SNN Energy-Aware Extensions (first standardised suite): Spike Rate Proxy (SRP), Spike-Rate Normalised Average Accuracy (SR-AA), Energy-Adjusted Backward Transfer (EA-BWT), Energy-to-Error Ratio (EER).

All metrics are validated against the Maya Research Series (Swaminathan, 2026a–2026g), a seven-paper neuromorphic SNN CIL benchmark on Split-CIFAR-10 and Split-CIFAR-100. The library ships with 21 unit tests, all passing in under 0.15 seconds. Documentation includes a fully bilingual FAQ (English and Mandarin Chinese) at https://venky2099.github.io/cl-metrics/faq.html, reflecting the international research community that has engaged with this work.

Software DOI: 10.5281/zenodo.19388144
GitHub: https://github.com/venky2099/cl-metrics
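To make the idea of computing metrics from a raw per-task accuracy matrix concrete, here is a minimal pure-Python sketch of three of the metrics named above. It is not the cl-metrics API: the function names are hypothetical, and the formulas are the commonly used definitions (Average Accuracy and Backward Transfer as in Lopez-Paz and Ranzato, Forgetting as in Chaudhry et al.), which may differ in detail from the library's own formulations. The sketch assumes that acc[i][j] holds the accuracy on task j measured after training through task i.

```python
from typing import Sequence

# Assumed convention: acc[i][j] = accuracy on task j after training through task i.
Matrix = Sequence[Sequence[float]]


def average_accuracy(acc: Matrix) -> float:
    """Mean accuracy over all tasks, measured after the final task."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def backward_transfer(acc: Matrix) -> float:
    """Mean change on each earlier task between when it was learned and the end.

    Negative values indicate forgetting.
    """
    n = len(acc)
    if n < 2:
        raise ValueError("backward transfer needs at least two tasks")
    return sum(acc[-1][j] - acc[j][j] for j in range(n - 1)) / (n - 1)


def forgetting(acc: Matrix) -> float:
    """Mean drop from each earlier task's best past accuracy to its final accuracy."""
    n = len(acc)
    if n < 2:
        raise ValueError("forgetting needs at least two tasks")
    drops = []
    for j in range(n - 1):
        # Best accuracy on task j from the step it was learned up to the
        # step before the final one.
        best = max(acc[i][j] for i in range(j, n - 1))
        drops.append(best - acc[-1][j])
    return sum(drops) / (n - 1)


# Made-up three-task matrix, for illustration only.
example = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.75, 0.80],
]
print(round(average_accuracy(example), 4))   # 0.75
print(round(backward_transfer(example), 4))  # -0.15
print(round(forgetting(example), 4))         # 0.15
```

Because each function takes only the matrix and keeps no internal state, the same matrix can be scored regardless of which framework or architecture produced it, which is the design property the abstract describes as stateless and architecture-agnostic.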
