A unified incremental capacity analysis framework enables extraction of 36 features from heterogeneous first-life and second-life lithium-ion battery datasets, supporting consistent SOH band classification across variable chemistries and formats.
A reproducible framework using incremental capacity analysis and unsupervised learning was developed to process heterogeneous battery datasets and generate structure-aware meta-features for state of health classification.
Accurately assessing battery health across mixed datasets remains challenging because of differences in chemistry, format, and usage history. This study presents a reproducible framework for preparing battery cycling data using incremental capacity analysis (ICA), with the aim of supporting machine learning (ML) workflows across both first-life and second-life battery datasets. The methodology comprises IC curve generation, feature extraction, encoding and scaling, feature reduction, and unsupervised learning exploration. A two-tiered outlier detection system was introduced during preprocessing to flag edge-case samples. Two clustering algorithms, K-means and HDBSCAN, were applied to the engineered IC feature space to explore its structure. K-means revealed broad health-related groupings with overlapping boundaries, while HDBSCAN identified finer clusters and flagged additional ambiguous samples as noise. To support interpretation, PCA and t-SNE were used to visualise the feature space in reduced dimensions. Rather than using clustering as a classification tool, the resulting cluster and noise labels are proposed as structure-aware meta-features for supervised learning. The framework accommodates heterogeneous battery datasets and addresses the challenges of integrating data from mixed sources with varying histories and characteristics. These outputs provide a structured foundation for future supervised classification of battery state of health.
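The IC curve generation and feature extraction steps can be illustrated with a minimal sketch. It assumes a single charge half-cycle with strictly increasing voltage, and it computes dQ/dV by finite differences on a uniform voltage grid. The function names `ic_curve` and `ic_features`, the 0.01 V grid step, and the three features shown are illustrative assumptions; they are not the study's 36 features or its actual implementation, and the data are synthetic.

```python
import math

def ic_curve(voltage, capacity, dv=0.01):
    """Resample Q(V) onto a uniform voltage grid and return (V_mid, dQ/dV).

    Assumes `voltage` is strictly increasing (one charge half-cycle).
    The grid step `dv` is an illustrative choice, not a value from the study.
    """
    n = int(round((voltage[-1] - voltage[0]) / dv))
    grid = [voltage[0] + i * dv for i in range(n + 1)]
    q_grid = []
    j = 0
    for v in grid:
        # Advance to the measured segment that contains v.
        while j < len(voltage) - 2 and voltage[j + 1] < v:
            j += 1
        v0, v1 = voltage[j], voltage[j + 1]
        t = (v - v0) / (v1 - v0)
        # Linear interpolation of capacity at grid voltage v.
        q_grid.append(capacity[j] + t * (capacity[j + 1] - capacity[j]))
    v_mid = [(grid[i] + grid[i + 1]) / 2 for i in range(n)]
    dqdv = [(q_grid[i + 1] - q_grid[i]) / dv for i in range(n)]
    return v_mid, dqdv

def ic_features(v_mid, dqdv):
    """Extract three illustrative IC features (hypothetical feature set)."""
    k = max(range(len(dqdv)), key=dqdv.__getitem__)
    step = v_mid[1] - v_mid[0]
    return {
        "peak_height": dqdv[k],      # maximum of dQ/dV
        "peak_voltage": v_mid[k],    # voltage at which the maximum occurs
        "area": sum(dqdv) * step,    # approximates total charged capacity
    }

# Synthetic charge data: a smooth capacity step centred at 3.7 V.
voltage = [3.0 + 0.002 * i for i in range(601)]
capacity = [1.0 + math.tanh((v - 3.7) / 0.05) for v in voltage]

v_mid, dqdv = ic_curve(voltage, capacity)
features = ic_features(v_mid, dqdv)
```

Real cycler data are noisy, so a smoothing step before differentiation is usually needed in practice; it is omitted here to keep the sketch short.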
Beatty et al., writing in Batteries, studied first-life lithium-ion pouch cells, cylindrical lithium iron phosphate cells, and second-life lithium-ion modules with blended lithium manganese oxide and lithium nickel oxide chemistries. The datasets varied in format and capacity, and the first-life histories of the second-life modules were unknown. An incremental capacity analysis (ICA) based feature engineering pipeline was applied to these heterogeneous datasets, with no comparator method, and evaluated on state of health (SOH) classification into discrete bands using incremental capacity features derived from charge cycle data.
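As a rough illustration of what classification into discrete SOH bands means, the sketch below maps a measured capacity to a band label, taking SOH as measured capacity divided by nominal capacity. The band edges, labels, and the function name `soh_band` are hypothetical; the source does not state the bands used in the study.

```python
from bisect import bisect_right

# Hypothetical band edges and labels; the study's actual bands are not given here.
BAND_EDGES = [0.70, 0.80, 0.90]
BAND_LABELS = ["<70%", "70-80%", "80-90%", ">=90%"]

def soh_band(capacity_ah, nominal_ah):
    """Return the SOH band label for a measured capacity.

    SOH is taken as measured capacity / nominal capacity. A value equal to
    an edge falls into the band above it.
    """
    soh = capacity_ah / nominal_ah
    return BAND_LABELS[bisect_right(BAND_EDGES, soh)]

example_band = soh_band(1.7, 2.0)  # SOH = 0.85
```

Fixed edges keep the labels comparable across chemistries and formats, provided each cell or module is normalised by its own nominal capacity.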