March 3, 2026 | Open Access

FORCE: Feature-Oriented Representation with Clustering and Explanation

Key Points

FORCE improved predictive performance on three real-life datasets compared with networks that lacked the latent feature and attention components.
In detecting polycystic ovarian syndrome, the F1 score rose from 0.80 to 0.99.
The approach uses SHAP values in two stages: clustering absolute SHAP values into a latent embedding that guides training, and initiating an attention mechanism from that latent information.
Incorporating an attention mechanism enhances the discriminative capability within the deep learning framework.

Abstract

Capitalizing on latent unobserved structures to improve the accuracy of predictive models has become an active avenue of deep learning research. Most approaches cluster the original features to infer latent structures. However, the information gained in this process can often be derived implicitly by a sufficiently complex model, so these approaches tend to provide minimal benefit. We propose a SHapley Additive exPlanations (SHAP)-based supervised deep learning framework called FORCE (Feature-Oriented Representation with Clustering and Explanation), which uses SHAP values at two stages of the neural network architecture: (i) a latent embedding, obtained by clustering absolute SHAP values, that guides model training, and (ii) an attention mechanism within the architecture that is initiated using the latent information. This approach gives the neural network an indication of the effect of unobserved values that modify the importance of a feature for a given observation. The proposed framework was evaluated on three real-life datasets. Our results demonstrate that FORCE led to dramatic improvements in overall performance compared with networks that did not incorporate the latent feature and attention framework (for example, an F1 score of 0.99 vs 0.80 for the presence of Polycystic Ovarian Syndrome). Using cluster assignments and attention based on SHAP values guides deep learning, enhancing latent pattern learning and overall discriminative capability.
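As a rough illustration of stage (i) only, the sketch below clusters absolute SHAP values and appends the cluster assignment to each observation as a one-hot latent feature. It is not the authors' implementation. Several simplifying assumptions are made so that the example needs no external libraries: the base model is a linear model with independent features, for which the SHAP value of feature j is exactly w_j * (x_j - mean_j); the clustering is a small hand-written k-means with a deterministic farthest-first initialization; and the function names (`linear_shap`, `kmeans`, `add_latent_feature`) and the toy data are hypothetical. The abstract does not specify the base model, the clustering algorithm, or how the attention mechanism in stage (ii) is initiated, so stage (ii) is not sketched.

```python
def linear_shap(weights, X):
    """Exact SHAP values for a linear model with independent features
    (assumption for this sketch): phi_ij = w_j * (x_ij - mean_j)."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(weights))]
    return [[w * (x - m) for w, x, m in zip(weights, row, means)] for row in X]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iters=100):
    """Minimal k-means with deterministic farthest-first initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Recompute each center as the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centers[c])  # keep an empty cluster's center
        if updated == centers:
            break
        centers = updated
    return labels, centers


def add_latent_feature(X, weights, k=2):
    """Cluster |SHAP| rows and append a one-hot cluster indicator to each row."""
    abs_shap = [[abs(v) for v in row] for row in linear_shap(weights, X)]
    labels, _ = kmeans(abs_shap, k)
    augmented = [
        list(row) + [1.0 if lab == c else 0.0 for c in range(k)]
        for row, lab in zip(X, labels)
    ]
    return augmented, labels


# Toy data (hypothetical): the first four rows are driven by feature 0,
# the last four by feature 1, so their |SHAP| profiles differ.
X = [[5.0, 0.0], [-5.0, 0.0], [4.0, 0.0], [-4.0, 0.0],
     [0.0, 5.0], [0.0, -5.0], [0.0, 4.0], [0.0, -4.0]]
weights = [1.0, 1.0]
augmented, labels = add_latent_feature(X, weights, k=2)
```

Clustering on absolute SHAP values groups observations by which features matter for them rather than by their raw values: here `[5.0, 0.0]` and `[-5.0, 0.0]` sit far apart in feature space but land in the same cluster. In a fuller pipeline the augmented rows would be passed to the downstream neural network.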
