Capitalizing on latent, unobserved structures to improve the accuracy of predictive models has become an active avenue of deep learning research. Most approaches cluster the original features to infer latent structures. However, the information gained in this process can often be derived implicitly by a sufficiently complex model, so these approaches often provide minimal benefit. We propose a SHapley Additive exPlanations (SHAP)-based supervised deep learning framework called FORCE (Feature Oriented Representation with Clustering and Explanation), which uses SHAP values at two stages of the neural network architecture: (i) a latent embedding, based on clustering absolute SHAP values, that guides model training, and (ii) an attention mechanism within the architecture that is initiated using this latent information. This approach gives the neural network an indication of the effect of unobserved values that modify the importance of a feature for a given observation. We evaluated the proposed framework on three real-life datasets. Our results demonstrate that FORCE led to dramatic improvements in overall performance compared to networks that did not incorporate the latent feature and attention framework (for example, an F1 score of 0.99 vs 0.80 for the presence of Polycystic Ovarian Syndrome). Using cluster assignments and attention based on SHAP values guides deep learning, enhancing latent pattern learning and overall discriminative capability.
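The two-stage use of SHAP values described above can be illustrated with a minimal sketch. It is not the FORCE implementation: it assumes a linear surrogate model with independent features (for which the SHAP value of feature j is w_j times the feature's deviation from its mean), plain k-means as the clustering step, a one-hot cluster indicator as the latent embedding, and a softmax over each cluster's mean absolute SHAP vector as the starting attention weights. The names `linear_shap`, `kmeans`, `softmax`, and `force_inputs` are hypothetical, and only the Python standard library is used.

```python
import math


def linear_shap(weights, x, baseline):
    """SHAP values of a linear model with independent features: w_j * (x_j - mean_j)."""
    return [w * (xj - bj) for w, xj, bj in zip(weights, x, baseline)]


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    def assign():
        # Index of the nearest centroid for every point.
        return [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(), centroids


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def force_inputs(X, weights, k=2):
    """Build the cluster embedding and starting attention weights from |SHAP| values."""
    n = len(X)
    baseline = [sum(col) / n for col in zip(*X)]
    # Stage (i): cluster absolute SHAP values, one-hot encode the cluster id.
    abs_shap = [[abs(v) for v in linear_shap(weights, x, baseline)] for x in X]
    labels, centroids = kmeans(abs_shap, k)
    onehot = [[1.0 if labels[i] == c else 0.0 for c in range(k)] for i in range(n)]
    # Stage (ii): per-cluster attention over features, here a softmax of the
    # cluster's mean |SHAP| vector (an assumed choice, not taken from the paper).
    attention = [softmax(centroids[labels[i]]) for i in range(n)]
    attended = [[a * xj for a, xj in zip(attention[i], X[i])] for i in range(n)]
    return labels, onehot, attention, attended


# Toy data: feature 0 drives the first four rows, feature 1 the last four.
X = [[5.0, 0.0], [-5.0, 0.0], [4.0, 0.1], [-4.0, -0.1],
     [0.0, 5.0], [0.0, -5.0], [0.1, 4.0], [-0.1, -4.0]]
labels, onehot, attention, attended = force_inputs(X, weights=[1.0, 1.0], k=2)
print(labels)
print(attention[0], attention[4])
```

In a real network the one-hot vector would be concatenated with (or embedded alongside) the original features, and the attention weights would be trainable parameters that merely start from these values. A tree or neural model explained with a SHAP library would replace the linear surrogate.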
Mukherjee et al. (Mon,) studied this question.