Interpretability plays a vital role in understanding complex deep learning models by providing transparency and insights. It addresses the black-box nature of these models, aids in human-in-the-loop systems, enhances model development, and supports education. However, existing interpretability algorithms in computer vision primarily target images, leaving a gap for video applications. In this article, we emphasize the importance of interpretability in video-based Human Action Recognition (HAR). We extend existing 2D interpretability techniques to the 3D domain, specifically focusing on saliency maps and gradient-weighted class activation maps. The proposed interpretability system is then employed in the analysis of a well-known HAR dataset to better understand action recognition in videos.
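To make the 2D-to-3D extension concrete, the following is a minimal sketch of how the two techniques named above might be computed for a video clip. It is not the authors' implementation. It assumes the feature maps and gradients have already been extracted from some 3D convolutional network as NumPy arrays of shape (channels, frames, height, width). The function names `grad_cam_3d` and `saliency_3d` are hypothetical, chosen for illustration only.

```python
import numpy as np


def grad_cam_3d(activations, gradients):
    """Grad-CAM over a spatio-temporal feature volume (illustrative sketch).

    activations: feature maps of a 3D conv layer, shape (C, T, H, W).
    gradients:   gradient of the class score w.r.t. those maps, same shape.
    Returns a (T, H, W) heat volume scaled to [0, 1].
    """
    if activations.ndim != 4 or activations.shape != gradients.shape:
        raise ValueError("expected two arrays of identical shape (C, T, H, W)")

    # Channel weights: average each channel's gradient over time and space.
    # In 2D Grad-CAM the average is over (H, W) only; here T is included too.
    weights = gradients.mean(axis=(1, 2, 3))  # shape (C,)

    # Weighted sum of feature maps across channels, then ReLU to keep
    # only the evidence that increases the class score.
    cam = np.tensordot(weights, activations, axes=(0, 0))  # shape (T, H, W)
    cam = np.maximum(cam, 0.0)

    # Normalize for display; skip when the map is entirely zero.
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


def saliency_3d(input_gradients):
    """Vanilla-gradient saliency for a clip (illustrative sketch).

    input_gradients: gradient of the class score w.r.t. the input clip,
    shape (C, T, H, W) with C the colour channels.
    Returns a (T, H, W) volume: the largest absolute gradient per voxel.
    """
    return np.abs(input_gradients).max(axis=0)
```

In both cases the result is one map per frame rather than a single image, so it can be upsampled to the input resolution and overlaid frame by frame to show where and when the model finds evidence for an action.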
Fernandez et al. studied this question.