What question did this study set out to answer?

The aim is to improve the explainability and reliability of CNN-based earthquake damage detection in wooden houses.

March 26, 2026 | Open Access

An Explainable Convolutional Neural Network Using Human-Attention-Guided Feature Extraction for Earthquake Damage Detection in Wooden Houses

Key Points

Aimed to improve the explainability and reliability of CNN-based earthquake damage detection in wooden houses.
Developed a ResNet-50 detector with feature-map visualization modules (FVMs).
Employed human-attention alignment loss and masks for feature extraction.
Utilized a diagnostic paradigm to trace feature usage in decision-making.
Achieved significant improvements in explainability and human-machine alignment.
Demonstrated only a minor performance loss compared to traditional methods.
Successfully visualized the evolution of feature maps throughout the detection process.

Abstract

The lack of explainability in intermediate processing in convolutional neural network (CNN)-based wooden-house damage detection for Japan’s Earthquake Damage Certification (EDC) survey can undermine homeowners’ trust and hinder practical adoption. To address this issue, this research proposes an explainable, diagnosable ResNet-50 detector that uses: feature-map visualization modules (FVMs) to visualize feature-map representations; a human-attention alignment loss and human-attention masks that supervise the network to extract features attended to by human experts; and a corresponding diagnostic paradigm. The test results indicate that the explainability and human–machine alignment of the ResNet-50 detector are greatly improved, with only a minor performance loss. Unlike methods that rely solely on post hoc class activation maps to explain the final decision, the proposed method exposes the dynamic evolution of intrinsic feature maps throughout the backbone, thereby explaining and diagnosing the model's decision by tracing, from input to final detection, the features that are used, missed, or misused.
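
The abstract does not give the form of the human-attention alignment loss, so the following is only a minimal sketch of the general idea, under stated assumptions: the feature maps of one layer are collapsed into a single attention map (channel-wise mean of absolute activations, min-max scaled), and the loss is the mean squared error between that map and a binary human-attention mask. The names `attention_from_features` and `alignment_loss` are hypothetical, and the study's actual formulation may differ.

```python
from typing import List

Grid = List[List[float]]  # one H x W map stored as nested lists


def attention_from_features(feature_maps: List[Grid]) -> Grid:
    """Collapse C feature maps (each H x W) into one H x W map scaled to [0, 1].

    Assumption for illustration: attention is the channel-wise mean of
    absolute activations, followed by min-max normalization.
    """
    channels = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    mean_abs = [
        [sum(abs(fm[r][c]) for fm in feature_maps) / channels for c in range(width)]
        for r in range(height)
    ]
    lowest = min(min(row) for row in mean_abs)
    highest = max(max(row) for row in mean_abs)
    span = highest - lowest
    if span == 0.0:
        # A flat map carries no spatial preference; return all zeros.
        return [[0.0] * width for _ in range(height)]
    return [[(value - lowest) / span for value in row] for row in mean_abs]


def alignment_loss(attention: Grid, human_mask: Grid) -> float:
    """Mean squared error between the attention map and the human mask.

    Assumption for illustration: the mask is 1.0 where experts looked
    and 0.0 elsewhere, with the same H x W shape as the attention map.
    """
    total = 0.0
    count = 0
    for attention_row, mask_row in zip(attention, human_mask):
        for a, m in zip(attention_row, mask_row):
            total += (a - m) ** 2
            count += 1
    return total / count


# Two 2 x 2 channels that both respond only at the top-right cell.
features = [
    [[0.0, 2.0], [0.0, 0.0]],
    [[0.0, -2.0], [0.0, 0.0]],
]
aligned_mask = [[0.0, 1.0], [0.0, 0.0]]     # expert marked the same cell
misaligned_mask = [[1.0, 0.0], [0.0, 0.0]]  # expert marked a different cell

attention = attention_from_features(features)
loss_aligned = alignment_loss(attention, aligned_mask)
loss_misaligned = alignment_loss(attention, misaligned_mask)
```

In a training setup of this kind, such a term would typically be added to the detection loss with a weighting factor, so that the network is penalized when its features concentrate on regions the experts did not attend to.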
