In the “Information Gap” period immediately following a large-scale earthquake, rapid and quantitative assessment of damage across a wide area is essential for optimizing rescue operations. However, conventional aerial photography and satellite image analysis struggle to detect layer collapse (also called pancake collapse), in which the first floor collapses while the roof structure remains intact. This study proposes a novel building damage assessment method that estimates the remaining height of buildings from footage captured by a monocular camera mounted on a moving vehicle and classifies damage based on structural thresholds. The method integrates the open-vocabulary object detection model Grounding DINO and the monocular depth estimation model Depth Anything to automatically calculate the physical height of buildings (wooden and reinforced concrete, RC) and debris areas within images. A large-scale field test was conducted in four municipalities in Ishikawa Prefecture that were severely affected by the 2024 Noto Peninsula Earthquake: Suzu City, Wajima City, Nanao City, and Shika Town. The surveyed buildings covered diverse building types, such as wooden and RC structures. Damage ranged from extensive layer collapse (D5) in Suzu and Wajima to scattered damage in Nanao and Shika. Only wooden houses were detected as having suffered layer collapse (D5). Analysis of driving footage (211 images, 303 buildings) achieved an F1 score of 0.81 for the D5 (layer collapse) class, which corresponds directly to the loss of habitable space. This result strongly suggests that layer-collapsed buildings can be extracted automatically using height information obtained from lateral viewing angles. While challenges remain in precisely distinguishing minor external damage from partial collapse, the method could serve as an effective tool for rapidly screening the areas with the highest rescue priority during the initial disaster response phase.
Shiraishi et al. (Sun,) studied this question.
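The height-based screening described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple pinhole camera model, a bounding box such as an open-vocabulary detector might return, and a metric depth value for that box. The names `Detection`, `estimate_height_m`, and `classify_by_height`, the expected building height, and the 0.6 ratio threshold are all hypothetical; the abstract does not state the actual structural thresholds.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one detected building."""
    top_px: float     # image row of the top edge of the bounding box
    bottom_px: float  # image row of the bottom edge of the bounding box
    depth_m: float    # assumed metric distance from camera to the facade


def estimate_height_m(det: Detection, focal_length_px: float) -> float:
    """Pinhole-model estimate: physical height = pixel height * depth / focal length."""
    pixel_height = det.bottom_px - det.top_px
    if pixel_height <= 0 or det.depth_m <= 0 or focal_length_px <= 0:
        raise ValueError("box height, depth, and focal length must be positive")
    return pixel_height * det.depth_m / focal_length_px


def classify_by_height(height_m: float, expected_height_m: float,
                       collapse_ratio: float = 0.6) -> str:
    """Flag a building as D5 when its remaining height falls below an
    assumed fraction of its expected intact height (illustrative threshold)."""
    if expected_height_m <= 0:
        raise ValueError("expected height must be positive")
    ratio = height_m / expected_height_m
    return "D5" if ratio < collapse_ratio else "not D5"


# Usage: a 300-pixel-tall box at 10 m with a 1000-pixel focal length
det = Detection(top_px=100.0, bottom_px=400.0, depth_m=10.0)
height = estimate_height_m(det, focal_length_px=1000.0)
label = classify_by_height(height, expected_height_m=6.0)
```

In this sketch the remaining height is compared with an assumed intact height for a two-story house. A real pipeline would also need camera calibration and a way to convert relative depth predictions into metric distances.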