Rapid and reliable disaster severity assessment from social media is difficult due to noisy content, modality imbalance, and limited labeled data. This paper introduces ReliefNet, a real-time multimodal framework for disaster severity classification that jointly analyzes images and text. The system integrates DualEmbedNet, a dual-encoder transformer for textual classification, and DisasterNet, a CNN with channel-wise attention for visual analysis, combined through an accuracy-weighted decision-level fusion mechanism that dynamically adapts to modality reliability at inference time. To mitigate labeling scarcity, we construct a unified dataset of 14,996 image-text pairs by merging CrisisMMD and TSEqD. Unlabeled samples are annotated using unsupervised multimodal K-Means clustering, followed by Grad-CAM guided ROI refinement and manual auditing to reduce label noise. Models are evaluated using an 80/10/10 train-validation-test split with metrics including accuracy, precision, recall, F1-score, calibration error, and robustness under missing-modality conditions. The image-only model achieves an F1-score of 97.66%, while ReliefNet attains 97.00% F1 with superior robustness and stability under noisy or incomplete inputs. Accuracy-weighted fusion improves F1-score by 0.7% over equal-weight fusion and outperforms standard CNN baselines such as ResNet and EfficientNet. All models operate in real time on a single NVIDIA GPU, demonstrating ReliefNet's practical deployability.

• ReliefNet: DualEmbedNet, DisasterNet, fusion for disaster severity detection.
• Built large dataset (13,853) via CrisisMMD, TSEqD, BERT-ResNet, and clustering.
• Achieved 97% F1-score via fusion, outperforming text (95%) and image (97.66%).
• Used SHAP, Grad-CAM, LIME for explainability, data quality, and trust in AI.
• Real-time web app with auto-scraping and 30× faster inference than baselines.
• Validated on real disaster data from news sites, proving wide applicability.
• Built a lightweight real-time detection framework for the first time.
Rahman et al. studied this question.
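The accuracy-weighted decision-level fusion named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting rule, so this assumes each modality's class probabilities are weighted by a per-modality validation accuracy and renormalized, and that a missing modality is simply dropped from the average. The function name `fuse_predictions`, the three severity labels, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence

# Hypothetical severity classes; the abstract does not list the label set.
SEVERITY_LABELS = ["mild", "moderate", "severe"]


def fuse_predictions(
    text_probs: Optional[Sequence[float]],
    image_probs: Optional[Sequence[float]],
    text_acc: float,
    image_acc: float,
) -> List[float]:
    """Weight each modality's class probabilities by its accuracy.

    Assumption: the weights are validation accuracies of the text and
    image models. A modality passed as None is treated as missing and
    its weight is excluded, so the remaining weights still sum to one.
    """
    candidates = [(text_probs, text_acc), (image_probs, image_acc)]
    available = [(p, a) for p, a in candidates if p is not None]
    if not available:
        raise ValueError("at least one modality must be present")

    n_classes = len(available[0][0])
    if any(len(p) != n_classes for p, _ in available):
        raise ValueError("modalities must predict the same number of classes")

    total_weight = sum(a for _, a in available)
    if total_weight <= 0:
        raise ValueError("accuracy weights must sum to a positive value")

    # Weighted average of the per-class probabilities.
    return [
        sum(a * p[i] for p, a in available) / total_weight
        for i in range(n_classes)
    ]


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    fused = fuse_predictions([0.2, 0.3, 0.5], [0.1, 0.2, 0.7], 0.6, 0.4)
    best = max(range(len(fused)), key=fused.__getitem__)
    print(SEVERITY_LABELS[best], fused)
```

Calling `fuse_predictions(None, image_probs, ...)` returns the image model's distribution unchanged, which is one simple way to model the missing-modality condition mentioned in the evaluation; the paper's own handling may differ.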