Introduction: Skin cancer is one of the most common malignancies worldwide, and early-stage diagnosis remains challenging because malignant lesions can be morphologically similar to benign ones. Most existing computer-aided diagnostic systems rely on single static images, overlooking temporal information that is critical for distinguishing progressive malignancy.

Methods: We propose a novel multi-agent spatiotemporal fusion framework to enhance diagnostic accuracy. The framework consists of three key components: (1) a spatial agent based on a convolutional neural network for high-fidelity static feature extraction; (2) a temporal agent employing gated recurrent units to model longitudinal lesion evolution; and (3) a collaboration agent that dynamically fuses spatial and temporal representations via an attention-based weighting strategy.

Results: Experiments on large-scale public dermoscopic datasets showed that our method achieved an accuracy of 94.5%, an F1-score of 93.8%, and an AUC of 0.97, outperforming traditional machine learning models, CNN classifiers, and 3D-CNN baselines. Ablation studies further confirmed the critical contribution of temporal modeling and adaptive fusion, particularly in differentiating early melanoma from atypical nevi.

Discussion: This work highlights the potential of spatiotemporal modeling to improve early skin cancer detection and provides a promising direction for AI-assisted diagnosis of other chronic diseases requiring longitudinal monitoring.
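To make the attention-based weighting of the collaboration agent concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the scoring function, so the sketch assumes a single hypothetical learned vector `score_vec` that scores each representation by a dot product, followed by a softmax and a weighted sum. The names `attention_fuse`, `spatial`, `temporal`, and `score_vec` are all hypothetical; in a real system `spatial` would come from the CNN and `temporal` from the final GRU hidden state.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(
    spatial: Sequence[float],
    temporal: Sequence[float],
    score_vec: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse two feature vectors with softmax attention weights.

    Assumption: each representation is scored by a dot product with a
    hypothetical learned vector `score_vec`; the paper may use a
    different scoring function.
    """
    if not (len(spatial) == len(temporal) == len(score_vec)):
        raise ValueError("all vectors must have the same length")

    # One scalar score per agent, normalised so the weights sum to 1.
    weights = softmax([dot(score_vec, spatial), dot(score_vec, temporal)])

    # Convex combination of the spatial and temporal representations.
    fused = [weights[0] * s + weights[1] * t for s, t in zip(spatial, temporal)]
    return fused, weights


if __name__ == "__main__":
    fused, weights = attention_fuse([1.0, 0.0], [0.0, 1.0], [2.0, 0.0])
    print("weights:", weights)
    print("fused:", fused)
```

Because the weights are recomputed for every input, a lesion whose appearance changes over time can shift weight toward the temporal representation, while a lesion with a single informative image can rely on the spatial one. That is the sense in which the fusion is dynamic.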
Zheng et al. (Thu,) studied this question.