Background/Objectives: Accurate differentiation of benign melanocytic nevi from invasive melanoma in dermato-oncology directly informs biopsy decisions and oncological management. Vision–language models (VLMs) are increasingly explored for image-based skin cancer assessment, but their diagnostic reliability and robustness to adversarial input manipulation remain insufficiently characterized. We evaluated three contemporary VLMs for diagnostic performance and susceptibility to single-word adversarial input manipulation (prompt injection) on dermoscopic images of histopathologically confirmed lesions.
Methods: Fifty-two dermoscopic images (26 benign melanocytic nevi, 26 invasive melanomas) were analyzed using Claude Opus 4.7, Gemini 3.1 Pro, and GPT-5.4 under four conditions: an unmodified baseline and three adversarial conditions with a single opposite-of-ground-truth label embedded as a visual overlay, in image metadata, or both. Three independent rounds per image × model × condition yielded 1872 classifications across 52 lesions (independent diagnostic units) and 16,848 structured-output observations in total.
Results: Baseline diagnostic accuracy ranged from 58.3% to 62.2%, with asymmetric sensitivity and specificity, including a pronounced benign-labeling bias in one model that missed 22 of 26 invasive melanomas. All adversarial conditions reduced accuracy to near-zero levels (0.0–1.9%; all p < 10⁻⁷ after Bonferroni correction). Repeated queries produced identical incorrect outputs in 98–100% of cases (Fleiss’ κ 0.97–1.00). Non-diagnostic outputs remained largely unchanged, and self-reported confidence did not decrease.
Conclusions: Contemporary VLMs show limited baseline performance and marked vulnerability to minimal adversarial input in dermoscopic skin cancer assessment. The failure selectively alters the malignancy decision while preserving surrounding outputs and confidence, indicating that, within the conditions evaluated here, these systems do not currently appear suitable for unsupervised clinical use in dermato-oncology in the absence of input-integrity safeguards and qualified human oversight.
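The sample sizes and the agreement statistic reported in the abstract can be checked with a short sketch. This is an illustration only, not the authors' analysis code: the condition names are made-up labels, the number of structured-output fields per classification is inferred by dividing the two reported totals, and `fleiss_kappa` is a generic textbook implementation that treats the three repeated rounds as the "raters". How the study actually computed κ is not stated in the abstract.

```python
# Design factors as described in the abstract; the condition names are
# illustrative labels, not identifiers taken from the study.
MODELS = ("Claude Opus 4.7", "Gemini 3.1 Pro", "GPT-5.4")
CONDITIONS = ("baseline", "visual_overlay", "metadata", "overlay_and_metadata")
N_IMAGES = 52   # 26 nevi + 26 melanomas
N_ROUNDS = 3    # independent repeats per image x model x condition


def design_counts(total_observations=16_848):
    """Recompute the sample sizes implied by the reported design."""
    per_cell = N_IMAGES * N_ROUNDS  # classifications per model and condition
    classifications = per_cell * len(MODELS) * len(CONDITIONS)
    # Fields per classification is inferred from the two totals (assumption).
    fields, remainder = divmod(total_observations, classifications)
    return {
        "per_cell": per_cell,
        "classifications": classifications,
        "fields_per_classification": fields,
        "remainder": remainder,
    }


def fleiss_kappa(counts):
    """Fleiss' kappa for a table of per-item category counts.

    counts[i][j] is how many of the repeated rounds assigned item i to
    category j; every row must sum to the same number of rounds.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    # Mean observed agreement across items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement from the marginal category proportions.
    n_categories = len(counts[0])
    p_e = sum(
        (sum(row[j] for row in counts) / (n_items * n_raters)) ** 2
        for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    print(design_counts())
    # Toy table: 4 lesions, 3 rounds each, columns = (benign, malignant).
    print(fleiss_kappa([[3, 0], [0, 3], [2, 1], [3, 0]]))
```

Under these assumptions the design yields 156 classifications per model and condition and 1872 in total, matching the reported figure, and the 16,848 observations divide evenly into 9 fields per classification. If accuracy was computed over those 156 classifications, the reported 58.3% and 62.2% would correspond to 91 and 97 correct outputs, though the abstract does not give the counts. The toy agreement table is invented for illustration and gives κ of about 0.625.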
Güler et al. (Wed,) studied this question.