What question did this study set out to answer?

The study aims to make model watermarks, which protect the copyright of deep learning models, resistant to watermark attacks such as ambiguity and fine-tuning attacks.

April 12, 2026 · Open Access

Black-Box Watermark Method Based on Vision Reasoning

Key Points

The study aims to make model watermarks, which protect the copyright of deep learning models, resistant to watermark attacks.
Developed a black-box watermark method called WaViR.
Created input-output pairs using original and hash images for watermark triggers.
Trained an image generation model with the trigger set.
Implemented simulated fine-tuning to increase watermark robustness.
Applied vision reasoning for ownership verification using the SSIM metric.
WaViR successfully resists ambiguity attacks and fine-tuning attacks.
Ownership is verified when the SSIM between the model's output image and the hash image exceeds the defined threshold.
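
As an illustration of the trigger-pair construction listed above, the following minimal Python sketch derives a hash image from an original image with SHA-256. The exact construction used by WaViR is not described here, so the counter-mode expansion of the digest, the 32x32 grayscale size, and the names `hash_image` and `make_trigger_pair` are all assumptions made for illustration.

```python
import hashlib
from typing import List, Tuple


def hash_image(original: bytes, width: int = 32, height: int = 32) -> List[List[int]]:
    """Expand SHA-256(original) into a height x width grayscale image (values 0..255).

    Assumption: the digest is stretched to the image size by hashing the
    digest together with a running counter. This is one simple deterministic
    choice, not necessarily the construction used in the paper.
    """
    seed = hashlib.sha256(original).digest()
    needed = width * height
    stream = bytearray()
    counter = 0
    while len(stream) < needed:
        # Each 32-byte block depends on the seed and on the block counter.
        stream += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    pixels = stream[:needed]
    # Reshape the flat byte stream into rows of pixel intensities.
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]


def make_trigger_pair(original: bytes) -> Tuple[bytes, List[List[int]]]:
    """Return the (input, expected output) pair for one trigger image."""
    return original, hash_image(original)
```

Because the expected output is computed from the input through a one-way function, an attacker who starts from a chosen output cannot practically recover an input that hashes to it, which is the property the method relies on against ambiguity attacks.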

Abstract

Model watermarking is a technique for protecting the copyright of deep learning models. However, existing watermark methods are vulnerable to watermark attacks. In an ambiguity attack, the attacker reverse-constructs an input from a preset output and uses this input-output pair as a forged watermark. In a fine-tuning attack, the attacker removes the watermark by fine-tuning the model. To overcome these limitations, this paper proposes a black-box watermark method called WaViR (Watermark based on Vision Reasoning). WaViR consists of three modules. In watermark construction, the original image is transformed into a hash image by a cryptographic hash function; the original image and its hash image form an input-output pair in the watermark trigger set. In watermark embedding, the trigger set is used to train the image generation model, and simulated fine-tuning is introduced to improve the robustness of the watermark. In watermark verification, vision reasoning is applied to verify ownership: for a given image in the trigger set, verification succeeds if the SSIM between the model’s output image and the hash image exceeds the threshold. Because the hash function is irreversible, an attacker cannot reverse-construct an input that has the hash relation with a preset output. Results show that WaViR can resist both ambiguity attacks and fine-tuning attacks.
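
The verification rule in the abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: it computes a single global SSIM over flattened grayscale pixels, whereas standard SSIM averages the same formula over local windows. The function names and the threshold value of 0.9 are hypothetical; the abstract does not state the threshold.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], dynamic_range: float = 255.0) -> float:
    """Single-window SSIM over two equally sized flat pixel sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("images must be non-empty and of equal size")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Standard SSIM stabilizing constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def verify_ownership(model_output: Sequence[float],
                     hash_img: Sequence[float],
                     threshold: float = 0.9) -> bool:
    """Ownership is accepted when SSIM(output, hash image) exceeds the threshold.

    The default threshold is a placeholder chosen for this sketch.
    """
    return global_ssim(model_output, hash_img) > threshold
```

In practice a windowed SSIM from an image-processing library would be the more usual choice; the global form is used here only to keep the sketch dependency-free.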
