March 18, 2024Open Access

Poisoning-Free Defense Against Black-Box Model Extraction

Key Points

Key points are not available for this paper at this time.

Abstract

Recent research has shown that an adversary can use a surrogate model to steal the functionality of a target deep learning model even under the black-box condition and without data curation, while the existing defense mainly relies on API poisoning to disturb the surrogate training. Unfortunately, due to poisoning, the defense is achieved at the price of fidelity loss, sacrificing the interests of honest users. To solve this problem, we propose an Adversarial Fine-Tuning (AdvFT) framework, incorporating the generative adversarial network (GAN) structure that disturbs the feature representations of out-of-distribution (OOD) queries while preserving those of in-distribution (ID) ones, circumventing the need for OOD sample collection and API poisoning. Extensive experiments verify the effectiveness of the proposed framework. Code is available at github.com/Hatins/AdvFT.

Read Full Paperexternally

Ask AI

Mark Helpful

Bookmark

Relay

View Full Paper