March 18, 2024 · Open Access

MaskMark: Robust Neural Watermarking for Real and Synthetic Speech

Abstract

High-quality speech synthesis models may be used to spread misinformation or impersonate voices. Audio watermarking can combat misuse by embedding a traceable signature in generated audio. However, existing audio watermarks typically demonstrate robustness to only a small set of transformations of the watermarked audio. To address this, we propose MaskMark, a neural network-based digital audio watermarking technique optimized for speech. MaskMark embeds a secret key vector in audio via a multiplicative spectrogram mask, allowing the detection of watermarked speech segments even under substantial signal-processing or neural network-based transformations. Comparisons to a state-of-the-art baseline on natural and synthetic speech corpora and a human subjects evaluation demonstrate MaskMark's superior robustness in detecting watermarked speech while maintaining high perceptual transparency.
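To make the idea of a multiplicative spectrogram mask concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract describes a neural network-based technique, whereas this sketch substitutes a fixed random projection (the hypothetical `key_to_mask`) for any learned component, and the names `embed_watermark`, `strength`, and `seed` are assumptions made purely for illustration.

```python
import numpy as np


def key_to_mask(key, n_bins, strength=0.1, seed=0):
    """Map a binary key to per-bin gains in [1 - strength, 1 + strength].

    Hypothetical stand-in: a fixed random projection is used here in place
    of the learned model that the abstract refers to.
    """
    signs = 2.0 * np.asarray(key, dtype=float) - 1.0       # bits {0,1} -> {-1,+1}
    rng = np.random.default_rng(seed)                       # seed plays the role of a shared secret
    proj = rng.standard_normal((n_bins, signs.size))
    pattern = np.tanh(proj @ signs / np.sqrt(signs.size))   # bounded to [-1, 1]
    return 1.0 + strength * pattern


def embed_watermark(audio, key, n_fft=512, hop=128, strength=0.1, seed=0):
    """Scale each STFT frame by a key-dependent mask and resynthesize.

    Assumes len(audio) >= n_fft. Samples not covered by any frame are
    returned as zeros.
    """
    audio = np.asarray(audio, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop

    # Analysis: windowed frames -> complex spectrogram (frames x bins).
    frames = np.stack(
        [audio[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=1)

    # Multiplicative mask, broadcast over all frames.
    mask = key_to_mask(key, spec.shape[1], strength, seed)
    spec = spec * mask

    # Synthesis: windowed overlap-add, normalized by the summed squared window.
    out_frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    out = np.zeros(len(audio))
    norm = np.zeros(len(audio))
    for i in range(n_frames):
        sl = slice(i * hop, i * hop + n_fft)
        out[sl] += out_frames[i]
        norm[sl] += window ** 2
    return out / np.maximum(norm, 1e-12)
```

Calling `embed_watermark(x, key)` on a one-second signal with a 16-bit key returns audio of the same length whose spectrum is scaled by at most `strength` per frequency bin. Detection, which the abstract highlights as the method's main strength, is deliberately left out of this sketch.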

Cite This Study

O’Reilly et al. (March 18, 2024) studied this question.

synapsesocial.com/papers/68e73894b6db6435876b202f https://doi.org/10.1109/icassp48485.2024.10447253
