LEAT: Towards robust deepfake disruption in real-world scenarios via Latent Ensemble Attack

Abstract

Deepfakes, malicious visual content created by generative models, pose a serious societal threat. To proactively mitigate the damage deepfakes cause, recent studies have employed adversarial perturbations to disrupt deepfake models. However, previous approaches primarily focus on generating distorted outputs for predetermined target attributes only, and therefore lack robustness in real-world scenarios where the target attributes are unknown. Additionally, the transferability of perturbations between Generative Adversarial Networks (GANs) and Diffusion Models remains unexplored. In this study, we emphasize both target-attribute transferability and model transferability for effective deepfake disruption. We propose a disruption method, Latent Ensemble ATtack (LEAT). By disrupting the latent representation, it generates distorted output images regardless of the given target attributes. This target-attribute-agnostic attack ensures robust disruption even with unknown attributes. Additionally, we introduce a Normalized Gradient Ensemble strategy that effectively aggregates the gradients of target models for iterative gradient attacks, enabling simultaneous attacks on various types of deepfake models, including both GAN-based and Diffusion-based models. Moreover, we demonstrate that evaluating disruption quality solely on pixel-level differences is insufficient, and we therefore propose an alternative protocol for comprehensively evaluating the success of the defense. Extensive experiments confirm the efficacy of our method in disrupting deepfakes in real-world scenarios, including white-box, gray-box, and black-box settings.
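
The abstract names two ideas, attacking the latent rather than a specific output, and aggregating normalized gradients across models inside an iterative attack. The following is a minimal toy sketch of how those ideas could fit together, not the authors' implementation. Everything concrete here is an assumption made for illustration: the "encoders" are small linear maps standing in for deepfake model encoders, the loss is the squared latent displacement, gradients are L2-normalized before summing, and the update is a signed step projected onto an L-infinity budget `eps`. The abstract does not specify any of these choices.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def latent_shift(W, delta):
    """Squared latent displacement ||W(x + delta) - W x||^2 = ||W delta||^2."""
    return sum(z * z for z in matvec(W, delta))


def latent_grad(W, delta):
    """Gradient of latent_shift with respect to delta: 2 W^T (W delta)."""
    z = matvec(W, delta)
    return [2.0 * sum(W[i][j] * z[i] for i in range(len(W)))
            for j in range(len(delta))]


def l2_normalize(g, tiny=1e-12):
    """Scale a gradient to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in g))
    return [v / (norm + tiny) for v in g]


def ensemble_attack(encoders, dim, eps=0.05, step=0.01, iters=20, seed=0):
    """Iterative signed-gradient ascent on the summed, normalized gradients."""
    rng = random.Random(seed)
    # Random start: the gradient of the latent displacement is zero at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in range(dim)]
    for _ in range(iters):
        agg = [0.0] * dim
        for W in encoders:
            # Normalizing first keeps a large-gradient model from dominating.
            g = l2_normalize(latent_grad(W, delta))
            agg = [a + b for a, b in zip(agg, g)]
        # Signed ascent step, then projection back onto the L-infinity ball.
        delta = [max(-eps, min(eps, d + step * math.copysign(1.0, a)))
                 for d, a in zip(delta, agg)]
    return delta


# Two hypothetical encoders whose gradient magnitudes differ by orders of magnitude.
W_a = [[1.0, 0.0], [0.0, 1.0]]
W_b = [[300.0, 100.0], [-100.0, 200.0]]
EPS = 0.05
delta = ensemble_attack([W_a, W_b], dim=2, eps=EPS)
shifts = [latent_shift(W, delta) for W in (W_a, W_b)]
```

Without the normalization step, the summed gradient in this toy setup would be almost entirely determined by `W_b`, which is the imbalance a normalized ensemble is meant to avoid. Because the loss is defined on the latent only, no target attribute appears anywhere in the attack loop.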

Title
LEAT: Towards robust deepfake disruption in real-world scenarios via Latent Ensemble Attack
Authors
심준교; Yoon Hyunsoo
DOI
10.1016/j.eswa.2025.127417
Publication Date
2025-06
Journal
Expert Systems with Applications
Volume 279