D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles

Hooda, Ashish; Mangaokar, Neal; Feng, Ryan; Fawaz, Kassem; Jha, Somesh; Prakash, Atul


arXiv:2202.05687 (cs)

[Submitted on 11 Feb 2022 (v1), last revised 6 Aug 2023 (this version, v3)]

Abstract: Detecting diffusion-generated deepfake images remains an open problem. Current detection methods fail against an adversary who adds imperceptible adversarial perturbations to the deepfake to evade detection. In this work, we propose Disjoint Diffusion Deepfake Detection (D4), a deepfake detector designed to improve black-box adversarial robustness beyond de facto solutions such as adversarial training. D4 uses an ensemble of models over disjoint subsets of the frequency spectrum to significantly improve adversarial robustness. Our key insight is to leverage a redundancy in the frequency domain and apply a saliency partitioning technique to disjointly distribute frequency components across multiple models. We formally prove that these disjoint ensembles lead to a reduction in the dimensionality of the input subspace where adversarial deepfakes lie, thereby making adversarial deepfakes harder to find for black-box attacks. We then empirically validate the D4 method against several black-box attacks and find that D4 significantly outperforms existing state-of-the-art defenses applied to diffusion-generated deepfake detection. We also demonstrate that D4 provides robustness against adversarial deepfakes from unseen data distributions as well as unseen generative techniques.
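
The idea of distributing frequency components disjointly across ensemble members can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the frequency transform, the saliency measure, or the partitioning rule, so the code below assumes a 2D FFT, a placeholder random saliency map, and a round-robin assignment by saliency rank. The function names `partition_frequencies` and `masked_spectra` are hypothetical.

```python
import numpy as np

def partition_frequencies(saliency, num_models):
    """Split frequency indices into disjoint boolean masks.

    Assumption: components are dealt out round-robin in order of
    decreasing saliency, so each model receives a similar share of
    salient frequencies. The paper's actual rule may differ.
    """
    order = np.argsort(-saliency, axis=None)  # flat indices, most salient first
    masks = np.zeros((num_models, saliency.size), dtype=bool)
    for rank, flat_idx in enumerate(order):
        masks[rank % num_models, flat_idx] = True
    return masks.reshape((num_models,) + saliency.shape)

def masked_spectra(image, masks):
    """Return one frequency-domain view per model, zeroed outside its mask."""
    spectrum = np.fft.fft2(image)  # assumption: FFT stands in for the transform
    return [np.where(mask, spectrum, 0) for mask in masks]

rng = np.random.default_rng(0)
image = rng.random((8, 8))       # stand-in for a grayscale image
saliency = rng.random((8, 8))    # placeholder per-frequency saliency scores
masks = partition_frequencies(saliency, num_models=4)
views = masked_spectra(image, masks)
```

Each frequency component appears in exactly one mask, so the views sum back to the full spectrum while no two models see the same component. In a full detector, each view would feed a separately trained classifier.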

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2202.05687 [cs.LG]
	(or arXiv:2202.05687v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.05687

Submission history

From: Neal Mangaokar
[v1] Fri, 11 Feb 2022 15:21:11 UTC (85 KB)
[v2] Wed, 5 Oct 2022 03:37:06 UTC (1,853 KB)
[v3] Sun, 6 Aug 2023 03:22:53 UTC (5,030 KB)
