Deepfake Detection via Active Perception: A Deep Q-Learning Framework for Interpretable Visual Forensics
Abstract
The rapid development of generative models has led to the widespread synthesis of realistic deepfake media, which poses a fundamental threat to digital trust and media verification. Most current deepfake detection methods rely on supervised convolutional neural networks (CNNs) that operate on global image representations, which tend to generalize poorly and offer little interpretability. This paper presents a different paradigm for deepfake detection: it casts the problem as a sequential decision process and solves it with reinforcement learning (RL). To that end, the paper proposes a patch-based deep Q-learning (DQN) approach in which an agent selectively explores local face regions to detect manipulation artifacts. Unlike typical feedforward classifiers, this approach supports fine-grained spatial exploration and makes the model interpretable by highlighting the regions that inform its decision. The research is offered as a pilot study that examines feasibility and interpretability rather than sample size and benchmarking. Experiments were performed on balanced partial deepfake image datasets released for public use (2,400 images), with each image treated as a single RL episode. On this data, the proposed approach achieves an AUC of 0.92. Notably, it reaches this performance while processing only 18.2% of the pixels per image on average, which supports the efficiency and forensic potential of the proposed active perception framework.
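To make the "one image, one episode" idea concrete, the following is a minimal sketch of how an agent might visit patches and how the fraction of processed pixels could be computed. Everything specific here is an assumption for illustration and is not taken from the paper: the 8x8 grid of equal, non-overlapping patches, the fixed step budget, the epsilon-greedy rule, and the hypothetical `toy_q` function that stands in for a trained Q-network.

```python
import random


def toy_q(visited):
    """Hypothetical stand-in for a trained Q-network: one score per patch.

    A real model would condition on the patches seen so far; this stub
    ignores `visited` and simply prefers higher patch indices.
    """
    return [float(p) for p in range(64)]


def run_episode(q_values_fn, grid=(8, 8), max_steps=12, epsilon=0.0, rng=None):
    """Treat one image as one episode of sequential patch selection.

    Returns the ordered list of visited patch indices and the fraction of
    the image covered (valid for equal, non-overlapping patches).
    """
    rng = rng or random.Random(0)
    n_patches = grid[0] * grid[1]
    visited = []
    for _ in range(max_steps):
        remaining = [p for p in range(n_patches) if p not in visited]
        if not remaining:
            break
        if rng.random() < epsilon:
            # Explore: pick a random unvisited patch.
            action = rng.choice(remaining)
        else:
            # Exploit: pick the unvisited patch with the highest Q-value.
            q = q_values_fn(visited)
            action = max(remaining, key=lambda p: q[p])
        visited.append(action)
    coverage = len(visited) / n_patches
    return visited, coverage


if __name__ == "__main__":
    patches, frac = run_episode(toy_q)
    print(patches, frac)
```

The ordered list of visited patches is what would make such a model inspectable: it records which regions the agent chose to examine before a decision. In this sketch the coverage is simply the number of visited patches divided by the total number of patches; how the paper measures its reported pixel fraction is not stated in the abstract.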
This work is licensed under a Creative Commons Attribution 4.0 International License.