DOI of the published article https://doi.org/10.36040/jati.v10i1.17105
Web-Based Application Design to Differentiate AI-Generated Videos from Original Videos Using Transformer Method
Perancangan Aplikasi Berbasis Web untuk Membedakan Video yang Dihasilkan oleh (AI) dengan Video Asli Menggunakan Metode Transformer
DOI:
https://doi.org/10.21070/ups.10422
Keywords:
deepfake, detection, deep learning, ConvNeXt, Vision Transformer
Abstract
This research proposes a deepfake detection model based on a hybrid architecture that combines ConvNeXt and Vision Transformer (ViT). ConvNeXt is used to extract local features of facial images, while Vision Transformer is used to capture global context more comprehensively. Furthermore, the model is equipped with two reconstruction paths, namely Autoencoder (AE) and Variational Autoencoder (VAE), to increase the model's sensitivity in detecting subtle visual manipulation artifacts.
The dataset is derived from publicly available authentic and deepfake videos, from which approximately one million facial image frames were extracted. The data is divided into training, validation, and testing sets. The experimental results show that the model achieves an accuracy of up to 98% on the test data, along with high F1-scores on the DFDC (99.1), FF++ (95.5), TIMIT (98.3), and Celeb-DF v2 (91.6) datasets, which average 96.125.
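As a quick arithmetic check of the reported average, the sketch below takes the mean of the four per-dataset F1-scores quoted above. The variable names are illustrative, and it assumes the average is unweighted rather than weighted by dataset size, which the abstract does not state.

```python
# Per-dataset F1-scores as quoted in the abstract.
f1_scores = {
    "DFDC": 99.1,
    "FF++": 95.5,
    "TIMIT": 98.3,
    "Celeb-DF v2": 91.6,
}

# Assumption: an unweighted (macro) mean across the four datasets.
mean_f1 = sum(f1_scores.values()) / len(f1_scores)

print(f"Mean F1 across {len(f1_scores)} datasets: {mean_f1:.3f}")
```

Under that assumption the result matches the 96.125 figure given in the abstract.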
License
Copyright (c) 2026 UMSIDA Preprints Server

This work is licensed under a Creative Commons Attribution 4.0 International License.
