ENSEMBLE TECHNIQUES FOR DEEPFAKE DETECTION: COMPARING STACKING, WEIGHTED VOTING AND AVERAGING APPROACHES

ABSTRACT

Deepfake detection has become an increasingly pressing challenge as highly realistic, computer-generated synthetic media becomes more common. Although previous studies have examined deep learning and conventional machine learning models separately, the best ensemble technique for integrating CNN-extracted features with SVM and XGBoost classifiers remains unclear. This study presents a comparative analysis of three ensemble methods (stacking, weighted voting, and averaging) applied to a hybrid model consisting of a CNN trained from scratch together with SVM and XGBoost classifiers trained through transfer learning. On a publicly available Kaggle dataset of 140,000 images, stacking outperforms both weighted voting and averaging, reaching an accuracy of 95.27% compared with 95% for weighted voting and 95.09% for averaging. Statistical tests validate these differences. The experimental results indicate that stacking is the most effective of the three ensemble methods when CNN, SVM, and XGBoost are integrated for deepfake image detection.

Keywords: Deepfakes, SVM, CNN, Deep Learning, Ensemble methods, XGBoost.
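
To make the distinction between the three ensemble methods named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the implementation used in this study. The probability values, the voting weights, the hard-voting rule, and the logistic-regression meta-learner are all illustrative assumptions, and every function name is hypothetical. Each row of `probs` holds the "fake" probability produced by the CNN, SVM, and XGBoost for one image.

```python
import math

def average_probs(probs):
    """Averaging: unweighted mean of the base-model probabilities per sample."""
    return [sum(row) / len(row) for row in probs]

def weighted_vote(probs, weights, threshold=0.5):
    """Weighted (hard) voting: each model casts a 0/1 vote scaled by its weight."""
    half = sum(weights) / 2.0
    labels = []
    for row in probs:
        score = sum(w for w, p in zip(weights, row) if p >= threshold)
        labels.append(1 if score > half else 0)
    return labels

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stacker(probs, labels, lr=1.0, epochs=3000):
    """Stacking: fit a logistic-regression meta-learner on base-model outputs
    using batch gradient descent. Returns (weights, bias)."""
    n, m = len(probs), len(probs[0])
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * m, 0.0
        for row, y in zip(probs, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - y
            for j in range(m):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def stack_predict(probs, w, b):
    """Meta-learner probability that each sample is fake."""
    return [sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) for row in probs]

# Toy base-model outputs (assumed values): columns are CNN, SVM, XGBoost.
probs = [
    [0.9, 0.8, 0.7],
    [0.2, 0.4, 0.1],
    [0.6, 0.3, 0.8],
    [0.4, 0.6, 0.2],
    [0.7, 0.9, 0.6],
    [0.1, 0.2, 0.3],
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = fake, 0 = real

avg_labels = [1 if p >= 0.5 else 0 for p in average_probs(probs)]
vote_labels = weighted_vote(probs, weights=[0.5, 0.2, 0.3])  # assumed weights
w, b = fit_stacker(probs, labels)
stack_labels = [1 if p >= 0.5 else 0 for p in stack_predict(probs, w, b)]
```

In practice the meta-learner would be fitted on held-out or out-of-fold base-model predictions rather than on the same samples it is evaluated on; the toy example above fits and predicts on the same rows only to keep the sketch short.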