School of Computing
National College of Ireland
DeepFake Videos Detection Using Machine Learning
Nikhil Reddy Byreddy
17136563
Abstract
With the advancement of technology and machine learning, it is becoming easier and easier for people to produce fake videos and spread fake news, especially with the creation of GAN models capable of producing convincingly realistic fake videos. The main aim of this project is to help detect these fake videos and distinguish them from real ones using machine learning techniques. In this project, data from the FaceForensics++ dataset is considered; frames are generated from the videos, followed by the application of a novel classification technique based on face detection over the videos. The resulting images are transformed using various parameters and models are compared against them. The highest accuracy of 95%, with an AUC value of 0.95, is obtained using Laplace-transformed images. Quantum machine learning is also applied and was able to train the model in 50% less time than the classical model.
1 Introduction
Creating fake videos and images is nothing new in today's era; it has become one of the most discussed topics in digital society. A simple image or photo can no doubt say a lot about a person, but an audio or video recording is far more persuasive than an image (Chesney and Citron; 2019). Imagine a video showing a political discussion by a renowned person: the video could be manipulated using tools persuasive enough to make us believe it is real. The increasing pace of digital technology can turn this worst nightmare into reality. According to recent research, the number of fake videos has doubled in the last nine months. Researchers at a cyber-security company found that 14,698 deepfake videos were published online this year, of which 96% replaced the faces of celebrities in sexual content (Jones; 2019). The increasing amount of fake content in digital media raises the question of whether the image or video we are watching, or the audio we are listening to, is fake or real.
This is where deep learning techniques come into the picture to distinguish between them (Pothabattula; 2019). There have already been many cases where fake videos and images of celebrities and political figures were released, creating a wrong impression about them. To create realistic images and videos, a deepfake algorithm requires a huge amount of image and video data to train a model. Videos and images of celebrities and politicians are easily available in large quantities, so they are the main targets of deepfake techniques (Nguyen et al.; 2019). According to a report by Tucker (2019), even satellite images of the Earth can be faked to misguide troops by creating fake bridges and roads. To counter such circumstances, GANs (generative adversarial nets) are used to identify objects and confirm whether they are real or fake.
4. Transformation on images
7. Evaluation of models
The rest of the paper is organised as follows. The next section closely examines previous work on deepfakes and quantum machine learning. Section 3 discusses the methodology and the process, giving a detailed account of dataset extraction, the pre-processing steps and the data mining techniques used. Section 4 covers the implementation, the evaluation process and the final outcome of the project. The last chapter, Section 5, concludes the paper with an overall conclusion and directions for future work.
2 Related Work
The discussion in this section relates to the prior literature on video classification. It is divided into several subsections: 1) a detailed study of deepfakes, 2) quantum machine learning and its uses, 3) a literature survey on video classification using different machine learning techniques, and 4) a comparison and summary.
The expressions are collected by an RGB sensor and transferred online to the target. The target video is manipulated in a photo-realistic fashion, making the changes visually impossible to notice (Thies et al.; 2018). The other type of false face creation is identity manipulation, which changes the identity by replacing the face of one person with another, also known as face swapping. Deepfake performs this swapping with the help of deep learning techniques. The graphics-based techniques used for manipulation are Face2Face and FaceSwap, which are popular, while the deep learning approaches are Deepfakes and NeuralTextures. The paper also discusses how to automatically detect the changes made in a video or image. Advances in deep learning make it possible to learn image features with a CNN and to overcome the detection problem by training the network in a supervised manner. With random dimensions and compressions, a benchmark was proposed over these four manipulation methods, which had never been done before.
Many researchers have experimented with different methods for detecting deepfake videos. In one of the studies, by Parkhi et al. (2015), the main purpose was to verify and identify faces using different face alignment and metric learning approaches on a large dataset of around two million faces collected from various sources. This was done with a CNN architecture that also filters out unnecessary details. Similar research was done by Schroff et al. (2015) on FaceNet, where a deep convolutional network learns a Euclidean embedding per image. The convolutional network is trained so that squared L2 distances in the embedding space correspond directly to face similarity: images of the same face have small distances while images of different faces have large distances. A face is verified by checking the distance between two faces, which in turn becomes a KNN classification problem. The approach was verified on different datasets, achieving 99.63% accuracy on Labelled Faces in the Wild (LFW) and 95.12% accuracy on YouTube Faces DB.
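As a minimal sketch of this verification-by-distance idea (not the authors' code): the embed function below is a hypothetical stand-in for a pre-trained FaceNet-style network, and the threshold is illustrative.

import numpy as np

def embed(face_image):
    # Hypothetical stand-in for a pre-trained FaceNet-style network that
    # maps a cropped face to an L2-normalised 128-dimensional embedding.
    raise NotImplementedError("plug in a pre-trained embedding model here")

def same_person(face_a, face_b, threshold=1.1):
    # Squared L2 distance in the embedding space: small for the same face,
    # large for different faces, so a simple threshold (or KNN over stored
    # embeddings) decides the verification.
    distance = np.sum((embed(face_a) - embed(face_b)) ** 2)
    return distance < threshold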
In one of the research studies, by Korshunov and Marcel (2019), the replacement of faces is done by a pre-trained GAN (generative adversarial network), popularly known as Faceswap-GAN (Nguyen et al.; 2019). The GAN was first proposed by Warde-Farley et al. (2014), where a detailed study of the adversarial process was carried out in which two models are trained: a generative model, which captures the data distribution, and a discriminative model, which estimates the probability that a sample came from the training data. The generative model is trained so as to increase the probability of the discriminative model making a mistake. It is this kind of model that generates deepfakes, which ultimately affects the quality of the video.
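The adversarial objective described above is commonly written as the following minimax game, with generator G, discriminator D, data distribution p_data and noise prior p_z (shown here for reference; the notation is the standard one rather than taken from the cited paper):

\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]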
According to Korshunov and Marcel (2019), face recognition algorithms such as the VGG and FaceNet neural networks do not work properly on deepfake videos and fail to distinguish between original and tampered videos, with a 95% error rate. Lip-sync based algorithms also failed to detect the mismatch between speech and lip movement. However, an image-based approach with an SVM classifier was found to give an 8.97% error rate on deepfake videos. It is therefore concluded that image-based approaches recognise deepfake videos more accurately than the other methods, although advanced face swapping techniques remain a challenge.
Deepfakes are even generated in the field of virtual reality. Both ends, i.e. the target and the source, are fed into the generator network as two different inputs which reconstruct the original image. An additional discriminator network is used in a GAN to separate the real input from the forged output. Source and target are trained separately via the generator network, which includes both an encoder and a decoder module. After this process, virtual fakes can easily be generated by feeding source images into the target generator (Bose and Aarabi; 2019).
In 1985, on the basis of Feynman's ideas, Deutsch (Deutsch; 1985) proposed his quantum Turing machine. Deutsch explained the technique of quantum parallelism, which in turn uses the principle of superposition from quantum mechanics. Using this superposition principle, the Turing machine was able to encode a huge number of inputs in less memory and perform calculations on all the inputs simultaneously. After Deutsch, in 1994 Shor (Shor; 1994) made a further advance in quantum computing, exploring the power of quantum parallelism and creating a polynomial-time algorithm for prime factorisation that takes exponentially less time than on classical computers. Then in 1996 Grover (Grover; 1996) found a quantum algorithm to find a single stored element in an unsorted database in roughly the square root of the time taken by classical computers. From this point on, quantum computing became one of the most exciting fields of science.
In the following section the basic operations of quantum computing that make it fast and efficient are outlined; to understand this, consider the Deutsch–Jozsa algorithm (James; 2001). The basic unit in quantum computing is called the quantum bit, denoted 'qubit'. A qubit can hold the states 0 and 1 at the same time, for example as the horizontal and vertical polarisation of a photon. Mathematically, a qubit is represented by a unit vector, written in Dirac notation as

|ψ⟩ = α0|0⟩ + α1|1⟩,    (1)

where |0⟩ and |1⟩ are the two basis states and α0 and α1 are complex numbers with

|α0|² + |α1|² = 1.    (2)

The states |0⟩ and |1⟩ are called the computational basis states of the qubit and correspond to the two states 0 and 1 of classical bits. The numbers α0 and α1 are called the probability amplitudes of the state |ψ⟩. The main and advantageous difference between classical and quantum bits is that a qubit can be in a superposition of both |0⟩ and |1⟩, as in Eq. (1). To perform mathematical operations the state must be known at any point in time, just as in a classical computer; to enable this, α0 and α1 are subject to the normalisation constraint that the sum of the squared amplitudes always equals 1, i.e. a total probability of 1 (Debnath et al.; 2016). An example qubit state is:

|ψ⟩ = (1/√2)(|0⟩ − |1⟩).    (3)
Since a qubit can hold both 0 and 1 at the same time, two qubits can represent 4 states and three qubits can represent 8 states. By this nature, qubits can perform calculations faster and more efficiently.
The two concepts that make quantum computing fast are superposition and entanglement.
Superposition is the concept whereby a qubit can represent the two states 0 and 1 at the same time. Both states are superposed, which helps represent more values with fewer variables and makes calculations faster than with ordinary 0/1 logic (Ying; 2010).
Entanglement is the other major and crucial concept of quantum computing. It is the state in which two qubits cannot be represented as the vector (tensor) product of the individual qubit states. Such a state carries additional information beyond the ordinary product state; in this way the physical resources of quantum computing are used extensively and information processing is made faster (Ying; 2010).
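As an illustration only (Qiskit is assumed here purely for the example and is not part of the original project), the circuit below puts one qubit into an equal superposition with a Hadamard gate and then entangles it with a second qubit using a CNOT, producing a Bell state:

from qiskit import QuantumCircuit

qc = QuantumCircuit(2, 2)
qc.h(0)                     # superposition: (|0> + |1>)/sqrt(2) on qubit 0
qc.cx(0, 1)                 # entanglement: the joint state is no longer a product
qc.measure([0, 1], [0, 1])  # measuring yields 00 or 11 with equal probability
print(qc.draw())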
In 2019 the authors Fastovets et al. (2019) conducted experiments investigating machine learning methods in quantum computing theory, using IBM's quantum processor. According to the authors, quantum machine learning is a hybrid approach combining classical and quantum algorithms: quantum approaches are used to analyse quantum states, while quantum algorithms can improve the efficiency of classical data science algorithms exponentially. In this paper they implemented the classical k-means algorithm using a quantum minimisation algorithm and the SWAP test (Mengoni and Di Pierro; 2019), and also implemented tree tensor networks, which help in implementing quantum machine learning algorithms. After the implementation they found that the quantum versions of the algorithms worked exponentially faster than the classical ones. Hence quantum machine learning is seen as the future of machine learning for handling extensively large datasets and producing predictions.
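For reference, a minimal sketch of the SWAP test mentioned above (again assuming Qiskit purely for illustration; the rotation angles are arbitrary example states): the probability of measuring the ancilla as 0 is 1/2 + |<a|b>|^2 / 2, which yields the state overlap that a quantum k-means step can use as a similarity measure.

from qiskit import QuantumCircuit

qc = QuantumCircuit(3, 1)   # qubit 0 = ancilla, qubits 1 and 2 hold |a> and |b>
qc.ry(0.8, 1)               # prepare example state |a>
qc.ry(1.9, 2)               # prepare example state |b>
qc.h(0)                     # ancilla into superposition
qc.cswap(0, 1, 2)           # controlled-SWAP of the two data qubits
qc.h(0)
qc.measure(0, 0)            # P(0) = 1/2 + |<a|b>|^2 / 2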
The LSTM unit acts as an intermediate layer that is trained without any auxiliary loss function. On suspected videos this approach is reported to detect video manipulation with around 94% accuracy, higher than other detectors.
Machine learning techniques can check the consistency of patterns in any kind of data, which helps to determine whether the data has been manipulated. However, a good and sufficiently large dataset is required for this; video tampering is not detected well on small datasets (Johnston et al.; 2019). A self-supervised method using a Siamese neural network detects whether pixel patches share the same image metadata or the same imaging pipeline, or contain only a part of the original image. A model was trained on the automatically recorded EXIF metadata of real images to check whether an image has been tampered with or is still the original, i.e. whether the content of the image comes from a single imaging pipeline or from different ones. Three processing techniques were used, namely re-scaling, Gaussian blur and JPEG compression, and eight features were generated from the EXIF metadata. This was new research in the field of image tampering localisation (Liu et al.; 2018).
Due to the advancement of AI, the amount of fake digital data has risen sharply. AI has the capability to change the actual content of a video, audio clip or image and give a false impression, so it is very necessary to trace back and find the provenance of digital media. If the proof of authenticity (PoA) of the digital content can be established, it becomes easier to remove suspicious content. In the research done by Hasan and Salah (2019), a novel blockchain-based solution was introduced which provides a framework using Ethereum smart contracts to track and find the original source even if the content has been tampered with multiple times. It uses hashes from the InterPlanetary File System (IPFS), which stores the digital content. If the origin of the source is trustworthy and reputable, the content can be shown to be real and legitimate.
3 Methodology
3.1 Introduction
As this project fundamentally falls within the area of data mining and data science, KDD, one of the most widely used methodologies in this category, was chosen for this project. The reason for choosing KDD over CRISP-DM is that CRISP-DM usually ends with deployment of the project, as it is designed to suit business applications, whereas in KDD deployment is not a mandatory final step, which suits this project well. The following sections give details about the methodology used and about the design process used to implement this project.
4 Design Specification
For this project a two-tier design is implemented. As shown in Figure 2, the two tiers are 1) the Presentation Layer and 2) the Business Logic Layer.
Figure 2: Project Design process Of DeepFake Videos Detection
5 Dataset
The dataset chosen for this project is FaceForensics++, a forensics dataset containing over 1000 videos sourced from YouTube plus a few videos made with hired actors. For the purpose of this project, and due to computational expense, only 100 videos each from the original and deepfake categories are used. The main reason for choosing only 100 videos is the computational constraint imposed by the limited usage access in Colab.
6 Implementation
In this section the extraction of frames, face detection, transformations on the images and feature extraction are explained in detail. All models are evaluated and compared using AUC scores and ROC curves, and are also compared with the models from the literature review. The quantum machine learning model is likewise compared against the best model, and the results are outlined in this section.
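The exact extraction settings are not restated in this excerpt; as a minimal sketch, the frame sampling and face cropping described above could be done with OpenCV roughly as follows (the Haar cascade detector and the sampling rate are illustrative stand-ins rather than the project's confirmed choices):

import cv2

def extract_faces_from_video(video_path, every_n_frames=30):
    # Sample every n-th frame from the video and crop any detected faces.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(video_path)
    faces, frame_index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if frame_index % every_n_frames == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
                faces.append(frame[y:y + h, x:x + w])
        frame_index += 1
    capture.release()
    return faces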
Figure 4: Face Detection And Extraction
in different scenarios. This is because, mathematically, the Laplace transform of an image is obtained from its second derivative, whereas the Canny edge map is obtained from the first derivative. From Figure 5 and Figure 6 it can be seen, on close observation, that there are a few differences between the resulting images.
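A minimal sketch of the two transforms with OpenCV (the file name and Canny thresholds are illustrative, not the project's exact parameters):

import cv2

face = cv2.imread("face_frame.png", cv2.IMREAD_GRAYSCALE)  # illustrative path

# Laplacian: a second-derivative operator that responds to rapid intensity changes.
laplace = cv2.Laplacian(face, cv2.CV_64F)

# Canny: a first-derivative (gradient) based edge detector with two hysteresis thresholds.
canny = cv2.Canny(face, 100, 200)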
6.6 Extraction of arrays from images
This step is mandatory because the machine learning models and algorithms cannot process unstructured data such as images without any transformation. To enable this, the image processing step of extracting arrays from the digital images is performed using the computer vision library CV2 (OpenCV) in Python.
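A minimal sketch of this conversion step (the target size and the scaling to [0, 1] are illustrative assumptions, not the project's confirmed settings):

import cv2
import numpy as np

def images_to_arrays(image_paths, size=(128, 128)):
    # Read each image with OpenCV, resize it, and stack the results into one
    # numpy array that a machine learning model can consume.
    arrays = [cv2.resize(cv2.imread(path), size) for path in image_paths]
    return np.stack(arrays).astype("float32") / 255.0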
This is implemented on the following sets of images, listed below, thereby resulting in 5 different datasets:
1. Raw images
Figure 8: Accuracy With Raw Images - Overfit
Figure 10: Accuracy With Face Images
From the above figures it can be concluded that the results are good: the benchmark literature paper by Rössler, Cozzolino, Verdoliva, Riess, Thies and Nießner (2019) reported an accuracy of 85%, and the novel technique of classifying on detected face images also overcame the problem of overfitting.
6.10 Evaluation and Results of CNN on face images with Laplace transform
The dataset containing the arrays of face images after the Laplace transformation is used here. There are 742 images, and the train and test sets are formed as an 80%/20% split, i.e. 593 training and 149 test images. A CNN is used to classify the images, and it achieves a good accuracy of 91% (Figure 12) and an AUC of 0.96 (Figure 13).
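The exact architecture is not restated in this section, so the following is only a minimal Keras sketch of a binary CNN classifier of the kind described; the input shape, layer sizes and optimiser are illustrative assumptions.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # real vs. fake
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)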
From the above figures it can be concluded that the results are good: the benchmark literature paper by Rössler, Cozzolino, Verdoliva, Riess, Thies and Nießner (2019) reported an accuracy of 85%, and this model also beat the AUC value of 0.95 obtained with the plain face images. Although the accuracy here is 1% lower, the one-point increase in AUC leads to this model being selected as the best model.
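The AUC and ROC values used for this comparison can be computed with scikit-learn; a minimal sketch, continuing from the CNN sketch above with illustrative variable names:

from sklearn.metrics import roc_auc_score, roc_curve

# y_test holds the true labels (0 = real, 1 = fake); y_prob the sigmoid outputs.
y_prob = model.predict(X_test).ravel()
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
print("AUC =", round(roc_auc_score(y_test, y_prob), 2))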
Figure 15: ROC With Face Images Canny Edge
These are interesting results when compared to the Laplace-transformed dataset: visually there are no big differences between the images, yet the results differ, which shows that there is a clear logical and operational difference between the two transforms.
Figure 16: Accuracy With Face Images ADM
50% faster than the GPU implementation. Thereby the research question and the sub-research question have been answered.
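The quantum framework used for this experiment is not restated in this excerpt; purely as an illustration of the hybrid classical-quantum training loop discussed in Section 2, a variational classifier sketch with PennyLane (all names, angles and data here are hypothetical) might look as follows.

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(weights, features):
    # Encode two features as rotation angles, apply trainable rotations,
    # entangle the qubits, and read out an expectation value as the score.
    qml.RY(features[0], wires=0)
    qml.RY(features[1], wires=1)
    qml.RY(weights[0], wires=0)
    qml.RY(weights[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0))

weights = np.array([0.1, 0.2], requires_grad=True)
optimizer = qml.GradientDescentOptimizer(stepsize=0.3)
x_sample, y_sample = np.array([0.5, 1.2]), 1.0   # one hypothetical training pair
weights = optimizer.step(lambda w: (circuit(w, x_sample) - y_sample) ** 2, weights)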
References
Abu-El-Haija, S., Kothari, N., Lee, J., Natsev, P., Toderici, G., Varadarajan, B. and Vi-
jayanarasimhan, S. (2016). Youtube-8m: A large-scale video classification benchmark,
CoRR abs/1609.08675.
URL: http://arxiv.org/abs/1609.08675
Azevedo, A. I. R. L. and Santos, M. F. (2008). KDD, SEMMA and CRISP-DM: a parallel overview, IADS-DM.
Bloomberg (2018). How faking videos became easy — and why that's so scary.
URL: https://fortune.com/2018/09/11/deep-fakes-obama-video/
Bose, A. J. and Aarabi, P. (2019). Virtual fakes: Deepfakes for virtual reality, 2019 IEEE
21st International Workshop on Multimedia Signal Processing (MMSP), IEEE, pp. 1–1.
Chesney, R. and Citron, D. (2019). Deepfakes and the new disinformation war.
Debnath, S., Linke, N. M., Figgatt, C., Landsman, K. A., Wright, K. and Monroe, C.
(2016). Demonstration of a small programmable quantum computer with atomic qubits,
Nature 536(7614): 63–66.
Deepfakes github (2018).
URL: https://github.com/deepfakes/faceswap
Deutsch, D. (1985). Quantum theory, the Church-Turing principle and the universal quantum computer, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 400(1818): 97–117.
Fastovets, D. V., Bogdanov, Y. I., Bantysh, B. I. and Lukichev, V. F. (2019). Machine
learning methods in quantum computing theory, Cornell University .
Feynman, R. P. (1982). Simulating physics with computers, International Journal of The-
oretical Physics 21(6-7): 467–488.
Grover, L. K. (1996). A fast quantum mechanical algorithm for database search, Proceedings of the 28th Annual ACM Symposium on the Theory of Computing.
Guera, D. and Delp, E. J. (2019). Deepfake video detection using recurrent neural networks,
Video and Image Processing Laboratory (VIPER) .
Hasan, H. R. and Salah, K. (2019). Combating deepfake videos using blockchain and smart
contracts, IEEE Access 7: 41596–41606.
Hui, J. (2018). How deep learning fakes videos (deepfake) and how to detect it?
Johnston, P., Elyan, E. and Jayne, C. (2019). Video tampering localisation using features
learned from authentic content, Neural Computing and Applications .
Liu, A., Huh, M., Owens, A. and Efros, A. A. (2018). Fighting fake news: Image splice
detection via learned self-consistency, The European Conference on Computer Vision
(ECCV) pp. 101–117.
Nguyen, T. T., Nguyen, C. M., Nguyen, D. T., Nguyen, D. T. and Nahavandi, S. (2019).
Deep learning for deepfakes creation and detection.
Ramprasath, M., Anand, M. and Hariharan, S. (2018). Image classification using convolu-
tional neural networks, International Journal of Pure and Applied Mathematics 119.
Rössler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J. and Nießner, M.
(2019). Faceforensics++: Learning to detect manipulated facial images, arXiv preprint
arXiv:1901.08971 .
Schroff, F., Kalenichenko, D. and Philbin, J. (2015). Facenet: A unified embedding for
face recognition and clustering, 2015 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 815–823.
Shor, P. W. (1994). Algorithms for quantum computation: Discrete logarithms and factoring, Symposium on Foundations of Computer Science, IEEE Press, Los Alamitos.
Singh, V. (2019). Image forgery detection.
URL: https://towardsdatascience.com/image-forgery-detection-2ee6f1a65442
Thies, J., Zollhöfer, M., Stamminger, M., Theobalt, C. and Nießner, M. (2018). Face2face:
Real-time face capture and reenactment of rgb videos.
Tucker, P. (2019). The newest AI-enabled weapon: 'deep-faking' photos of the Earth.
URL: https://www.defenseone.com/technology/2019/03/next-phase-ai-deep-faking-whole-world-and-china-ahead/155944/
Warde-Farley, D., Ozair, S., Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Courville, A. and Bengio, Y. (2014). Generative adversarial nets, 1.
Ying, M. (2010). Quantum computation, quantum theory and ai, Artificial Intelligence
174(2): 162–176.