Deepfake detection challenge from R


Introduction

Working with video datasets, particularly for detecting AI-generated fake objects, can be very challenging because it requires proper frame selection and face detection. To approach this challenge from R, one can make use of the capabilities offered by OpenCV, magick, and keras.

Our approach consists of the following consecutive steps:

  • read all the videos
  • capture and extract images from the videos
  • detect faces in the extracted images
  • crop the faces
  • build an image classification model with Keras

Let’s quickly introduce the non-deep-learning libraries we’re using. OpenCV is a computer vision library; in this post we use it for face detection.

magick, on the other hand, is an open-source image-processing library that helps to read video datasets and extract useful features from them:

  • Read video files
  • Extract images per second from the video
  • Crop the faces from the images

Before we go into a detailed explanation, readers should know that there is no need to copy-paste code chunks: at the end of the post one can find a link to a Google Colab notebook with GPU acceleration. This kernel allows everyone to run the code and reproduce the same results.

Data exploration

The dataset that we’re going to analyze is provided by AWS, Facebook, Microsoft, the Partnership on AI’s Media Integrity Steering Committee, and various academics.

It contains both real and AI-generated fake videos. The total size is over 470 GB. However, a 4 GB sample dataset is available separately.

The videos in the folders are in mp4 format and have various lengths. Our task is to determine the number of images to capture per second of video. We usually took 1-3 fps for every video.

Note: Set fps to NULL if you wish to extract all frames.

library(magick)
# read the video at 2 frames per second, then keep and resize the first frame
video = magick::image_read_video("aagfhgtpmv.mp4",fps = 2)
vid_1 = video[[1]]
vid_1 = magick::image_read(vid_1) %>% image_resize('1000x1000')

We saw just the first frame. What about the rest of them?

Looking at the gif, one can observe that some fakes are very easy to distinguish, but a small fraction looks pretty realistic. This is another challenge during data preparation.
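
To cover the first two steps of the approach for a whole folder rather than a single file, the extraction can be wrapped in a small function. The following is only a sketch under stated assumptions: the helper names `frame_paths` and `extract_frames`, the output folder `frames`, and the file naming scheme are ours and not part of the original post. The magick calls are the same ones used above, plus `image_write`.

```r
# Build the output path for each extracted frame (pure helper, no I/O).
# The naming scheme "<video>_<index>.jpg" is an assumption for this sketch.
frame_paths <- function(video_file, n_frames, out_dir = "frames") {
  stem <- tools::file_path_sans_ext(basename(video_file))
  file.path(out_dir, sprintf("%s_%03d.jpg", stem, seq_len(n_frames)))
}

# Extract frames from one video at the given fps and write them as JPEGs.
# Requires the magick package; returns the written paths invisibly.
extract_frames <- function(video_file, fps = 2, out_dir = "frames") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  frames <- magick::image_read_video(video_file, fps = fps)
  paths <- frame_paths(video_file, length(frames), out_dir)
  for (i in seq_along(frames)) {
    magick::image_write(frames[i], path = paths[i], format = "jpeg")
  }
  invisible(paths)
}

# Usage (assumes the mp4 files sit in a hypothetical "videos" folder):
# videos <- list.files("videos", pattern = "\\.mp4$", full.names = TRUE)
# all_frames <- lapply(videos, extract_frames, fps = 2)
```

Keeping the path construction separate from the I/O makes it easy to check the naming before any video is decoded.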

Face detection

First, face locations need to be determined via bounding boxes, using OpenCV. Then, magick is used to automatically extract them from all images.

# get face location and calculate bounding box
library(opencv)
unconf <- ocv_read('frame_1.jpg')
faces <- ocv_face(unconf)
facemask <- ocv_facemask(unconf)
df = attr(facemask, 'faces')
rectX = (df$x - df$radius) 
rectY = (df$y - df$radius)
x = (df$x + df$radius) 
y = (df$y + df$radius)

# draw the box with a red dashed line
imh  = image_draw(image_read('frame_1.jpg'))
rect(rectX, rectY, x, y, border = "red", 
     lty = "dashed", lwd = 2)
dev.off()

If face locations are found, then it is very easy to extract them all.

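# crop with a hard-coded geometry: width x height + x offset + y offset
edited = image_crop(imh, "49x49+66+34")
# or build the same kind of geometry string from the detected bounding box
edited = image_crop(imh, paste(x-rectX+1,'x',x-rectX+1,'+',rectX, '+',rectY,sep = ''))
edited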
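
When several frames are processed, it helps to build the geometry strings in one place and to handle frames where no face was found. Below is a minimal sketch, assuming the `faces` data frame has the columns `x`, `y`, and `radius` used above. The helper name `crop_geometry` and the clamping to the image borders are our additions, not part of the original code.

```r
# Turn face centers and radii into magick crop geometries "WxH+X+Y".
# `faces` is a data frame with columns x, y, radius,
# as in attr(facemask, 'faces'). Boxes are clamped to the image borders.
crop_geometry <- function(faces, img_width, img_height) {
  # no detections: return an empty vector so callers can skip the frame
  if (is.null(faces) || nrow(faces) == 0) return(character(0))
  left   <- pmax(faces$x - faces$radius, 0)
  top    <- pmax(faces$y - faces$radius, 0)
  right  <- pmin(faces$x + faces$radius, img_width - 1)
  bottom <- pmin(faces$y + faces$radius, img_height - 1)
  sprintf("%dx%d+%d+%d",
          as.integer(right - left + 1), as.integer(bottom - top + 1),
          as.integer(left), as.integer(top))
}

# Hypothetical face centered at (90, 58) with radius 24 in a 1920x1080 frame
crop_geometry(data.frame(x = 90, y = 58, radius = 24), 1920, 1080)
# → "49x49+66+34"

# Usage with magick (one crop per detected face):
# info <- magick::image_info(img)
# crops <- lapply(crop_geometry(df, info$width, info$height),
#                 function(g) magick::image_crop(img, g))
```

The example values were chosen so that the result matches the hard-coded geometry string shown earlier.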

Deep learning model

After dataset preparation, it’s time to build a deep learning model with Keras. We can quickly place all the images into folders and, using image generators, feed faces to a pre-trained Keras model.
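
Keras image generators that read from a directory infer the classes from its subfolders, so "placing the images into folders" means one subfolder per class. The sketch below shows one way to do that in base R. The function name `sort_into_class_folders`, the column names `file` and `label`, and the class names in the usage example are assumptions for illustration; how the labels are obtained (for instance from metadata shipped with the dataset) is left open.

```r
# Copy face crops into one subfolder per class under `train_dir`.
# `labels` is a data frame with columns `file` (path to a crop) and
# `label` (class name); both column names are assumptions for this sketch.
sort_into_class_folders <- function(labels, train_dir = "fakes_reals") {
  cls <- as.character(labels$label)
  for (d in unique(file.path(train_dir, cls))) {
    dir.create(d, showWarnings = FALSE, recursive = TRUE)
  }
  dest <- file.path(train_dir, cls, basename(labels$file))
  copied <- file.copy(labels$file, dest, overwrite = TRUE)
  # return only the destinations that were actually written
  invisible(dest[copied])
}

# Usage (hypothetical file and class names):
# labels <- data.frame(file  = c("faces/a_001.jpg", "faces/b_001.jpg"),
#                      label = c("fake", "real"))
# sort_into_class_folders(labels, train_dir = "fakes_reals")
```

With `class_mode = "binary"`, the two subfolders become the two classes the model is trained to separate.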

library(keras)

train_dir = 'fakes_reals'
width = 150L
height = 150L
epochs = 10

# augment the training images and rescale pixel values to [0, 1]
train_datagen = image_data_generator(
  rescale = 1/255,
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.2,
  horizontal_flip = TRUE,
  fill_mode = "nearest",
  validation_split=0.2
)


train_generator <- flow_images_from_directory(
  train_dir,                  
  train_datagen,             
  target_size = c(width,height), 
  batch_size = 10,
  class_mode = "binary"
)

# Build the model ---------------------------------------------------------

# VGG16 convolutional base pre-trained on ImageNet, without its classifier
conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(width, height, 3)
)

model <- keras_model_sequential() %>% 
  conv_base %>% 
  layer_flatten() %>% 
  layer_dense(units = 256, activation = "relu") %>% 
  layer_dense(units = 1, activation = "sigmoid")

# note: newer keras versions name the `lr` argument `learning_rate`
model %>% compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(lr = 2e-5),
  metrics = c("accuracy")
)

history <- model %>% fit_generator(
  train_generator,
  steps_per_epoch = ceiling(train_generator$samples/train_generator$batch_size),
  epochs = epochs
)
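
The data generator above sets `validation_split = 0.2`, but the split only takes effect when a `subset` is requested from `flow_images_from_directory`. The following sketch shows one way to create both subsets and to compute the number of steps per epoch. The helper names `steps_for` and `make_generators` are ours, and the defaults simply mirror the values used above. Note that the validation images go through the same augmentation settings here, which a more careful setup might avoid.

```r
# Number of batches needed to see every sample once per epoch.
steps_for <- function(n_samples, batch_size) {
  as.integer(ceiling(n_samples / batch_size))
}

# Create training and validation generators from one directory.
# Requires the keras package; `datagen` must set `validation_split`.
make_generators <- function(train_dir, datagen,
                            target_size = c(150L, 150L), batch_size = 10) {
  make <- function(subset) {
    keras::flow_images_from_directory(
      train_dir, datagen,
      target_size = target_size,
      batch_size = batch_size,
      class_mode = "binary",
      subset = subset
    )
  }
  list(train = make("training"), validation = make("validation"))
}

# Usage (assumes train_dir, train_datagen, model, and epochs are defined):
# gens <- make_generators(train_dir, train_datagen)
# history <- model %>% fit_generator(
#   gens$train,
#   steps_per_epoch = steps_for(gens$train$samples, gens$train$batch_size),
#   epochs = epochs,
#   validation_data = gens$validation,
#   validation_steps = steps_for(gens$validation$samples,
#                                gens$validation$batch_size)
# )
```

Tracking validation accuracy this way gives a first indication of whether the model generalizes beyond the faces it was trained on.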

Reproduce in a Notebook

Conclusion

This post shows how to do video classification from R. The steps were:

  • Read videos and extract images from the dataset
  • Apply OpenCV to detect faces
  • Extract faces via bounding boxes
  • Build a deep learning model

However, readers should know that implementing the following steps may drastically improve model performance:

  • extract all of the frames from the video files
  • load different pre-trained weights, or use different pre-trained models
  • use another technology to detect faces – e.g., “MTCNN face detector”

Feel free to try these options on the Deepfake detection challenge and share your results in the comments section!

Thanks for reading!

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/henry090/Deepfake-from-R, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: “Figure from …”.

Citation

For attribution, please cite this work as

Abdullayev (2020, Aug. 18). RStudio AI Blog: Deepfake detection challenge from R. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2020-08-18-deepfake/

BibTeX citation

@misc{abdullayev2020deepfake,
  author = {Abdullayev, Turgut},
  title = {RStudio AI Blog: Deepfake detection challenge from R},
  url = {https://blogs.rstudio.com/tensorflow/posts/2020-08-18-deepfake/},
  year = {2020}
}
