You will need two sets of training images. Set A is the source or original video face; you will need 500-2000 images. Set B is the desired / actor / fake face, also 500-2000 images. Ideally, the two faces are already quite similar. The easiest swap would have both faces looking straight ahead the entire time. If this is not the case, you will need to provide diverse angles and lighting to teach the model to predict all conditions. Extract image frames from segments of video clips (PotPlayer) to generate images with a variety of angles and facial expressions. Avoid images where the face is covered by hair, glasses, hands, or other obstacles. Remove all images with more than one face. Alternatively, you can edit images with more than one face by deleting the faces that you do not want. Resize all of your images to 1280×720 or smaller. Later, if you run into memory problems during extraction, you can reduce the images to 900×900 or smaller.
Drag the video into PotPlayer. Find the start of the video segment and pause. Press Ctrl+G to open the frame extraction window. Select the destination directory in the area marked “1” in the image. In the area marked “2”, you can choose the naming convention. (Note, however, that the filenames will not be perfectly sequential starting from 0 or 1, regardless of which method you choose. That is why I recommend FFmpeg for the target video conversion, and PotPlayer for quickly extracting training frames from the video locations you want.) In the area marked “3”, you can choose the image type. I recommend JPG at 100% quality, as PNGs cause errors for some users of FakeApp. You can pick the image size in the area marked “4” so that you do not need to resize the images again later. In the area marked “5”, you can select the frame capture rate. The settings shown extract every frame; to extract every tenth frame, enter “10” instead. To extract one frame every 100 ms, choose the other radio button and adjust the ms interval accordingly. You also set the total number of frames to extract here, shown as 1000. Finally, hit the Start button in the area marked “6” and unpause the video. Frames will be extracted automatically according to the parameters you set. When you want to stop, pause the video again. Open the destination folder, and you should see your desired frames.
Usually, you will have a particular video clip for Face A that you would like to replace. Make sure to include images from this clip in your training set. Some people choose to train the model on a variety of general face images before training on the particular clip of choice. For your first attempt, simply extract all of the frames from your target clip and use them as the training data for Face A. Start with a shorter clip, perhaps 10 seconds long. If you installed FFmpeg and updated the PATH, you can extract every single frame of this clip using FFmpeg.
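As a sketch of the FFmpeg step (the file names target_clip.mp4 and the output folder here are hypothetical, and the helper name is my own), you can build and run the extraction command from Python; `-qscale:v 2` produces near-maximum-quality JPEGs, and `%05d` yields sequential filenames:

```python
import subprocess

def ffmpeg_frame_cmd(video, out_dir, quality=2):
    """Build an ffmpeg command that dumps every frame as a JPEG.

    frame_%05d.jpg yields sequential names: frame_00001.jpg, frame_00002.jpg, ...
    """
    return ["ffmpeg", "-i", video,
            "-qscale:v", str(quality),
            f"{out_dir}/frame_%05d.jpg"]

# Hypothetical clip and folder; adjust to your own.
cmd = ffmpeg_frame_cmd("target_clip.mp4", "C:/fakes/data_A")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Unlike PotPlayer's capture tool, this extracts every frame of the clip with strictly sequential numbering, which matters later when reassembling the converted video.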
Drag the files you wish to resize into the area marked “1” in the image, or choose the folder or images with the buttons. Change the output file format to jpg in the area marked “2”. Change the max dimension to “Largest” and “900” in the area marked “3”; you can experiment with these settings as well. In the area marked “4”, you may wish to add a prefix such as “resize” to distinguish your resized images. Enter the destination in the area marked “5”. Make sure that the boxes are unchecked in area “6”, or you will have extra objects in your images. Finally, click the Start button in the area marked “7” to process your images.
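The “Largest” / “900” setting simply scales the longer side of each image down to 900 pixels while preserving the aspect ratio. A minimal sketch of that arithmetic (the function name is my own):

```python
def fit_longest_side(width, height, max_side=900):
    """Scale (width, height) so the longer side is at most max_side,
    preserving aspect ratio. Images already small enough are untouched."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)

print(fit_longest_side(1920, 1080))  # 1080p frame -> (900, 506)
print(fit_longest_side(640, 480))    # already small -> (640, 480)
```

This is also why the earlier advice says to use the largest images your memory can handle: shrinking the longest side to 900 already cuts a 1080p frame to under a quarter of its original pixel count.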
Copy all of your resized images for Face A into the directory C:\fakes\data_A. This directory should already be created if you installed FakeApp. Make sure there are no other images in this directory. NOTE: It is highly recommended that you close all other programs when running FakeApp. Start FakeApp by running fakeapp.bat. Click on the Extract tab.
In the Data path dialog box marked “1” in the figure above, enter exactly “C:/fakes/data_A/” without the quotes. Note that the slashes are forward instead of the usual Windows backslash. Also note that there is a forward slash at the end. Under the area marked “2”, you can specify jpg or png filetypes, depending on what you used for your training data set. You can try png images, but if you have errors, simply use jpg images instead. Make sure all of your images in the training set are in fact the correct file type. Under the area marked “3” you can specify whether there are multiple faces in the images. I recommend that you keep this to “false” and only use training sets with single faces to start out. In the area marked “4” choose GPU for image processing. Press the Start button.
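Since a stray file of the wrong type in a folder declared as jpg can cause errors, you may want to sanity-check every file's extension before pressing Start. A small sketch (the helper name is my own; the path in the example is the one used above):

```python
from pathlib import Path

def mismatched_files(folder, ext="jpg"):
    """Return names of files in `folder` whose extension differs from `ext`.

    .jpeg is treated as equivalent to .jpg; comparison is case-insensitive.
    """
    allowed = {".jpg", ".jpeg"} if ext == "jpg" else {f".{ext}"}
    return sorted(p.name for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() not in allowed)

# Example usage: print(mismatched_files("C:/fakes/data_A"))
```

An empty list means the folder matches the filetype you selected in area “2”.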
A new command window will appear. If you see something like this, you successfully extracted the faces for your training data. The example is for only 2 images; the whole process could take 10-30 minutes depending on how many images you have and the specifications of your computer. The initial RuntimeError on the first line is normal and always shows up. Sometimes, the algorithm will not be able to detect a face. It will indicate this in the command window and move on to the next image. This is not a problem if your training data set is large.
If you see a message like this, your face extraction failed because you had an out-of-memory error. You need to reduce the size of your training images and try again. It is best to use the largest size images that your memory can handle, so do not drastically reduce image sizes. You can reduce the image size to about 900×900, or a little higher or a little lower.
Copy all of your resized images for Face B into the directory C:\fakes\data_B. Repeat all of the instructions above, but this time enter the path to data_B (with forward slashes) in the Extract tab's dialog box.
Manually check the extracted face images and remove problematic ones. There will be a new folder called “aligned” at C:\fakes\data_A\aligned and C:\fakes\data_B\aligned. Remove any obvious face-detection errors, as well as faces that are partially obscured or under extreme lighting conditions.
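After pruning, it is worth confirming that each set still falls in the 500-2000 range recommended at the start. A quick counting sketch (the function name is my own; the example paths are the aligned folders above):

```python
from pathlib import Path

def count_faces(aligned_dir):
    """Count extracted face crops (jpg/jpeg/png) in an aligned/ folder."""
    exts = {".jpg", ".jpeg", ".png"}
    return sum(1 for p in Path(aligned_dir).iterdir()
               if p.is_file() and p.suffix.lower() in exts)

# Example usage:
# print(count_faces("C:/fakes/data_A/aligned"),
#       count_faces("C:/fakes/data_B/aligned"))
```

If pruning drops a set well below 500 faces, extract frames from additional clips before training.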