[GUIDE] - DeepFaceLab 2.0 Guide

TMBDF · Feb 5, 2020

Official repository: / 
Please consider a donation.
 
 
Windows 10 users important notice!
You must change the following setting for DeepFaceLab to work correctly.
 
System – Display – Graphics settings
 

 
============ CHANGELOG ============
 
== 20.10.2021 ==
 
SAEHD, AMP: random scale increased to -0.15+0.15. Improved lr_dropout capability to reach lower value of the loss.
 
SAEHD: changed algorithm for bg_style_power. It can now stitch the face better, at the cost of some src-likeness.
 
Added option ‘Random hue/saturation/light intensity’, applied to the src faceset only, at the input of the neural network. It stabilizes color perturbations during face swapping. It reduces the quality of the color transfer by selecting the closest color in the src faceset, so the src faceset must be diverse enough. Typical fine value is 0.05.
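A minimal sketch of what a random hue/saturation/light shift of this kind could look like, using only the Python standard library. The function name, the per-image uniform shift and the clamping are assumptions made for illustration; this is not the DeepFaceLab implementation.

```python
import colorsys
import random

def random_hsv_shift(pixels, power=0.05, rng=None):
    """Shift hue, saturation and value of all pixels by one random offset each.

    pixels: list of (r, g, b) floats in 0..1.
    power:  maximum absolute shift (0.05 is the typical value quoted above).
    """
    rng = rng or random.Random()
    # One offset per channel for the whole image (assumption).
    dh, ds, dv = (rng.uniform(-power, power) for _ in range(3))
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        h = (h + dh) % 1.0                      # hue wraps around
        s = min(1.0, max(0.0, s + ds))          # saturation is clamped
        v = min(1.0, max(0.0, v + dv))          # light intensity is clamped
        out.append(colorsys.hsv_to_rgb(h, s, v))
    return out
```

With power set to 0 the pixels come back unchanged, which is a quick way to check that the augmentation is really off.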
 
liae archi: when random_warp is off, the inter_AB network is no longer trained, to keep the face more src-like.
 
 
== 09.10.2021 ==
 
SAEHD: added -t archi option. Makes the face more src-like.
                                                        
SAEHD, AMP:
 
removed the implicit function of periodically retraining last 16 “high-loss” samples
 
fixed export to .dfm format to work correctly in DirectX12 DeepFaceLive build.
 
In the sample generator, the random scaling was increased from -0.05+0.05 to -0.125+0.125, which improves the generalization of faces.
 
 
== 06.09.2021 ==
 
Fixed error in model saving.
 
AMP, SAEHD: added option ‘blur out mask’
Blurs nearby area outside of applied face mask of training samples.
The result is the background near the face is smoothed and less noticeable on swapped face.
The exact xseg mask in src and dst faceset is required.
 
AMP, SAEHD: the sample processor count is no longer limited to 8, so if you have an AMD processor with 16+ cores, increase the paging file size.
 
DirectX12 build: update tensorflow-directml to 1.15.5 version.
 
== 12.08.2021 ==
 
XSeg model: improved pretrain option
 
Generic XSeg: added more faces (the faceset is not publicly available) and retrained with pretrain option. The quality is now higher.
 
Updated RTM WF Dataset with the new Generic XSeg mask applied, also added 490 faces with closed eyes.
 
 
== 30.07.2021 ==
 
Export AMP/SAEHD: added "Export quantized" option. (was enabled before)
Makes the exported model faster. If you have problems, disable this option.
 
 
AMP model:
changed help of ct mode:
       Change color distribution of src samples close to dst samples. If src faceset is diverse enough, then lct mode is fine in most cases.
Default inter dims now 1024
return lr_dropout option
last high loss samples behaviour - same as SAEHD
 
XSeg model: added pretrain option.
 
Generic XSeg: retrained with pretrain option. The quality is now higher.
 
Updated RTM WF Dataset with the new Generic XSeg mask applied.
 
== 17.07.2021 ==
 
SAE/AMP: GAN model is reverted to December version, which is better, tested on high-res fakes.
 
AMP:   default morph factor is now 0.5
       Removed eyes_mouth_prio option, enabled permanently.
       Removed masked training, enabled permanently.
 
Added script
6) train AMP SRC-SRC.bat
 
Stable approach to train AMP:
1)  Get fairly diverse src faceset
2)  Set morph factor to 0.5
3)  train AMP SRC-SRC for 500k+ iters (more is better)
4)  delete inter_dst from model files
5)  train as usual
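Step 4 above can be done by hand in the model folder. As a rough sketch, a helper like the following could list and remove the files. The helper name and the rule "file name contains inter_dst" are assumptions; check the dry-run output against your own model folder before deleting anything.

```python
from pathlib import Path

def delete_inter_dst(model_dir, dry_run=True):
    """Find model files whose name contains 'inter_dst' (assumed naming).

    Returns the matching file names. Files are only removed when
    dry_run is False.
    """
    matches = sorted(p for p in Path(model_dir).iterdir()
                     if p.is_file() and "inter_dst" in p.name)
    if not dry_run:
        for p in matches:
            p.unlink()
    return [p.name for p in matches]
```

Run it once with dry_run=True, look at the returned names, and only then run it again with dry_run=False. Keeping a backup of the model folder first is a good idea.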
 

 
== 01.07.2021 ==
 
AMP model:   fixed preview history
            
added ‘Inter dimensions’ option. The model is not changed. Should be equal to or greater than AutoEncoder dimensions.
More dims are better, but require more VRAM. You can fine-tune model size to fit your GPU.
 
Removed pretrain option.
 
Default morph factor is now 0.1
 
How to train AMP:
 
1)  Train as usual src-dst.
2)  Delete inters model files.
3)  Train src-src. This means placing the src aligned faces in data_dst.
4)  Delete inters model files.
5)  Train as usual src-dst.
 
Added scripts
6) export AMP as dfm.bat
6) export SAEHD as dfm.bat
Export model as .dfm format to work in DeepFaceLive.
 
== 02.06.2021 ==
 
AMP model: added ‘morph_factor’ option. [0.1 .. 0.5]
The smaller the value, the more src-like facial expressions will appear. 
The larger the value, the less space there is to train a large dst faceset in the neural network. 
Typical fine value is 0.33
 
 
AMP model: added ‘pretrain’ mode as in SAEHD
 
Default pretrain dataset is updated with applied Generic XSeg mask
 
 
== 30.05.2021 ==
 
Added new experimental model ‘AMP’ (as amplifier, because dst facial expressions are amplified to src)

 
 
It has controllable ‘morph factor’, you can specify the value (0.0 .. 1.0) in the console before merging process.
 
If the shapes of the faces are different, you will get a different jaw line, which requires heavy post-processing.
 
But you can pretrain a celeb on large dst faceset with applied Generic XSeg mask (included in torrent). Then continue train with dst of the fake.
In this case you will get more ‘sewed’ face.

 
 
And merged face looks fine:

 
 
Large dst WF faceset with applied Generic XSeg mask is now included in torrent file.
If your src faceset is diverse and large enough, then ‘lct’ color transfer mode should be used during pretraining.
 
 
XSegEditor: delete button now moves the face to _trash directory and it has been moved to the right border of the window
 
Faceset packer now asks whether to delete the original files
 
Trainer now saves every 25 min instead of 15
 
 
 
== 12.05.2021 ==
 
FacesetResizer now supports changing face type
XSegEditor: added delete button
Improved training sample augmentation for XSeg trainer.
XSeg model has been changed to work better with large amount of various faces, thus you should retrain existing xseg model.
Added Generic XSeg model pretrained on various faces. It is most suitable for src faceset because it contains clean faces, but also can be applied on dst footage without complex face obstructions.
5.XSeg Generic) data_dst whole_face mask - apply.bat
5.XSeg Generic) data_src whole_face mask - apply.bat
 
== 22.04.2021 ==
 
Added new build DeepFaceLab_DirectX12, works on all devices that support DirectX12 in Windows 10:
 
AMD Radeon R5/R7/R9 2xx series or newer
Intel HD Graphics 5xx or newer
NVIDIA GeForce GTX 9xx series GPU or newer
DirectX12 is 20-80% slower on NVIDIA cards compared to the ‘NVIDIA’ build.
 
Improved XSeg sample generator in the training process.
 
 
== 23.03.2021 ==
 
SAEHD: random_flip option is replaced with
 
random_src_flip (default OFF)
 
Random horizontal flip SRC faceset. Covers more angles, but the face may look less natural.
 
random_dst_flip (default ON)
 
Random horizontal flip DST faceset. Makes generalization of src->dst better, if src random flip is not enabled.
 
 
Added faceset resize tool via
 
4.2) data_src util faceset resize.bat
5.2) data_dst util faceset resize.bat
 
Resize faceset to match model resolution to reduce CPU load during training.
Don’t forget to keep original faceset.
 
 
== 04.01.2021 ==
 
SAEHD: GAN is improved. Now produces fewer artifacts and a cleaner preview.
 
All GAN options:
 
GAN power
Forces the neural network to learn small details of the face. 
Enable it only when the face is trained enough with lr_dropout(on) and random_warp(off), and don't disable. 
The higher the value, the higher the chances of artifacts. Typical fine value is 0.1
 
GAN patch size (3-640)
The higher patch size, the higher the quality, the more VRAM is required. 
You can get sharper edges even at the lowest setting. 
Typical fine value is resolution / 8.
 
GAN dimensions (4-64)
The dimensions of the GAN network. 
The higher dimensions, the more VRAM is required. 
You can get sharper edges even at the lowest setting. 
Typical fine value is 16.
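The "resolution / 8" rule for the patch size can be written as a one-line helper. This is only a sketch of the arithmetic: the function name is made up, and clamping to the 3-640 range listed above is an assumption about how out-of-range values should be handled.

```python
def typical_gan_patch_size(resolution):
    """Typical GAN patch size: resolution / 8, kept inside the 3-640 range."""
    return max(3, min(640, resolution // 8))
```

For example, a 256 model gives 32 and a 640 model gives 80.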
 
Comparison of different settings:
 

 
 
== 01.01.2021 ==
 
Build for “2080TI and earlier” now exists again.
 
== 22.12.2020 ==
 
The load time of training data has been reduced significantly.
 
== 20.12.2020 ==
 
SAEHD:
 
lr_dropout now can be used with AdaBelief
 
Eyes priority is replaced with Eyes and mouth priority
Helps to fix eye problems during training like "alien eyes" and wrong eyes direction. 
Also makes the detail of the teeth higher.
 
New default values with new model:
Archi : ‘liae-ud’
AdaBelief : enabled
 
== 18.12.2020 ==
 
Now single build for all video cards.
                                
Upgraded to Tensorflow 2.4.0, CUDA 11.2, CuDNN 8.0.5.
You don’t need to install anything.
 
== 11.12.2020 ==
 
Upgrade to Tensorflow 2.4.0rc4
 
Now support RTX 3000 series.
 
Videocards with Compute Capability 3.0 are no longer supported.
 
CPUs without AVX are no longer supported.
 
SAEHD: added new option
Use AdaBelief optimizer?
Experimental AdaBelief optimizer. It requires more VRAM, but the accuracy of the model is higher, and lr_dropout is not needed.
 
 
== 02.08.2020 ==
 
SAEHD: now random_warp is disabled for pretraining mode by default
Merger: fix load time of xseg if it has no model files
 
== 18.07.2020 ==
 
Fixes
 
SAEHD: write_preview_history now works faster
The frequency at which the preview is saved now depends on the resolution.
For example 64x64 – every 10 iters. 448x448 – every 70 iters.
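One formula that reproduces both examples is 10 iterations per full 64 pixels of resolution. The sketch below assumes that rule. It matches the two numbers given here, but the changelog does not state the formula, so treat it as an illustration only.

```python
def preview_save_interval(resolution):
    """Iterations between saved previews (assumed rule: 10 per 64 px)."""
    return 10 * max(1, resolution // 64)

def should_save_preview(iteration, resolution):
    """True when the preview would be written at this iteration."""
    return iteration % preview_save_interval(resolution) == 0
```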
 
Merger: added option “Number of workers?”
Specify the number of threads to process. 
A low value may affect performance. 
A high value may result in memory error. 
The value may not be greater than CPU cores.
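A small sketch of how the requested worker count could be limited to the number of CPU cores. The function name and the lower bound of 1 are assumptions; os.cpu_count() is the standard-library call for the core count.

```python
import os

def clamp_workers(requested, cpu_cores=None):
    """Limit the worker count to 1..CPU cores."""
    cores = cpu_cores or os.cpu_count() or 1
    return max(1, min(requested, cores))
```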
 
 
== 17.07.2020 ==
 
SAEHD:
 
Pretrain dataset is replaced with high quality FFHQ dataset.
 
Changed help for “Learning rate dropout” option:
When the face is trained enough, you can enable this option to get extra sharpness and reduce subpixel shake in fewer iterations. 
Enable it before “disable random warp” and before GAN. n: disabled. y: enabled.
cpu: enabled on CPU. This avoids using extra VRAM, at the cost of 20% longer iteration time.
 
Changed help for GAN option:
Train the network in Generative Adversarial manner. 
Forces the neural network to learn small details of the face. 
Enable it only when the face is trained enough and don't disable. 
Typical value is 0.1
 
improved GAN. Now it produces better skin detail, less patterned aggressive artifacts, works faster.

 
== 04.07.2020 ==
 
Fix bugs.
Renamed some 5.XSeg) scripts.
Changed help for GAN_power.
 
== 27.06.2020 ==
 
Extractor:
       Extraction now can be continued, but you must specify the same options again.
 
       added ‘Max number of faces from image’ option.
If you extract a src faceset that has frames with a large number of faces, 
it is advisable to set max faces to 3 to speed up extraction.
0 - unlimited
 
added ‘Image size’ option.
The higher image size, the worse face-enhancer works.
Use higher than 512 value only if the source image is sharp enough and the face does not need to be enhanced.
 
added ‘Jpeg quality’ option in range 1-100. The higher jpeg quality the larger the output file size
 
 
Sorter: improved sort by blur and by best faces.
 
== 22.06.2020 ==
 
XSegEditor:
changed hotkey for xseg overlay mask
“overlay xseg mask” now works in polygon mode

 
== 21.06.2020 ==
 
SAEHD:
Resolution for -d archi is now automatically adjusted to be divisible by 32.
‘uniform_yaw’ now always enabled in pretrain mode.
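The resolution adjustment for -d archi could look like the sketch below. Rounding down (rather than to the nearest multiple) is an assumption, and the function name is made up for this example. Only the options after the dash are inspected, because the base name 'df' itself contains the letter d.

```python
def adjust_resolution_for_d(resolution, archi):
    """Round the resolution down to a multiple of 32 when archi has the -d option."""
    _, _, opts = archi.partition("-")   # 'liae-ud' -> opts == 'ud'
    if "d" in opts:
        return (resolution // 32) * 32
    return resolution
```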
 
Subprocessor now writes an error if it does not start.
 
XSegEditor: fixed incorrect count of labeled images.
 
XNViewMP: dark theme is enabled by default

 
== 19.06.2020 ==
 
SAEHD:
 
Maximum resolution is increased to 640.
 
‘hd’ archi is removed. ‘hd’ was experimental archi created to remove subpixel shake, but ‘lr_dropout’ and ‘disable random warping’ do that better.
 
‘uhd’ is renamed to ‘-u’
dfuhd and liaeuhd will be automatically renamed to df-u and liae-u in existing models.
 
Added new experimental archi (key -d) which doubles the resolution using the same computation cost.
This means the same configs will be x2 faster, or for example you can set 448 resolution and it will train as 224.
Strongly recommended not to train from scratch and use pretrained models.
 
New archi naming:
'df' keeps more identity-preserved face.
'liae' can fix overly different face shapes.
'-u' increased likeness of the face.
'-d' (experimental) doubling the resolution using the same computation cost
Opts can be mixed (-ud)
Examples: df, liae, df-d, df-ud, liae-ud, ...
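The naming scheme can be read as "base, then an optional dash and option letters". A minimal parser sketch, assuming only the 'u' and 'd' options described in this entry (later entries add more, which is why the allowed letters are a parameter). The function name is hypothetical.

```python
def parse_archi(archi, allowed_opts="ud"):
    """Split an archi string such as 'liae-ud' into (base, set of options)."""
    base, _, opts = archi.partition("-")
    if base not in ("df", "liae"):
        raise ValueError(f"unknown base archi: {base!r}")
    if len(set(opts)) != len(opts) or any(o not in allowed_opts for o in opts):
        raise ValueError(f"bad archi options: {opts!r}")
    return base, set(opts)
```

For example, parse_archi("liae-ud") returns the base "liae" with options u and d, while an old name such as "dfhd" is rejected.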
 
Not the best example of 448 df-ud trained on 11GB:

 
Improved GAN training (GAN_power option).  It was used for dst model, but actually we don’t need it for dst.
Instead, a second src GAN model with x2 smaller patch size was added, so the overall quality for hi-res models should be higher.
 
Added option ‘Uniform yaw distribution of samples (y/n)’:
       Helps to fix blurry side faces due to small amount of them in the faceset.
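The idea behind a uniform yaw distribution can be sketched as: group the faces into yaw bins, pick a bin at random, then pick a face inside it. Rare side faces then appear much more often than their share of the faceset. The bin count, the yaw range of -90..90 degrees and the function name are assumptions for illustration, not the actual sample generator.

```python
import random
from collections import defaultdict

def sample_uniform_yaw(samples, n, bins=8, rng=None):
    """Draw n sample names so that every non-empty yaw bin is equally likely.

    samples: list of (name, yaw_degrees), yaw assumed in -90..90.
    """
    rng = rng or random.Random()
    by_bin = defaultdict(list)
    for name, yaw in samples:
        idx = int((yaw + 90.0) / 180.0 * bins)
        by_bin[max(0, min(bins - 1, idx))].append(name)
    keys = sorted(by_bin)
    # First choose a bin uniformly, then a face inside that bin.
    return [rng.choice(by_bin[rng.choice(keys)]) for _ in range(n)]
```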
 
Quick96:
       Now based on df-ud archi and 20% faster.
 
XSeg trainer:
       Improved sample generator.
Now it randomly adds the background from other samples.
Result is reduced chance of random mask noise on the area outside the face.
Now you can specify ‘batch_size’ in range 2-16.
 
Reduced size of samples with applied XSeg mask. Thus size of packed samples with applied xseg mask is also reduced.
 
 
== 11.06.2020 ==
 
Trainer: fixed "Choose image for the preview history". Now you can switch between subpreviews using 'space' key.
Fixed "Write preview history". Now it writes all subpreviews in separate folders.
The last preview is also saved as _last.jpg before the first file, so you can easily compare it with the first file in a photo viewer.
 
 
XSegEditor: added text label of total labeled images
Changed frame line design
Changed loading frame design
 

 
== 08.06.2020 ==
 
SAEHD: resolution >= 256 now has second dssim loss function
 
SAEHD: lr_dropout now can be ‘n’, ‘y’, ‘cpu’. ‘n’ and ’y’ are the same as before.
‘cpu’ means enabled on CPU. This avoids using extra VRAM, at the cost of 20% longer iteration time.
fix errors
 
reduced chance of the error "The paging file is too small for this operation to complete."
 
updated XNViewMP to 0.96.2
 
== 04.06.2020 ==
 
Manual extractor: now you can specify the face rectangle manually using ‘R Mouse button’.
It is useful for small, blurry, undetectable faces, animal faces.

Warning:
Landmarks cannot be placed on the face precisely, and they are actually used for positioning the red frame.
Therefore, such frames must be used only with XSeg workflow !
Try to keep the red frame the same as the adjacent frames.
 
added script
10.misc) make CPU only.bat
This script will convert your DeepFaceLab folder to work on CPU without any problems. An internet connection is required.
It is useful to train on Colab and merge interactively on your comp without GPU.
 
== 31.05.2020 ==
 
XSegEditor: added button "view XSeg mask overlay face"
 
== 06.05.2020 ==
 
Some fixes
 
SAEHD: changed UHD archis. You have to retrain uhd models from scratch.
 
== 20.04.2020 ==
 
XSegEditor: fix bug
 
Merger: fix bug
 
== 15.04.2020 ==
 
XSegEditor: added view lock at the center by holding shift in drawing mode.
 
Merger: color transfer “sot-m”: speed optimization for 5-10%
 
Fix minor bug in sample loader
 
== 14.04.2020 ==
 
Merger: optimizations
 
        color transfer ‘sot-m’ : reduced color flickering, but consuming x5 more time to process
 
        added mask mode ‘learned-prd + learned-dst’ – produces largest area of both dst and predicted masks
XSegEditor : polygon is now transparent while editing
 
New example data_dst.mp4 video
 
New official mini tutorial / 
 
== 06.04.2020 ==
 
Fixes for 16+ cpu cores and large facesets.
 
added 5.XSeg) data_dst/data_src mask for XSeg trainer - remove.bat
       removes labeled xseg polygons from the extracted frames
      
 
== 05.04.2020 ==
 
Decreased amount of RAM used by Sample Generator.
 
Fixed bug with input dialog in Windows 10
 
Fixed running XSegEditor when directory path contains spaces
 
SAEHD: ‘Face style power’ and ‘Background style power’  are now available for whole_face
 New help messages for these options.
 
XSegEditor: added button ‘view trained XSeg mask’, so you can see which frames should be masked to improve mask quality.
 
Merger:
added ‘raw-predict’ mode. Outputs raw predicted square image from the neural network.
 
mask-mode ‘learned’ replaced with 3 new modes:
       ‘learned-prd’ – smooth learned mask of the predicted face
       ‘learned-dst’ – smooth learned mask of DST face
       ‘learned-prd*learned-dst’ – smallest area of both (default)
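How the combined modes relate to the two single masks can be shown on tiny masks stored as nested lists of floats in 0..1. Multiplying gives the smallest common area. Adding and clipping, as in the ‘learned-prd + learned-dst’ mode from the 14.04.2020 entry, gives the largest. This is a sketch with a made-up function name; the exact arithmetic used by the Merger is an assumption.

```python
def combine_masks(prd, dst, mode):
    """Combine a predicted-face mask and a dst-face mask of the same size."""
    if mode == "learned-prd":
        return [row[:] for row in prd]
    if mode == "learned-dst":
        return [row[:] for row in dst]
    if mode == "learned-prd*learned-dst":
        # Product: only pixels covered by both masks survive.
        return [[p * d for p, d in zip(rp, rd)] for rp, rd in zip(prd, dst)]
    if mode == "learned-prd+learned-dst":
        # Clipped sum: pixels covered by either mask survive.
        return [[min(1.0, p + d) for p, d in zip(rp, rd)] for rp, rd in zip(prd, dst)]
    raise ValueError(f"unknown mask mode: {mode!r}")
```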
            
 
Added new face type : head
Now you can replace the head.
Example: / 
Requires post-processing skill in Adobe After Effects or DaVinci Resolve.
Usage:
1)  Find suitable dst footage with the monotonous background behind head
2)  Use “extract head” script
3)  Gather rich src headset from only one scene (same color and haircut)
4)  Mask whole head for src and dst using XSeg editor
5)  Train XSeg
6)  Apply trained XSeg mask for src and dst headsets
7)  Train SAEHD using ‘head’ face_type as regular deepfake model with DF archi. You can use pretrained model for head. Minimum recommended resolution for head is 224.
8)  Extract multiple tracks, using Merger:
a.  Raw-rgb
b.  XSeg-prd mask
c.  XSeg-dst mask
9)  Using AAE or DavinciResolve, do:
a.  Hide source head using XSeg-prd mask: content-aware-fill, clone-stamp, background retraction, or other technique
b.  Overlay new head using XSeg-dst mask
 
Warning: Head faceset can be used for whole_face or less types of training only with XSeg masking.
 
 
 
== 30.03.2020 ==
 
New script:
       5.XSeg) data_dst/src mask for XSeg trainer - fetch.bat
Copies faces containing XSeg polygons to aligned_xseg\ dir.
Useful only if you want to collect labeled faces and reuse them in other fakes.
 
Now you can use trained XSeg mask in the SAEHD training process.
This means the default ‘full_face’ mask obtained from landmarks will be replaced with the mask obtained from the trained XSeg model.
use
5.XSeg.optional) trained mask for data_dst/data_src - apply.bat
5.XSeg.optional) trained mask for data_dst/data_src - remove.bat
 
Normally you don’t need it. You can use it, if you want to use ‘face_style’ and ‘bg_style’ with obstructions.
 
XSeg trainer : now you can choose type of face
XSeg trainer : now you can restart training in “override settings”
Merger: XSeg-* modes now can be used with all types of faces.
 
Therefore old MaskEditor, FANSEG models, and FAN-x modes have been removed,
because the new XSeg solution is better, simpler and more convenient, which costs only 1 hour of manual masking for regular deepfake.
 
 
== 27.03.2020 ==
 
XSegEditor: fix bugs, changed layout, added current filename label
 
SAEHD: fixed the use of pretrained liae model, now it produces less face morphing
 
== 25.03.2020 ==
 
SAEHD: added 'dfuhd' and 'liaeuhd' archi
uhd version is lighter than 'HD' but heavier than regular version.
liaeuhd provides more "src-like" result
comparison:
       liae:    /      liaeuhd: / 
 
added new XSegEditor !
 
here new whole_face + XSeg workflow:
 
with XSeg model you can train your own mask segmentator for dst(and/or src) faces
that will be used by the merger for whole_face.
 
Instead of using a pretrained segmentator model (which does not exist),
you control which part of faces should be masked.
 
new scripts:
       5.XSeg) data_dst edit masks.bat
       5.XSeg) data_src edit masks.bat
       5.XSeg) train.bat
 
Usage:
       unpack dst faceset if packed
 
       run 5.XSeg) data_dst edit masks.bat
 
       Read tooltips on the buttons (en/ru/zn languages are supported)
 
       mask the face using include or exclude polygon mode.
      
       repeat for 50/100 faces,
             !!! you don't need to mask every frame of dst
             only frames where the face is different significantly,
             for example:
                    closed eyes
                    changed head direction
                    changed light
             the more various faces you mask, the more quality you will get
 
             Start masking from the upper left area and follow the clockwise direction.
             Keep the same logic of masking for all frames, for example:
                    the same approximated jaw line of the side faces, where the jaw is not visible
                    the same hair line
             Mask the obstructions using exclude polygon mode.
 
       run 5.XSeg) train.bat
             train the model
 
             Check the faces of 'XSeg dst faces' preview.
 
             if some faces have wrong or glitchy mask, then repeat steps:
                    run edit
                    find these glitchy faces and mask them
                    train further or restart training from scratch
 
Restart training of XSeg model is only possible by deleting all 'model\XSeg_*' files.
 
If you want to get the mask of the predicted face (XSeg-prd mode) in merger,
you should repeat the same steps for src faceset.
 
New mask modes available in merger for whole_face:
 
XSeg-prd       - XSeg mask of predicted face  -> faces from src faceset should be labeled
XSeg-dst       - XSeg mask of dst face               -> faces from dst faceset should be labeled
XSeg-prd*XSeg-dst - the smallest area of both
 
if workspace\model folder contains trained XSeg model, then merger will use it,
otherwise you will get transparent mask by using XSeg-* modes.
 
Some screenshots:
XSegEditor: /  : /   : / 
example of the fake using 13 segmented dst faces
          : / 
 
== 18.03.2020 ==
 
Merger: fixed face jitter
 
== 15.03.2020 ==
 
global fixes
 
SAEHD: removed option learn_mask, it is now enabled by default
 
removed liaech archi
 
removed support of extracted(aligned) PNG faces. Use old builds to convert from PNG to JPG.
 
 
== 07.03.2020 ==
 
returned back
3.optional) denoise data_dst images.bat
       Apply it if dst video is very sharp.
 
       Denoise dst images before face extraction.
       This technique helps neural network not to learn the noise.
       The result is less pixel shake of the predicted face.
      
 
SAEHD:
 
added new experimental archi
'liaech' - based on liae, but produces a more src-like face.
 
lr_dropout is now disabled in pretraining mode.
 
Sorter:
 
added sort by "face rect size in source image"
small faces from source image will be placed at the end
 
added sort by "best faces faster"
same as sort by "best faces"
but faces will be sorted by source-rect-area instead of blur.
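Sorting by face rect size amounts to ordering the faces by the area of their rectangle in the source image, largest first. A minimal sketch, assuming each face is described by a file name and a (left, top, right, bottom) rectangle. That data layout and the function name are made up for this example.

```python
def sort_by_rect_size(faces):
    """Sort (filename, (left, top, right, bottom)) pairs, biggest rect first."""
    def area(rect):
        left, top, right, bottom = rect
        return max(0, right - left) * max(0, bottom - top)
    return sorted(faces, key=lambda face: area(face[1]), reverse=True)
```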
 
 
 
== 28.02.2020 ==
 
Extractor:
 
image size for all faces is now 512
 
fix RuntimeWarning during the extraction process
 
SAEHD:
 
max resolution is now 512
 
fix hd architectures. Some decoder weights were not trained before.
 
new optimized training:
for every <batch_size*16> samples,
model collects <batch_size> samples with the highest error and learns them again
therefore hard samples will be trained more often
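The scheme described above can be sketched as a small buffer: it records the loss of every trained sample, and after batch_size*16 samples it hands back the batch_size samples with the highest loss for one extra training step. The class and method names are hypothetical, and the real trainer may organise this differently.

```python
import heapq

class HardSampleBuffer:
    """Collects (loss, sample) pairs and returns the hardest ones periodically."""

    def __init__(self, batch_size, window=16):
        self.batch_size = batch_size
        self.window = window          # number of batches per collection cycle
        self.seen = []

    def add(self, samples, losses):
        """Record one trained batch.

        Returns a list of batch_size hardest samples once batch_size*window
        samples have been seen, otherwise None.
        """
        self.seen.extend(zip(losses, samples))
        if len(self.seen) < self.batch_size * self.window:
            return None
        hardest = heapq.nlargest(self.batch_size, self.seen, key=lambda x: x[0])
        self.seen.clear()
        return [sample for _, sample in hardest]
```

In a training loop, whenever add() returns a list, that list would be trained as one more batch before continuing with fresh samples.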
 
'models_opt_on_gpu' option is now available for multigpus (before only for 1 gpu)
 
fix 'autobackup_hour'
 
== 23.02.2020 ==
 
SAEHD: pretrain option is now available for whole_face type
 
fix sort by abs difference
fix sort by yaw/pitch/best for whole_face's
 
== 21.02.2020 ==
 
Trainer: decreased time of initialization
 
Merger: fixed some color flickering in overlay+rct mode
 
SAEHD:
 
added option Eyes priority (y/n)
 
       Helps to fix eye problems during training like "alien eyes"
       and wrong eyes direction ( especially on HD architectures )
       by forcing the neural network to train eyes with higher priority.
       before/after / 
added experimental face type 'whole_face'
 
       Basic usage instruction: /     
       'whole_face' requires skill in Adobe After Effects.
 
       For using whole_face you have to extract whole_face's by using
       4) data_src extract whole_face
       and
       5) data_dst extract whole_face
       Images will be extracted in 512 resolution, so they can be used for regular full_face's and half_face's.
      
       'whole_face' covers the whole area of the face, including the forehead, in the training square,
       but training mask is still 'full_face'
       therefore it requires manual final masking and composing in Adobe After Effects.
 
added option 'masked_training'
       This option is available only for 'whole_face' type.
       Default is ON.
       Masked training clips training area to full_face mask,
       thus network will train the faces properly. 
       When the face is trained enough, disable this option to train all area of the frame.
       Merge with 'raw-rgb' mode, then use Adobe After Effects to manually mask, tune color, and compose the whole face including the forehead.
 
 
 
== 03.02.2020 ==
 
"Enable autobackup" option is replaced by
"Autobackup every N hour" 0..24 (default 0 disabled), Autobackup model files with preview every N hour
 
Merger:
 
'show alpha mask' now on 'V' button
 
'super resolution mode' is replaced by
'super resolution power' (0..100) which can be modified via 'T' 'G' buttons
 
default erode/blur values are 0.
 
new multiple faces detection log: / 
now uses all available CPU cores ( before max 6 )
so the more processors, the faster the process will be.
 
== 01.02.2020 ==
 
Merger:
 
increased speed
 
improved quality
 
SAEHD: default archi is now 'df'
 
== 30.01.2020 ==
 
removed use_float16 option
 
fix MultiGPU training
 
== 29.01.2020 ==
 
MultiGPU training:
fixed CUDNN_STREAM errors.
speed is significantly increased.
 
Trainer: added key 'b' : creates a backup even if the autobackup is disabled.
 
== 28.01.2020 ==
 
optimized face sample generator, CPU load is significantly reduced
 
fix of update preview for history after disabling the pretrain mode
 
 
SAEHD:
 
added new option
GAN power 0.0 .. 10.0
       Train the network in Generative Adversarial manner.
       Forces the neural network to learn small details of the face.
       You can enable/disable this option at any time,
       but better to enable it when the network is trained enough.
       Typical value is 1.0
       GAN power with pretrain mode will not work.
 
Example of enabling GAN on 81k iters +5k iters
/ 
 
dfhd: default Decoder dimensions are now 48
the preview for 256 res is now correctly displayed
 
fixed model naming/renaming/removing
 
 
Improvements for those involved in post-processing in AfterEffects:
 
Codec is reverted back to x264 in order to properly use in AfterEffects and video players.
 
Merger now always outputs the mask to workspace\data_dst\merged_mask
 
removed raw modes except raw-rgb
raw-rgb mode now outputs selected face mask_mode (before square mask)
 
'export alpha mask' button is replaced by 'show alpha mask'.
You can view the alpha mask without recomputing the frames.
 
8) 'merged *.bat' now also output 'result_mask.' video file.
8) 'merged lossless' now uses x264 lossless codec (before PNG codec)
result_mask video file is always lossless.
 
Thus you can use result_mask video file as mask layer in the AfterEffects.
 
 
== 25.01.2020 ==
 
Upgraded to TF version 1.13.2
 
Removed the wait at first launch for most graphics cards.
 
Increased speed of training by 10-20%, but you have to retrain all models from scratch.
 
SAEHD:
 
added option 'use float16'
       Experimental option. Reduces the model size by half.
       Increases the speed of training.
       Decreases the accuracy of the model.
       The model may collapse or not train.
       Model may not learn the mask in large resolutions.
       You enable/disable this option at any time.
 
true_face_training option is replaced by
"True face power". 0.0000 .. 1.0
Experimental option. Discriminates the result face to be more like the src face. Higher value - stronger discrimination.
Comparison - / 
 
== 23.01.2020 ==
 
SAEHD: fixed clipgrad option
 
== 22.01.2020 == BREAKING CHANGES !!!
 
Getting rid of the weakest link - AMD cards support.
All neural network codebase transferred to pure low-level TensorFlow backend, therefore
removed AMD/Intel cards support, now DFL works only on NVIDIA cards or CPU.
 
old DFL marked as 1.0 still available for download, but it will no longer be supported.
 
global code refactoring, fixes and optimizations
 
Extractor:
 
now you can choose on which GPUs (or CPU) to process
 
improved stability for < 4GB GPUs
 
increased speed of multi gpu initializing
 
now works in one pass (except manual mode)
so you won't lose the processed data if something goes wrong before the old 3rd pass
 
Faceset enhancer:
 
now you can choose on which GPUs (or CPU) to process
 
Trainer:
 
now you can choose on which GPUs (or CPU) to train the model.
Multi-gpu training is now supported.
Select identical cards, otherwise the fast GPU will wait for the slow GPU on every iteration.
 
now remembers the previous option input as default with the current workspace/model/ folder.
 
the number of sample generators now matches the available number of processors
 
saved models now have names instead of GPU indexes.
Therefore you can switch GPUs for every saved model.
Trainer offers to choose latest saved model by default.
You can rename or delete any model using the dialog.
 
models now save the optimizer weights in the model folder to continue training properly
 
removed all models except SAEHD, Quick96
 
trained model files from DFL 1.0 cannot be reused
 
AVATAR model is also removed.
How to create AVATAR like in this video? / 
1) capture yourself with your own speech, repeating the same head direction as the celeb in the target video
2) train regular deepfake model with celeb faces from target video as src, and your face as dst
3) merge celeb face onto your face with raw-predict mode
4) compose masked mouth with target video in AfterEffects
 
 
SAEHD:
 
now has 3 options: Encoder dimensions, Decoder dimensions, Decoder mask dimensions
 
now has 4 archis: dfhd (default), liaehd, df, liae
df and liae are from SAE model, but use features from SAEHD model (such as combined loss and disable random warp)
 
dfhd/liaehd - changed encoder/decoder architectures
 
decoder model is combined with mask decoder model
mask training is combined with face training,
result is reduced time per iteration and decreased vram usage by optimizer
 
"Initialize CA weights" now works faster and integrated to "Initialize models" progress bar
 
removed optimizer_mode option
 
added option 'Place models and optimizer on GPU?'
  When you train on one GPU, by default model and optimizer weights are placed on GPU to accelerate the process.
  You can place them on CPU to free up extra VRAM, thus you can set larger model parameters.
  This option is unavailable in MultiGPU mode.
 
pretraining now does not use rgb channel shuffling
pretraining now can be continued
when pre-training is disabled:
1) iters and loss history are reset to 1
2) in df/dfhd archis, only the inter part of the encoder is reset (before encoder+inter)
   thus the fake will train faster with a pretrained df model
 
Merger ( renamed from Converter ):
 
now you can choose on which GPUs (or CPU) to process
 
new hot key combinations to navigate and override frame's configs
 
super resolution upscaler "RankSRGAN" is replaced by "FaceEnhancer"
 
FAN-x mask mode now works on GPU while merging (before on CPU),
therefore all models (Main face model + FAN-x + FaceEnhancer)
now work on GPU while merging, and work properly even on 2GB GPU.
 
Quick96:
 
now automatically uses pretrained model
 
Sorter:
 
removed all sort by *.bat files except one sort.bat
now you have to choose sort method in the dialog
 
Other:
 
all console dialogs are now more convenient
 
XnViewMP is updated to 0.94.1 version
 
ffmpeg is updated to 4.2.1 version
 
ffmpeg: video codec is changed to x265
 
_internal/vscode.bat starts VSCode IDE where you can view and edit DeepFaceLab source code.
 
removed russian/english manual. Read community manuals and tutorials here
//deep.whitecatchel.ru/literotica/forums/forum-guides-and-tutorials
 
new github page design
 
== 11.01.2020 ==
 
fix freeze on sample loading
 
== 08.01.2020 ==
 
fixes and optimizations in sample generators
 
fixed Quick96 and removed lr_dropout from SAEHD for OpenCL build.
 
CUDA build now works on lower-end GPU with 2GB VRAM:
GTX 880M GTX 870M GTX 860M GTX 780M GTX 770M
GTX 765M GTX 760M GTX 680MX GTX 680M GTX 675MX GTX 670MX
GTX 660M GT 755M GT 750M GT 650M GT 745M GT 645M GT 740M
GT 730M GT 640M GT 735M GT 730M GTX 770 GTX 760 GTX 750 Ti
GTX 750 GTX 690 GTX 680 GTX 670 GTX 660 Ti GTX 660 GTX 650 Ti GTX 650 GT 740
 
== 29.12.2019 ==
 
fix faceset enhancer for faces that contain edited mask
 
fix long load when using various gpus in the same DFL folder
 
fix extract unaligned faces
 
avatar: avatar_type is now only head by default
 
== 28.12.2019 ==
 
FacesetEnhancer now asks to merge aligned_enhanced/ to aligned/
 
fix 0 faces detected in manual extractor
 
Quick96, SAEHD: optimized architecture. You have to restart training.
 
Now there are only two builds: CUDA (based on 9.2) and OpenCL.
 
== 26.12.2019 ==
 
fixed mask editor
 
added FacesetEnhancer
4.2.other) data_src util faceset enhance best GPU.bat
4.2.other) data_src util faceset enhance multi GPU.bat
 
FacesetEnhancer greatly increases details in your source face set,
same as Gigapixel enhancer, but in fully automatic mode.
In OpenCL build works on CPU only.
 
== 23.12.2019 ==
 
Extractor: 2nd pass now faster on frames where faces are not found
 
all models: removed options 'src_scale_mod', and 'sort samples by yaw as target'
If you want, you can manually remove unnecessary angles from src faceset after sort by yaw.
 
Optimized sample generators (CPU workers). Now they consume less RAM and work faster.
 
added
4.2.other) data_src/dst util faceset pack.bat
       Packs /aligned/ samples into one /aligned/samples.pak file.
       After that, all faces will be deleted.
 
4.2.other) data_src/dst util faceset unpack.bat
       unpacks faces from /aligned/samples.pak to /aligned/ dir.
       After that, samples.pak will be deleted.
 
Packed facesets load and work faster.
 
 
== 20.12.2019 ==
 
fix 3rd pass of extractor for some systems
 
More stable and precise version of the face transformation matrix
 
SAEHD: lr_dropout is now an option, disabled by default
When the face is trained enough, you can enable this option to get extra sharpness in fewer iterations
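 
For readers wondering what "learning rate dropout" means in general: on every iteration each weight's update is randomly either applied or skipped. The snippet below is only a minimal sketch of that idea in plain Python on top of plain SGD. The function name, the keep_prob parameter and the use of SGD are assumptions made for illustration; this is not DFL's optimizer code.

```python
import random
from typing import List


def sgd_step_with_lr_dropout(params: List[float],
                             grads: List[float],
                             lr: float,
                             keep_prob: float,
                             rng: random.Random) -> List[float]:
    """One SGD step where each weight keeps its learning rate with
    probability keep_prob and otherwise gets a learning rate of 0.

    Hypothetical helper for illustration, not part of DeepFaceLab.
    """
    updated = []
    for w, g in zip(params, grads):
        keep = rng.random() < keep_prob      # per-weight coin flip
        updated.append(w - lr * g if keep else w)
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    print(sgd_step_with_lr_dropout([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], 0.1, 0.3, rng))
```

With keep_prob set to 1.0 this is an ordinary SGD step, and with 0.0 nothing is updated at all.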
 
 
added
4.2.other) data_src util faceset metadata save.bat
       saves metadata of data_src\aligned\ faces into data_src\aligned\meta.dat
 
4.2.other) data_src util faceset metadata restore.bat
       restores metadata from 'meta.dat' to images
       if the image size differs from the original, it will be resized automatically
 
You can greatly enhance face details of src faceset by using Topaz Gigapixel software.
Example of workflow:
 
1) run 'data_src util faceset metadata save.bat'
2) launch Topaz Gigapixel
3) open 'data_src\aligned\' and select all images
4) set output folder to 'data_src\aligned_topaz' (create folder in save dialog)
5) set settings as on the screenshot
   you can choose 2x, 4x, or 6x upscale rate
6) start processing the images and wait for the process to finish
7) rename folders:
       data_src\aligned        ->  data_src\aligned_original
       data_src\aligned_topaz  ->  data_src\aligned
8) copy 'data_src\aligned_original\meta.dat' to 'data_src\aligned\'
9) run 'data_src util faceset metadata restore.bat'
       images will be downscaled back to original size (256x256) preserving details
       metadata will be restored
10) now your new enhanced faceset is ready to use !
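 
If you prefer to script the folder shuffle from steps 7 and 8 instead of renaming by hand, it could look roughly like the sketch below. The function name is made up for this example, and it assumes the folder names used in the steps above plus a meta.dat already saved in data_src\aligned\.

```python
import shutil
from pathlib import Path


def swap_in_enhanced_faceset(data_src: Path) -> None:
    """Steps 7 and 8: swap the folders and carry meta.dat over.

    Hypothetical helper for illustration, not part of DeepFaceLab.
    """
    aligned = data_src / "aligned"
    topaz = data_src / "aligned_topaz"
    original = data_src / "aligned_original"

    if not (aligned / "meta.dat").is_file():
        raise FileNotFoundError("meta.dat not found, run the metadata save step first")
    if not topaz.is_dir():
        raise FileNotFoundError(f"missing enhanced folder: {topaz}")
    if original.exists():
        raise FileExistsError(f"refusing to overwrite {original}")

    aligned.rename(original)   # step 7: aligned -> aligned_original
    topaz.rename(aligned)      # step 7: aligned_topaz -> aligned
    shutil.copy2(original / "meta.dat", aligned / "meta.dat")  # step 8


if __name__ == "__main__":
    swap_in_enhanced_faceset(Path("workspace") / "data_src")
```

After that, run 'data_src util faceset metadata restore.bat' as in step 9. The checks at the top are there so that a half-finished run never overwrites the original faces.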
 
 
 
 
 
== 15.12.2019 ==
 
SAEHD,Quick96:
improved model generalization, overall accuracy and sharpness
by using the new 'Learning rate dropout' technique from the paper.
[Image: example of a loss histogram where this function is enabled after the red arrow]
 
== 12.12.2019 ==
 
removed FacesetRelighter due to low quality of the result
 
added sort by absdiff
This method sorts by absolute per-pixel difference between all faces.
options:
Sort by similar? ( y/n ?:help skip:y ) :
if you choose 'n', then most dissimilar faces will be placed first.
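 
To give an idea of what sorting by absolute per-pixel difference can look like, here is a small sketch in plain Python. It treats each face as a flat list of pixel values and builds a greedy chain. The greedy chaining and the function names are assumptions made for this example and may differ from how DFL orders faces internally.

```python
from typing import List, Sequence


def abs_diff(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute per-pixel differences between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sort_by_absdiff(images: List[Sequence[int]], similar: bool = True) -> List[int]:
    """Return image indices ordered as a greedy chain starting at image 0.

    similar=True : each next image is the closest to the previous one.
    similar=False: each next image is the farthest from the previous one.
    Hypothetical helper for illustration, not DeepFaceLab's implementation.
    """
    if not images:
        return []
    remaining = list(range(1, len(images)))
    order = [0]
    pick = min if similar else max
    while remaining:
        last = images[order[-1]]
        nxt = pick(remaining, key=lambda i: abs_diff(last, images[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


if __name__ == "__main__":
    faces = [[0, 0], [10, 10], [1, 1], [9, 9]]
    print(sort_by_absdiff(faces, similar=True))
    print(sort_by_absdiff(faces, similar=False))
```

Comparing every face against every remaining face gets slow on large facesets, so a real tool would use vectorized image operations instead of Python loops.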
 
'sort by final' renamed to 'sort by best'
 
OpenCL: fix extractor for some amd cards
 
== 14.11.2019 ==
 
Converter: added new color transfer mode: mix-m
 
== 13.11.2019 ==
 
SAE,SAEHD,Converter:
added sot-m color transfer
 
Converter:
removed seamless2 mode
 
FacesetRelighter:
Added intensity parameter to the manual picker.
'One random direction' and 'predefined 7 directions' use random intensity from 0.3 to 0.6.
 
== 12.11.2019 ==
 
FacesetRelighter fixes and improvements:
 
now you have 3 ways:
1) define light directions manually (not for google colab)
2) relight faceset with one random direction
3) relight faceset with predefined 7 directions
 
== 11.11.2019 ==
 
added FacesetRelighter:
Synthesizes new faces from existing ones by relighting them using the DeepPortraitRelighter network.
With the relighted faces, the neural network will better reproduce face shadows.
 
Therefore you can synthesize shadowed faces from a fully lit faceset.
As a result, better fakes on dark faces.
operate via
data_x add relighted faces.bat
data_x delete relighted faces.bat
 
in OpenCL build Relighter runs on CPU
 
== 09.11.2019 ==
 
extractor: removed "increased speed of S3FD" for compatibility reasons
 
converter:
fixed crashes
removed useless 'ebs' color transfer
changed keys for color degrade
 
added image degrade via denoise - same as denoise extracted data_dst.bat ,
but you can control this option directly in the interactive converter
 
added image degrade via bicubic downscale/upscale
 
SAEHD:
default ae_dims for df is now 256. It is safe to train SAEHD on 256 ae_dims and higher resolution.
 
added Quick96 model.
This is the fastest model for low-end 2GB+ NVidia and 4GB+ AMD cards.
Model has zero options and trains a 96pix fullface.
It is good for quick deepfake demo.
[Image: example of the preview trained in 15 minutes on RTX2080Ti]
 
== 27.10.2019 ==
 
Extractor: fix for AMD cards
 
== 26.10.2019 ==
 
red square of face alignment now contains the arrow that shows the up direction of an image
 
fix alignment of side faces
 
fix message when no training data provided
 
== 23.10.2019 ==
 
enhanced sort by final: now faces are evenly distributed not only in the direction of yaw,
but also in pitch
 
added 'sort by vggface': sorting by face similarity using VGGFace model.
Requires 4GB+ VRAM and internet connection for the first run.
 
 
== 19.10.2019 ==
 
fix extractor bug for 11GB+ cards
 
== 15.10.2019 ==
 
removed fix "fixed bug when the same face could be detected twice"
 
SAE/SAEHD:
removed option 'apply random ct'
 
added option
   Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt, ?:help skip: none )
   Change color distribution of src samples close to dst samples. Try all modes to find the best.
Previously only lct mode was used, but sometimes it does not work properly for some facesets.
 
 
== 14.10.2019 ==
 
fixed bug when the same face could be detected twice
 
Extractor now produces a less shaky face, but the second pass is now 25% slower
before/after: https://imgur.com/L77puLH
 
SAE, SAEHD: 'random flip' and 'learn mask' options now can be overridden.
It is recommended to always train the first 20k iters with 'learn_mask' enabled.
 
SAEHD: added option Enable random warp of samples, default is on
Random warp is required to generalize facial expressions of both faces.
When the face is trained enough, you can disable it to get extra sharpness in fewer iterations.
 
== 10.10.2019 ==
 
fixed wrong NVIDIA GPU detection in extraction and training processes
 
increased speed of S3FD 1st pass extraction for GPU with >= 11GB vram.
 
== 09.10.2019 ==
 
fixed wrong NVIDIA GPU indexes in systems with two or more GPUs
fixed wrong NVIDIA GPU detection on laptops
 
removed TrueFace model.
 
added SAEHD model ( High Definition Styled AutoEncoder )
SAEHD is a new heavyweight model for high-end cards to achieve maximum possible deepfake quality in 2020.
 
Differences from SAE:
+ new encoder produces more stable face and less scale jitter
+ new decoder produces subpixel clear result
+ pixel loss and dssim loss are merged together to achieve both training speed and pixel trueness
+ by default networks will be initialized with CA weights, but only after first successful iteration
  therefore you can test network size and batch size before weights initialization process
+ new neural network optimizer consumes less VRAM than before
+ added option <Enable 'true face' training>
  The result face will be more like src and will get extra sharpness.
  Enable it for last 30k iterations before conversion.
+ encoder and decoder dims are merged to one parameter encoder/decoder dims
+ added mid-full face, which covers 30% more area than half face. 
 
example of the preview trained on RTX2080TI, 128 resolution, 512-21 dims, 8 batch size, 700ms per iteration:
[Images: without trueface / with trueface +23k iters]
 
== 24.09.2019 ==
 
fix TrueFace model, required retraining
 
== 21.09.2019 ==
 
fix avatar model
 
== 19.09.2019 ==
 
SAE : WARNING, RETRAIN IS REQUIRED !
fixed model sizes from previous update.
worked around a bug in the ML framework (keras) that forced the model to train on random noise.
 
Converter: added blur on the same keys as sharpness
 
Added new model 'TrueFace'. Only for NVIDIA cards.
This is a ported GAN model that produces near zero morphing and a highly detailed face.
Model has higher failure rate than other models.
It does not learn the mask, so fan-x mask modes should be used in the converter.
Keep src and dst faceset in same lighting conditions.
 
== 13.09.2019 ==
 
Converter: added new color transfer modes: mkl, mkl-m, idt, idt-m
 
SAE: removed multiscale decoder, because it's not effective
 
== 07.09.2019 ==
 
Extractor: fixed bug with grayscale images.
 
Converter:
 
Session is now saved to the model folder.
 
blur and erode ranges are increased to -400+400
 
hist-match-bw is now replaced with seamless2 mode.
 
Added 'ebs' color transfer mode (works only on Windows).
 
FANSEG model (used in FAN-x mask modes) is retrained with new model configuration
and now produces better precision and less jitter
 
== 30.08.2019 ==
 
interactive converter now saves the session.
if input frames are changed (amount or filenames)
then interactive converter automatically starts a new session.
if the model has been trained further, all frames will be recomputed with their saved configs.
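 
The session rules above can be pictured with a short sketch. Everything here (the Session class, its fields and the function name) is hypothetical and only mirrors the behavior described in this entry, not the converter's real code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    frame_names: List[str]                 # input filenames the session was built for
    model_iter: int                        # model iteration used for the last compute
    frame_configs: Dict[str, dict] = field(default_factory=dict)
    recompute_all: bool = True             # True when every frame must be merged again


def resume_or_restart(saved: Optional[Session],
                      frame_names: List[str],
                      model_iter: int) -> Session:
    """Decide whether a saved session can be reused (illustrative only)."""
    # Changed amount or filenames of input frames -> brand new session.
    if saved is None or saved.frame_names != frame_names:
        return Session(list(frame_names), model_iter)
    # Same frames: keep per-frame configs, recompute only if the model trained further.
    return Session(list(frame_names), model_iter,
                   dict(saved.frame_configs),
                   recompute_all=model_iter > saved.model_iter)


if __name__ == "__main__":
    old = Session(["a.png", "b.png"], 1000, {"a.png": {"blur": 10}}, recompute_all=False)
    print(resume_or_restart(old, ["a.png", "b.png"], 2000))
```

Note that the per-frame configs survive further training of the model, so manual tweaks are not lost when you train some more and merge again.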
 
== 28.08.2019 ==
 
removed landmarks of lips which are used in face aligning
result is less scale jittering
 
converter: fixed merged\ filenames, now they are 100% same as input from data_dst\
 
converted to X.bat : now properly eats any filenames from merged\ dir as input
 
== 27.08.2019 ==
 
fixed converter navigation logic and output filenames in merge folder
 
added EbSynth program. It is located in _internal\EbSynth\ folder
Start it via 10) EbSynth.bat
It starts with sample project loaded from _internal\EbSynth\SampleProject
EbSynth is mainly used to create painted video, but you can also use it to fix some weird frames produced by the deepfake process.
 
== 26.08.2019 ==
 
updated pdf manuals for AVATAR model.
 
Avatar converter: added super resolution option.
 
All converters:
fixes and optimizations
super resolution DCSCN network is now replaced by RankSRGAN
added new option sharpen_mode and sharpen_amount
 
== 25.08.2019 ==
 
Converter: FAN-dst mask mode now works for half face models.
 
AVATAR Model: default avatar_type option on first startup is now HEAD.
Head produces much more stable result than source.
 
updated usage of AVATAR model:
Usage:
1) place data_src.mp4 10-20min square resolution video of news reporter sitting at the table with static background,
   other faces should not appear in frames.
2) process "extract images from video data_src.bat" with FULL fps
3) place data_dst.mp4 square resolution video of face who will control the src face
4) process "extract images from video data_dst FULL FPS.bat"
5) process "data_src mark faces S3FD best GPU.bat"
6) process "data_dst extract unaligned faces S3FD best GPU.bat"
7) train AVATAR.bat stage 1, tune batch size to maximum for your card (32 for 6GB), train to 50k+ iters.
8) train AVATAR.bat stage 2, tune batch size to maximum for your card (4 for 6GB), train to decent sharpness.
9) convert AVATAR.bat
10) converted to mp4.bat
 
== 24.08.2019 ==
 
Added interactive converter.
With interactive converter you can change any parameter of any frame and see the result in real time.
 
Converter: added motion_blur_power param.
Motion blur is applied by precomputed motion vectors.
So the moving face will look more realistic.
 
RecycleGAN model is removed.
 
Added experimental AVATAR model. Minimum required VRAM is 6GB for NVIDIA and 12GB for AMD.
 
 
== 16.08.2019 ==
 
fixed error "Failed to get convolution algorithm" on some systems
fixed error "dll load failed" on some systems
 
model summary is now better formatted
 
Expanded eyebrows line of face masks. It does not affect mask of FAN-x converter mode.
ConverterMasked: added mask gradient of bottom area, same as side gradient
 
== 23.07.2019 ==
 
OpenCL : update versions of internal libraries
 
== 20.06.2019 ==
 
Trainer: added option for all models
Enable autobackup? (y/n ?:help skip:%s) :
Automatically backs up model files with preview every hour, keeping the last 15 hours. The latest backup is located in model/<>_autobackups/01
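 
A rotating backup like this can be sketched as follows: numbered folders are shifted up by one, the oldest is dropped and the fresh copy always lands in 01. The function name and the exact folder handling are assumptions for illustration only, not DFL's own backup code.

```python
import shutil
from pathlib import Path
from typing import Iterable


def rotate_backups(backup_root: Path, model_files: Iterable[Path], keep: int = 15) -> None:
    """Keep at most `keep` numbered backups, newest in '01'.

    Hypothetical helper for illustration, not part of DeepFaceLab.
    """
    backup_root.mkdir(parents=True, exist_ok=True)

    oldest = backup_root / f"{keep:02d}"
    if oldest.exists():
        shutil.rmtree(oldest)                      # drop the oldest backup

    for i in range(keep - 1, 0, -1):               # shift 14 -> 15, ..., 01 -> 02
        src = backup_root / f"{i:02d}"
        if src.exists():
            src.rename(backup_root / f"{i + 1:02d}")

    newest = backup_root / "01"
    newest.mkdir()
    for f in model_files:
        shutil.copy2(f, newest / Path(f).name)     # fresh copy goes into 01
```

Calling this once per hour from the training loop would give the behavior described above, with folder 01 always holding the most recent state.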
 
SAE: added option only for CUDA builds:
Enable gradient clipping? (y/n, ?:help skip:%s) :
Gradient clipping reduces chance of model collapse, sacrificing speed of training.
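 
As a rough picture of what gradient clipping does, the sketch below rescales a gradient vector when its overall length exceeds a limit, so a single bad batch cannot blow up the weights. Clipping by global norm and the max_norm parameter are assumptions chosen for this example. The option in DFL may clip in a different way.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Scale gradients down so their L2 norm does not exceed max_norm.

    Hypothetical helper for illustration, not DeepFaceLab's implementation.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)                 # small enough, leave untouched
    scale = max_norm / norm
    return [g * scale for g in grads]      # same direction, shorter step


if __name__ == "__main__":
    print(clip_by_global_norm([3.0, 4.0], 1.0))
    print(clip_by_global_norm([0.3, 0.4], 1.0))
```

The direction of the update is preserved and only its size is limited, which is why clipping costs some training speed but lowers the chance of collapse.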
 
== 02.06.2019 ==
 
fix error on typing uppercase values
 
== 24.05.2019 ==
 
OpenCL : fix FAN-x converter
 
== 20.05.2019 ==
 
OpenCL : fixed bug when analysing ops was repeated after each save of the model
 
== 10.05.2019 ==
 
fixed work of model pretraining
 
== 08.05.2019 ==
 
SAE: added new option
Apply random color transfer to src faceset? (y/n, ?:help skip:%s) :
Increases variety of src samples by applying LCT color transfer from random dst samples.
It is like 'face_style' learning, but with more precise color transfer and without risk of model collapse.
It also does not require additional GPU resources, but training may take longer because the src faceset becomes more diverse.
 
== 05.05.2019 ==
 
OpenCL: SAE model now works properly
 
== 05.03.2019 ==
 
fixes
 
SAE: additional info in help for options:
 
Use pixel loss - Enabling this option too early increases the chance of model collapse.
Face style power - Enabling this option increases the chance of model collapse.
Background style power - Enabling this option increases the chance of model collapse.
 
 
== 05.01.2019 ==
 
SAE: added option 'Pretrain the model?'
 
Pretrain the model with a large amount of various faces.
This technique may help to train the fake with overly different face shapes and light conditions of src/dst data.
The face will look more morphed. To reduce the morph effect,
some model files will be initialized but not updated after pretrain: LIAE: inter_AB.h5 DF: encoder.h5.
The longer you pretrain the model, the more morphed the face will look. After that, save and run the training again.
 
 
== 04.28.2019 ==
 
fix 3rd pass extractor hang on AMD 8+ core processors
 
Converter: fixed error with degrade color after applying 'lct' color transfer
 
added option at first run for all models: Choose image for the preview history? (y/n skip:n)
Controls: [p] - next, [enter] - confirm.
 
fixed error with option sort by yaw. Remember, do not use sort by yaw if the dst face has hair that covers the jaw.
 
== 04.24.2019 ==
 
SAE: finally the collapses were fixed
 
added option 'Use CA weights? (y/n, ?:help skip: %s ) :'
Initialize network with 'Convolution Aware' weights from the paper.
This may help to achieve a higher accuracy model, but consumes time at first run.
 
== 04.23.2019 ==
 
SAE: training should be restarted
removed option 'Remove gray border' because it makes the model very resource intensive.
 
== 04.21.2019 ==
 
SAE:
fix multiscale decoder.
training with liae archi should be restarted
 
changed help for 'sort by yaw' option:
NN will not learn src face directions that don't match dst face directions. Do not use if the dst face has hair that covers the jaw.
 
 
== 04.20.2019 ==
 
fixed work with NVIDIA cards in TCC mode
 
Converter: improved FAN-x masking mode.
Now it excludes face obstructions such as hair, fingers, glasses, microphones, etc.
Works only for full face models, because there were glitches in the half face version.
 
Fanseg is trained on >3000 various faces with obstructions, manually refined in MaskEditor.
Accuracy of fanseg to handle complex obstructions can be improved by adding more samples to dataset, but I have no time for that :(
Dataset is located in the official mega.nz folder.
If your fake has some complex obstructions that are incorrectly recognized by fanseg,
you can add manually masked samples from your fake to the dataset
and retrain it by using --model DEV_FANSEG argument in bat file. Read more info in dataset archive.
Minimum recommended VRAM is 6GB and batch size 24 to train fanseg.
Result model\FANSeg_256_full_face.h5 should be placed to DeepFacelab\facelib\ folder
 
Google Colab now works on Tesla T4 16GB.
With Google Colaboratory you can train your model for free for 12 hours per session, then reset the session and continue from the last save.
 
== 04.07.2019 ==
 
Extractor: added warning if aligned folder contains files that will be deleted.
 
Converter subprocesses limited to maximum 6
 
== 04.06.2019 ==
 
added experimental mask editor.
It is created to improve FANSeg model, but you can try to use it in fakes.
But remember: it does not guarantee quality improvement.
usage:
run 5.4) data_dst mask editor.bat
edit the mask of dst faces with obstructions
train SAE either with 'learn mask' or with 'style values'
[Images: screenshot of mask editor; training and merging using edited mask]
Such masks are harder to train.
 
SAE:
previous SAE model will not work with this update.
Greatly decreased chance of model collapse.
Increased model accuracy.
Residual blocks now default and this option has been removed.
Improved 'learn mask'.
Added masked preview (switch by space key)
 
Converter:
fixed rct/lct in seamless mode
added mask mode (6) learned*FAN-prd*FAN-dst
 
changed help message for pixel loss:
Pixel loss may help to enhance fine details and stabilize face color. Use it only if quality does not improve over time.
 
fixed ctrl-c exit in no-preview mode
 
== 03.31.2019 ==
 
Converter: fix blur region of seamless.
 
== 03.30.2019 ==
 
fixed seamless face jitter
removed options Suppress seamless jitter, seamless erode mask modifier.
seamlessed face now properly uses blur modifier
added option 'FAN-prd&dst' - uses the product of the FAN prd and dst masks.
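 
"Multiplied" masks simply means an element-wise product: a pixel stays in the final mask only as much as both masks agree on it. Here is a tiny sketch with masks as nested lists of values between 0 and 1. The function name is made up for illustration and this is not the converter's code.

```python
from typing import List

Mask = List[List[float]]


def multiply_masks(mask_a: Mask, mask_b: Mask) -> Mask:
    """Element-wise product of two masks of the same shape (values in 0..1).

    Hypothetical helper for illustration, not part of DeepFaceLab.
    """
    same_shape = len(mask_a) == len(mask_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


if __name__ == "__main__":
    fan_prd = [[1.0, 1.0], [0.5, 0.0]]
    fan_dst = [[1.0, 0.0], [0.5, 1.0]]
    print(multiply_masks(fan_prd, fan_dst))
```

A pixel that is fully masked out in either input ends up at 0, so an obstruction detected in only one of the two masks is still excluded.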
 
== 03.29.2019 ==
 
Converter: refactorings and optimizations
added new option
Apply super resolution? (y/n skip:n) : Enhance details by applying DCSCN network.
 
== 03.26.2019 ==
 
SAE: removed lightweight encoder.
optimizer mode now can be overridden each run
 
Trainer: the loss line now shows the average loss values after saving
 
Converter: fixed bug with copying files without faces.
 
XNViewMP : updated version
 
fixed cut video.bat for paths with spaces
 
== 03.24.2019 ==
 
old SAE model will not work with this update.
 
Fixed bug where SAE could collapse over time.
 
SAE: removed CA weights and encoder/decoder dims.
 
added new options:
 
Encoder dims per channel (21-85 ?:help skip:%d)
More encoder dims help to recognize more facial features, but require more VRAM. You can fine-tune model size to fit your GPU.
 
Decoder dims per channel (11-85 ?:help skip:%d)
More decoder dims help to get better details, but require more VRAM. You can fine-tune model size to fit your GPU.
 
Add residual blocks to decoder? (y/n, ?:help skip:n) :
These blocks help to get better details, but require more computing time.
 
Remove gray border? (y/n, ?:help skip:n) :
Removes gray border of predicted face, but requires more computing resources.
 
 
Extract images from video: added option
Output image format? ( jpg png ?:help skip:png ) :
PNG is lossless, but produces images 10x larger than JPG.
JPG extraction is faster, especially on an HDD rather than an SSD.
 
== 03.21.2019 ==
 
OpenCL build: fixed, now works on most video cards again.
 
old SAE model will not work with this update.
Fixed bug where SAE could collapse over time
 
Added option
Use CA weights? (y/n, ?:help skip: n ) :
Initialize network with 'Convolution Aware' weights.
This may help to achieve a higher accuracy model, but consumes time at first run.
 
Extractor:
removed DLIB extractor
greatly increased accuracy of landmarks extraction, especially with S3FD detector, but the 2nd pass is now slower.
From this point on, it is recommended to use only the S3FD detector.
 
Converter: added new option to choose type of mask for full-face models.
 
Mask mode: (1) learned, (2) dst, (3) FAN-prd, (4) FAN-dst (?) help. Default - 1 :
Learned – Learned mask, if you choose option 'Learn mask' in model. The contours are fairly smooth, but can be wobbly.
Dst – raw mask from dst face, wobbly contours.
FAN-prd – mask from pretrained FAN model from predicted face. Very smooth, not shaky contours.
FAN-dst – mask from pretrained FAN model from dst face. Very smooth, not shaky contours.
Advantage of FAN mask: you get a mask that is not wobbly or shaky without the model having to learn it.
Disadvantage of FAN mask: may produce artifacts on the contours if the face is obstructed.
 
== 03.13.2019 ==
 
SAE: added new option
 
Optimizer mode? ( 1,2,3 ?:help skip:1) :
this option only for NVIDIA cards. Optimizer mode of neural network.
1 - default.
2 - allows you to train x2 bigger network, uses a lot of RAM.
3 - allows you to train x3 bigger network, uses huge amount of RAM and 30% slower.
 
Epoch term renamed to iteration term.
 
added timestamp to the training status line in the console
 
== 03.11.2019 ==
 
CUDA10.1AVX users - update your video drivers from geforce.com site
 
face extractor:
 
added new extractor S3FD - more precise, produces less false-positive faces, accelerated by AMD/IntelHD GPU (while MT is not)
 
speed of 1st pass with DLIB significantly increased
 
decreased amount of false-positive faces for all extractors
 
manual extractor: added 'h' button to hide the help information
 
fix DFL conflict with system python installation
 
removed unwanted tensorflow info from console log
 
updated manual_ru
 
== 03.07.2019 ==
 
fixes
 
upgrade to python 3.6.8
 
Reorganized structure of DFL folder. Removed unnecessary files and other trash.
 
Current available builds now:
 
DeepFaceLabCUDA9.2SSE - for NVIDIA cards up to GTX10x0 series and any 64-bit CPU
DeepFaceLabCUDA10.1AVX - for NVIDIA cards up to RTX and CPU with AVX instructions support
DeepFaceLabOpenCLSSE - for AMD/IntelHD cards and any 64-bit CPU
 
== 03.04.2019 ==
 
added
4.2.other) data_src util recover original filename.bat
5.3.other) data_dst util recover original filename.bat
 
== 03.03.2019 ==
 
Converter: fix seamless
 
== for older changelog see github page ==

TMBDF · Feb 10, 2020

DFL 2.0 Frequently asked questions - workflow tips, methods and techniques.

Use ctrl+f to find what you are looking for, scroll down a bit for dedicated XSeg section of the FAQ.

1. Q: What's the difference between 1.0 and 2.0?

A: 2.0 is an improved and more optimized version. Thanks to the optimization it offers better performance, which means you can train higher resolution models or train existing ones faster. Merging and extraction are also significantly faster.

2. Q: How long does it take to make a deepfake?

A: It depends on how long your target (DST) video is, how large your SRC dataset/faceset is, what kind of model you are using to train your fake, and your hardware (GPU). It might take anywhere from a day for a simple, short full face video with no XSeg masking (assuming you already have an SRC set and a pretrained model) up to a week or more if you are working on a new SRC set, pretraining a new model, using XSeg, etc.

3. Q: Can you make a deepfake video with just a few pictures?

A: No. Use SimSwap for that.

4. Q: What is the ideal faceset size?

A: For the data_src (celebrity) faceset, it's recommended to have at least 5000-8000 different images, but you may end up with a larger set, up to 15-25k, depending on how universal you are trying to make it. Especially for RTM training it's important to have as many different angles, expressions and lighting conditions as possible.

5. Q: Why are my deepfakes turning out blurry?

A: There are many reasons for blurry faces. Please consult the guide, follow the example workflows and learn what all the basic options do; also read up on the SRC dataset creation process in the SRC step.
Most often blurry faces are the result of a poor SRC set that doesn't cover all angles, expressions and lighting conditions, but the issue might also be due to use of Quick96 or a low-res SAEHD model, or to training it incorrectly.

6. Q: Why is my result face not blinking/eyes look wrong/are cross-eyed?

A: This is most likely due to the lack of images in your data_src containing faces with closed eyes or with eyes looking in specific directions on some or all angles.

Make sure you have a decent amount of different facial expressions at all possible angles to match the expressions and angles of faces in the destination/target video. That includes faces with closed eyes and eyes looking in different directions. Without those the model doesn't know what the face's eyes should look like, resulting in eyes not opening or looking all wrong.

Another cause for this might be running training with wrong settings or decreased dims settings.

7. Q: When should I stop training?

A: There is no correct answer, but the general consensus is to use the preview window and observe how loss values are decreasing to judge when to stop training and start merging. There is no exact iteration number or loss value where you should stop training.
As long as you can see the result improving in the preview and the loss is decreasing, you should keep training. Depending on the size of the datasets, how well the SRC set covers the DST set and whether the model is pretrained or not, it might take anywhere from 250,000 to 400,000 iterations or 2-4 days the first time and perhaps a bit less next time. RTM models are trained for weeks, but they can adapt to a new DST in as little as a few hours.

8. Q: When should I enable or disable random warp, GAN, True Face, Style Power, Color Transfer and Learning Rate Dropout?

A: There is no correct answer as to when, as it depends on how well your model is trained and performing at a given time, but there is a correct order of enabling/disabling them. Please refer to the guide for an example workflow and learn what each option does. In most cases these settings have to be enabled depending on your needs, especially True Face and Style Power.

9. Q: DFL isn't working at all (extraction, training, merging) and/or I'm getting an error message.

A: There are many reasons why DFL might not work, from issues with your PC, drivers, Windows or DFL itself down to models or user errors.

If you are trying to run DFL for the first time:

1. Check if your GPU is supported. DFL requires CUDA compute capability of 3.0.

2. Download the newest version of DFL for your GPU, check step 0 - introduction for more info.
3. Enable Hardware Accelerated GPU Scheduling under Windows 10/11.

If you still have issues:

4. If even basic features are still not working, make sure both your GPU drivers and Windows are up to date. For Nvidia use Studio drivers. For both Nvidia and AMD you can try an earlier version of the standard gaming drivers.
5. Double-check that Hardware Accelerated GPU Scheduling is enabled under Windows 10/11 (see step 3).

If DFL works but you get errors when trying to train the model:

6. Check if the model you are trying to run is still compatible. The easiest way is to try running a new model with the same parameters (adjust batch size to a low value like 2-4 for testing purposes). If it runs, then your PC and DFL are fine but the model is not working. If it doesn't work and you get an error, READ IT. If it's some kind of memory error (OOM, out of memory), then the model is most likely too heavy for your GPU; in that case check what models others are able to run and adjust your model settings like batch size, resolution, dims and architecture.

DFL model settings and performance: //deep.whitecatchel.ru/literotica/forums/thread-sharing-deepfacelab-2-0-model-settings-and-performance-sharing

If it's still not working after that, you can create a new thread in the questions section, but before you do, check the issues tab on github to see if other users are experiencing the same issue.

If you can't find anything and you've searched the forum for similar issues, make a new thread here: //deep.whitecatchel.ru/literotica/forums/forum-questions or post about it here.

10. Q: I'm getting OOM/out of memory errors while training SAEHD.

A: If you are getting OOM errors it means you are running out of VRAM, so adjust your model settings: lower the batch size and dims, use a less demanding architecture, run LRD and optimizer on CPU, and don't use adabelief. All of those affect VRAM usage, as do settings like GAN and Style Power.

11. Q: I've turned off all additional features, lowered dims, batch size and the training is still giving me OOM errors even at low batch size.

A: Buy a better GPU, or check whether it's actually OOM or another memory-related error; perhaps your page file size is too small or the model is corrupted/incompatible.

12. Q: I have too many similar faces in my source dataset, is there a tool I can use to remove them?

A: Yes, you can either use DFL's built-in sorting methods or use software like VisiPics or DupeGuru to detect similar looking faces in your source dataset and remove them, or move them to a different folder in case you want to add some faces back. Similarity detection software like these two tends to remove too many faces at higher difference values, including faces that differ in eye direction or mouth openness. This can lead to sets that miss some expressions or eye directions, so you should go through the removed duplicates and bring some faces back to the main set.

13. Q: I was training my model that already had several thousand iterations but the faces in the preview window suddenly turned black/white/look weird, my loss values went up/are at zero.

A: Your model has collapsed. It means you cannot use it anymore and you have to start all over or, if you had backups, use them.
To prevent model collapse use gradient clipping, and enable backups so you can recover. Usually models don't collapse unless you use style power (in that case enabling gradient clipping is highly recommended), but if you're afraid of it collapsing anyway even without any other features enabled, you can leave gradient clipping enabled all the time at a small performance cost (might be hard to notice, about 50-100ms extra at most according to my tests).

14. Q: Can I reuse my models? Should the same model be always reused for every project or should I always start with fresh base model (pretrained, trained) when doing new SRC and reuse such model only when changing DST? Can models be reused if I'm doing a new SRC but the same DST?

A: Yes, it is actually recommended to reuse your models if you plan on making more fakes of the same SRC with new DST, but you can also reuse a model when working with a different SRC and the same or different DST as long as they look similar. In most cases, even when doing the same SRC and a new DST, you will have to re-enable RW for the model to train correctly and get the best results (applies to all architectures except for RTM (ready to merge) style LIAE-UD/-UDT models). It's best to create a base model (pretrained or trained on random SRC and DST), use it as a starting point for every new SRC you do, and then reuse the resulting models only with the same SRC but different DST.

15. Q: Should I pretrain my models?

A: As with reusing, yes, you should pretrain.

Use the built-in pretrain function inside DFL, which you can select when starting up a model. It is the correct way to pretrain your model. Run it for anywhere from 200k to 400k iterations and turn it off once you want to finish pretraining.

16. Q: I'm getting an error: is not a dfl image file required for training in DeepFaceLab

A: It means that the pictures inside data_src/aligned and/or data_dst are not valid for training in DFL.

This can be caused by several things:

1. You are using one of the shared datasets of a celebrity. Chances are they were made in different software than DFL or in an older version of it. Even though they look like aligned faces (256x256 images), they may be just pictures extracted in a different app that stored landmarks/alignment data in a different way. To fix them all you need to do is run the alignment process on them: place them into the "data_src" folder (not the "aligned" folder inside it) and align them again by using 4) data_src extract faces S3FD

2. You edited faces/images inside aligned folder of data_src or data_dst in gimp/photoshop after aligning.

When you edit those images you overwrite landmarks/alignments data that is stored inside them.

If you want to edit these images first run 4.2) data_src util faceset metadata save to save alignment info in a separate file, then edit your images and run 4.2) data_src util faceset metadata restore to restore that data.

Only edits allowed are AI up-scaling/enhancing (which you can now also do using 4.2) data_src util faceset enhance instead of using external apps like Gigapixel), color correction or edits to the face that don't change its shape (like removing or adding stuff). No flipping/mirroring or rotation is allowed.

3. You have regular, non extracted/aligned images in your "data_src/dst" or "aligned" folder.

4. You have _debug faces in your "data_src/aligned" folder. Delete them.

17. Q: I'm getting errors during conversion: no faces found for XYZ.jpg/png, copying without faces.

A: It means that for XYZ frame in "data_dst" folder no faces were extracted into "aligned" folder.

This may be because there were actually no faces visible in that frame (which is normal) or they were visible but due to an angle at which they were or obstruction they were not detected.

To fix that you need to extract those faces manually. Check the main guide, especially the section on cleaning up your data_dst dataset.

Overall you should make sure that you have as many faces aligned and properly extracted BEFORE starting to train.

And remember that both datasets should be cleaned up before training, to know more check the first post (guide) and also read this thread about preparing source datasets for use in training and for sharing on our forum: //deep.whitecatchel.ru/literotica/forums/thre...set-creation-how-to-create-celebrity-facesets

18. Q: I'm getting errors: Warning: several faces detected. Highly recommended to treat them separately and Warning: several faces detected. Directional Blur will not be used. during conversion

A: It's caused by multiple faces within your data_dst/aligned folder.

The extraction process attempts to detect a face in each frame at all costs. If it detects multiple faces, or one real face plus something else falsely detected as a face, it creates multiple files for that frame that look like this: 0001_0.jpg 0001_1.jpg 0001_2.jpg (in case of detecting 3 faces).
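
A quick way to find the affected frames is to group the aligned files by frame name. The sketch below is not part of DFL; it only assumes the naming shown above (frame name, underscore, face index), and the function name find_multi_face_frames is made up for this example.

```python
from collections import defaultdict
from pathlib import Path

def find_multi_face_frames(filenames):
    """Return {frame_name: [files]} for frames with more than one aligned face."""
    by_frame = defaultdict(list)
    for name in filenames:
        stem = Path(name).stem                    # "0001_1.jpg" -> "0001_1"
        frame, sep, index = stem.rpartition("_")  # split off the face index
        if sep and index.isdigit():               # skip files that do not match the pattern
            by_frame[frame].append(name)
    return {f: sorted(n) for f, n in by_frame.items() if len(n) > 1}

example = ["0001_0.jpg", "0001_1.jpg", "0001_2.jpg", "0002_0.jpg"]
extra = find_multi_face_frames(example)
print(extra)
```

To run it on a real folder you could pass something like [p.name for p in Path("workspace/data_dst/aligned").glob("*.jpg")] (path assumed, adjust it to your own workspace) and then review the listed files by hand.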

19. Q: After merging I see original/DST faces on some or all merged frames.

A: Make sure your converter mode is set to overlay or any other mode except for "original" and make sure you've aligned faces from all frames of your data_dst.mp4 file.

If you only see original faces on some frames, it's because they were not detected/aligned from those corresponding frames, it may happen due to various reasons: extreme angle where it's hard to see the face, blur/motion blur, obstructions, etc. Overall you want to always have all faces from your data_dst.mp4 aligned.
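
To see which frames have no aligned face at all before you merge, you can compare the frame names with the aligned file names. This is only a sketch under the assumption that aligned files are named after their frame plus an underscore and a face index (as in 0001_0.jpg); the function name frames_without_faces is hypothetical and not a DFL tool.

```python
from pathlib import Path

def frames_without_faces(frame_names, aligned_names):
    """Return frame files that have no matching file in the aligned folder."""
    covered = set()
    for name in aligned_names:
        frame, sep, index = Path(name).stem.rpartition("_")
        if sep and index.isdigit():
            covered.add(frame)        # this frame has at least one aligned face
    return sorted(n for n in frame_names if Path(n).stem not in covered)

frames = ["0001.png", "0002.png", "0003.png"]
aligned = ["0001_0.jpg", "0003_0.jpg"]
missing = frames_without_faces(frames, aligned)
print(missing)
```

Frames listed here either really contain no face or need manual extraction.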

20. Q: What do those 0.2513 0.5612 numbers mean when training?

A: These are loss values. They indicate how well model is trained.

But you shouldn't focus on them unless you see sudden spikes in their value (up or down) after they have already settled around some value (assuming you didn't change any model parameters). Instead focus on the preview windows and look for details like teeth separation, beauty marks, nose and eyes. If they are sharp and look good, then you don't have to worry about anything. If you notice loss values going up for some reason despite not changing any settings, consider stopping training and resuming it with gradient clipping enabled, or disabling some additional options that you might have enabled at the wrong setting and which are now causing issues.
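
If you keep a list of loss values (for example copied from the console), a sudden jump can be found automatically. The sketch below is one simple way to do it, not something DFL provides: it compares each value with the average of the few values before it. The window size and the factor of 1.5 are arbitrary assumptions, tune them to your own model.

```python
def find_loss_spikes(losses, window=5, factor=1.5):
    """Return indices where the loss jumps far above or below the recent average."""
    spikes = []
    for i in range(window, len(losses)):
        baseline = sum(losses[i - window:i]) / window   # mean of the previous values
        if baseline <= 0:
            continue
        if losses[i] > factor * baseline or losses[i] < baseline / factor:
            spikes.append(i)
    return spikes

history = [0.25, 0.25, 0.24, 0.25, 0.24, 0.25, 0.60, 0.25]
spikes = find_loss_spikes(history)
print(spikes)
```

A flagged value is only a hint to go and look at the previews, it does not by itself mean the model has collapsed.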

21. Q: What are the ideal loss values, how low/high loss values should be?

A: It all depends on the settings, datasets and various different factors.

Generally you want to start training with all features disabled except for random warp of samples (and optionally gradient clipping to prevent model collapses and random flip in case your source dataset is lacking some face/head angles) to a loss under 0.4-0.5 (depending on the model architecture and whether it's a full face model that runs with masked training disabled or whole face/head model that runs with masked training enabled as well as models resolution or model dimensions).

After you disable random warp model should be able to reach loss values between 0.15 and 0.25.

In some cases your model might get stuck at certain loss value or never reach lower one.

22. Q: My model has collapsed, can I somehow recover it?

A: No, you need to start over, or use backup if you made them.

23. Q: What to do if you trained with a celebrity faceset and you want to add more faces/images/frames to it? How to add more variety to existing src/source/celebrity dataset?

A: The safest way is to rename the entire "data_src" folder to anything else or to temporarily move it somewhere else. Then extract frames from the new data_src.mp4 file or, if you already have the frames extracted and some pictures ready, create a new "data_src" folder and copy them inside it. Run the data_src extraction/aligning process, then copy the aligned images from the old data_src/aligned folder into the new one. When Windows asks whether to replace or skip, select the option to rename files so you keep all of them and don't end up overwriting any.
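
If you prefer to do the final copy step with a script instead of the Windows dialog, the sketch below shows the same "rename instead of replace" idea. The function name copy_without_overwrite and the _old suffix are assumptions for this example, and the folders in the demo are temporary ones created just to show the behavior. Files are copied byte for byte, so data stored inside the images is not touched.

```python
import shutil
import tempfile
from pathlib import Path

def copy_without_overwrite(old_dir, new_dir):
    """Copy every file from old_dir into new_dir, renaming on name clashes."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    copied = []
    for src in sorted(old_dir.iterdir()):
        if not src.is_file():
            continue
        dst = new_dir / src.name
        counter = 1
        while dst.exists():               # name already taken: add a suffix
            dst = new_dir / f"{src.stem}_old{counter}{src.suffix}"
            counter += 1
        shutil.copy2(src, dst)            # plain file copy, contents unchanged
        copied.append(dst.name)
    return copied

# Demo with two temporary folders that both contain "00001_0.jpg".
with tempfile.TemporaryDirectory() as tmp:
    old, new = Path(tmp, "old"), Path(tmp, "new")
    old.mkdir()
    new.mkdir()
    (old / "00001_0.jpg").write_bytes(b"old face")
    (new / "00001_0.jpg").write_bytes(b"new face")
    result = copy_without_overwrite(old, new)
    kept = sorted(p.name for p in new.iterdir())
print(result, kept)
```

On a real workspace you would call it with the old and the new data_src/aligned folders, after making a backup.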

24. Q: Does the dst faceset/data_dst.mp4 also need to be sharp and high quality? Can some faces in dst faceset/dataset/data_dst be a bit blurry/have shadows, etc? What to do with blurry faces in my data_dst/aligned folder

A: You want your data_dst to be as sharp and free of any motion blur as possible. Blurry faces in data_dst can cause a couple issues:

- first is that some of the faces in certain frames will not get detected - this will cause original faces to be shown on these frames when converting/merging because they couldn't be properly aligned during extraction so you will have to extract them manually.

- second is that others may be incorrectly aligned - this will cause final faces on these frames to be rotated/blurry and just look all wrong and, similar to other blurry faces, they will have to be manually aligned to be used in training and conversion.

- third - even with manual aligning in some cases it may not be possible to correctly detect/align faces which again - will cause original faces to be visible on corresponding frames.

- faces that contain motion blur or are blurry (not sharp) that are correctly aligned may still produce bad results because the models that are used in training cannot understand motion blur, certain parts of the face like mouth when blurred out may appear bigger/wider or just different and the model will interpret this as a change of the shape/look of that part and thus both the predicted and the final faked face will look unnatural.

You should remove those blurry faces from training dataset (data_dst/aligned folder) and put them aside somewhere else and then copy them back into data_dst/aligned folder before converting so that we get the swapped face to show up on frames corresponding to those blurry faces.

To combat the odd look of faces in motion you can use motion blur within the merger (but note it will only work if one set of faces is in the "data_dst/aligned" folder and all files end with the _0 suffix).

You want both your SRC datasets and DST datasets to be as sharp and high quality as possible.

Small amount of blurriness on some frames shouldn't cause many issues. As for shadows, this depends on how much shadow we are talking about. Small, light shadows will probably not be visible and you can get good results with shadows on faces, but too much will also look bad. You want your faces to be lit as evenly as possible with as few harsh/sharp and dark shadows as possible.

25. Q: I'm getting error reference_file not found when I try to convert my deepfake back into mp4 with 8) converted to mp4.

A: You are missing the data_dst.mp4 file in your "workspace" folder, check if it wasn't deleted.

Reason why you need it is that even though you separated it into individual frames with 3) extract images from video data_dst FULL FPS all there is inside "data_dst" folder is just frames of the video, you also need sound, which is taken from the original data_dst.mp4 file.

26. Q: I accidentally deleted my data_dst.mp4 file and cannot recover it, can I still turn merged/converted frames into an mp4 video?

A: Yes, in case you've permanently deleted data_dst.mp4 and you have no way of recovering it or rendering an identical file, you can still convert the frames back into mp4 (albeit without sound) manually by using ffmpeg and a proper command.

Check the ffmpeg documentation to learn about ffmpeg and how to prepare the correct command line to run. Alternatively you can import all frames into video editing software and render them back into a video.
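
As a starting point, here is a sketch of what such a command can look like, built as a Python argument list. The folder path, the 5-digit frame numbering (%05d), the .png extension and the frame rate of 30 are all assumptions, check your own merged folder and the frame rate of your original video before using it. The ffmpeg options themselves (-framerate, -i, -c:v libx264, -pix_fmt yuv420p) are standard ones.

```python
def build_ffmpeg_command(frames_pattern, fps, output_path):
    """Return an ffmpeg argument list that encodes numbered frames into an mp4."""
    return [
        "ffmpeg",
        "-framerate", str(fps),     # input frame rate of the image sequence
        "-i", frames_pattern,       # numbered frames, matched by the %05d pattern
        "-c:v", "libx264",          # H.264 video encoder
        "-pix_fmt", "yuv420p",      # pixel format most players can decode
        output_path,
    ]

# Hypothetical paths and frame rate, adjust them to your own workspace.
cmd = build_ffmpeg_command("workspace/data_dst/merged/%05d.png", 30, "workspace/result.mp4")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The result has no audio track, as mentioned above, because the sound normally comes from the original data_dst.mp4.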

27. Q: Can you pause merging and resume it later? Can you save merger settings? My merging failed/I got error during merging and it's stuck at %, can I start it again and merge from last successfully merged frame?

A: Yes, by default interactive converter/merger creates session file in the "model" folder that saves both progress and settings.

If you just want to pause merging you can hit > and it will pause. If however you need to turn it off completely/restart your PC, etc., exit the merger with esc and wait for it to save your progress. Next time you launch merging, after selecting interactive merger/converter (Y/N) - Y, you'll get a prompt asking if you want to use the save/session file and resume the progress, and the merger will load with the right settings at the right frame.

If your merging failed and it didn't save the progress you will have to resume it manually. First back up your "data_dst" folder, then delete all extracted frames inside data_dst, as well as all images from the "aligned" folder inside "data_dst", that correspond to frames already converted/merged inside the "merged" folder. Then start the merger/converter, enter the settings you used before and convert the rest of the frames, then combine the new merged frames with the old ones from the backup "data_dst" folder and convert to .mp4 as usual.

28. Q: Faces in preview during training look good but after converting them they look bad. I see parts of the original face (chin, eyebrows, double face outline).

A: Faces in preview are the raw output of the AI that then need to be composited over the original footage.
Because of it, when faces have different shapes, or are slightly smaller/bigger you may see parts of the original face around/outside the mask that DFL merger creates.

To fix it you need to change conversion settings, start by:

- adjusting the mask type

- adjust mask erosion (size) and blur (feathering, smoothing the edge)

- adjust face size (scale)

NOTE: Negative erosion increases the mask size (covers more), positive decreases it.
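
To make the sign convention from the note concrete, here is a tiny pure-Python illustration on a 5x5 mask. It is not how the DFL merger is implemented (the merger works on real images with its own routines); the function name adjust_mask and the simple 3x3 neighbourhood are assumptions for the example only.

```python
def adjust_mask(mask, erosion):
    """Shrink (erosion > 0) or grow (erosion < 0) a binary mask by |erosion| pixels."""
    h, w = len(mask), len(mask[0])
    shrink = erosion > 0
    for _ in range(abs(erosion)):
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # 3x3 neighbourhood, pixels outside the image count as 0
                neigh = [
                    mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
                    for ny in (y - 1, y, y + 1)
                    for nx in (x - 1, x, x + 1)
                ]
                out[y][x] = min(neigh) if shrink else max(neigh)
        mask = out
    return mask

def area(mask):
    """Number of pixels covered by the mask."""
    return sum(map(sum, mask))

# A 3x3 square of ones in the middle of a 5x5 image.
square = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
smaller = adjust_mask(square, 1)    # positive erosion: mask shrinks
larger = adjust_mask(square, -1)    # negative erosion: mask grows
print(area(square), area(smaller), area(larger))
```

Blur then softens whatever edge is left after this step, which is why the two settings are usually adjusted together.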

29. Q: Final result/deepfake has weird artifacts, face changes colors, color bleed from background and make it flicker/darken/change color in the corners/on the edges when using Seamless mode.

A: You are using seamless/hist/seamless+hist overlay mode or you trained your model with source dataset/faceset with varying lighting conditions and didn't use any color transfer during training.

- use overlay or any other mode besides seamless/hist/seamless+hist

- if you want to use seamless:

- decrease size of the mask/face so it doesn't "touch" areas outside and doesn't as a result get the color of background/area outside of the face/head by increasing "Erode Mask" value.

- or smooth out the edge of the mask/face by increasing "Blur Mask" value which may hide some of the color changes, also helps make the face seem more... "seamless" when you decrease mask size.

Both of these may or may not fix the issue. If it still persists, use simple overlay mode as stated above.

If your source dataset contained images of faces with varying lighting conditions and you didn't use color transfer, you may need to go back and keep training some more with color transfer enabled.

In case turning it on severely washes out colors or affects colors of training data/faces in a bad way (washed out colors, wrong colors, over saturated colors, noise) or makes the learned face blurry (due to too much variations that the model must learn all over as if there were new faces in your source and destination dataset) you may want to save landmarks data and edit your source dataset colors to better match your destination dataset and also have less variation.

I recommend to NOT use seamless unless it's absolutely needed and even then I recommend stopping on every major angle and camera shift/light change to see if it doesn't cause those artifacts.

30. Q: What's the difference between half face, mid-half face, full face and whole face face_type modes?

A: Whole face is a new mode that covers the entire face/head, which means it also covers the entire forehead and even some hair and other features that could be cut off by the full face mode and would definitely never be visible when using mid-half or half face mode. It also comes with a new training option called masked_training that lets you train the forehead. First you start with it enabled and it clips the training mask to the full face area. Once the face is trained sufficiently you disable it and it trains the whole face/head. This mode requires either manual masking in post or training your own XSeg model:

//deep.whitecatchel.ru/literotica/forums/thre...g-model-training-and-faceset-masking-tutorial

Full face is a recommended face_type mode to get as much coverage of face as possible without anything that's not needed (hairline, forehead and other parts of the head)

Half face mode was a default face_type mode in H64 and H128 models. It covers only half of the face (from mouth to a bit below eyebrows)

Mid-half face is a mode that covers around 30% larger area than half face.

31. Q: What is the best GPU for deepfakes? I want to upgrade my gpu, which one should I get?

A: Answer to this will change as deepfaking software gets further developed and GPUs become more powerful but for now the best GPU is the one that has most VRAM and is generally fast.

For performance figures check our SAE spreadsheet: //deep.whitecatchel.ru/literotica/forums/thread-dfl-2-0-user-model-settings-spreadsheet

Bear in mind that training performance depends on settings used during training, a fully enabled (all features on) 128 DF model may run slower than a 192 DFHD model with turned down dims and all features disabled.

32. Q: What do the AutoEncoder, Encoder, Decoder and D_Mask_Decoder dims settings do? What does changing them do?

A: AutoEncoder, Encoder, Decoder and D_Mask_Decoder dims affect models neural network dimensions.

They can be changed to either increase performance or quality. Setting them too high will make models really hard to train (slow, high VRAM usage) but will give more accurate results and a more src-like looking face. Set them too low and performance will increase, but the results will be less accurate and the model may not learn certain features of the faces, resulting in generic output that looks more like dst or nothing like either dst or src.

AutoEncoder dimensions ( 32-1024 ?:help ) : this is the overall model capacity to learn.

Too low value and it won't be able to learn everything - higher value will make model be able to learn more expressions and be more accurate at the cost of performance.

Encoder dimensions ( 16-256 ?:help ) : this affects the ability of the model to learn different expressions, states of the face, angles, lighting conditions.

Too low value and model may not be able to learn certain expressions: the model might not be closing eyes or mouth and some angles may be less detailed/accurate. Higher value will lead to a more accurate and expressive model, assuming AE dims are increased accordingly, at the cost of performance.

Decoder dimensions ( 16-256 ?:help ) : this affects the ability of the model to learn fine detail, textures, teeth, eyes - small things that make face detailed and recognizable.

Too low value will cause some details to not be learned (such as teeth and eyes looking blurry, lack of texture), also some subtle expressions and facial features/texture may not be learned properly, resulting in less src like looking face, higher value will make the face more detailed and model will be able to pick up more of those subtle details at the cost of performance.

Decoder mask dimensions ( 16-256 ?:help ) : affects quality of the learned mask when training with Learn mask enabled. Does not affect the quality of training.

33. Q: Whats the recommended batch size? How high should I set the batch size? How low can batch size be set?

A: There is no recommended batch size but a reasonable value is between 8-12, with values of 16-22 and above being exceptionally good and 4-6 being a minimum.

Batch size of 2 is not enough to correctly train a model so value of 4 is the recommended minimum, the higher the value the better but at some point higher batch size may not be beneficial, especially if your iteration time starts to increase or you have to disable models_opt_on_gpu - and thus forcing optimizer on CPU which slows down training/increases iteration time.

You can calculate when increasing batch size is becoming less efficient by dividing iteration time by the batch size. Choose the batch size that gives you the lowest ms value per sample, for example:

1000 ms at batch 8 - 1000 / 8 = 125 ms

1500 ms at batch 10 - 1500 / 10 = 150 ms

In this case running with batch 8 will be feeding the model more data in a given time than with batch 10. However the difference is small. If, say, we want to use batch 12 but we get an OOM, so we disable models_opt_on_gpu, it may now look like this:

2300 ms at batch 12 (Optimizer on CPU) - 2300 / 12 = about 192 ms, which is much longer than the 125 ms with batch 8 and iteration time of 1000 ms.
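
The same comparison can be done with a few lines of Python. This is just the arithmetic from the example, with the three iteration times and batch sizes taken from the text; the helper name ms_per_sample is made up for the sketch.

```python
def ms_per_sample(iteration_ms, batch_size):
    """Time spent per training sample in one iteration."""
    return iteration_ms / batch_size

# (iteration time in ms, batch size) pairs from the example.
options = {
    "batch 8": (1000, 8),
    "batch 10": (1500, 10),
    "batch 12, optimizer on CPU": (2300, 12),
}
costs = {name: ms_per_sample(ms, bs) for name, (ms, bs) in options.items()}
best = min(costs, key=costs.get)    # lowest time per sample wins

for name, cost in costs.items():
    print(f"{name}: {cost:.1f} ms per sample")
print("most efficient:", best)
```

Replace the numbers with the iteration times you measure on your own GPU for each batch size you can run.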

When starting model it's better to go with lower batch size - higher iteration time and then increase it once we disable random warp.

34. Q: How to use pretrained model?

A: Simply download it and put all the files directly into your model folder.

Start training, press any key within 2 seconds after selecting model for training (if you have more in the folder) and device to train with (GPU/CPU) to override model settings and make sure the pretrain option is disabled so that you start proper training, if you leave pretrain options enabled the model will carry on with pretraining. Note that the model will revert iteration count to 0, that's normal behavior for pretrained model, unlike a model that was just trained on random faces without the use of pretrain function.

35. Q: My GPU usage is very low/GPU isn't being used despite selecting GPU for training/merging.

A: It probably is being used but Windows doesn't report just CUDA usage (which is what you should be looking at) but total GPU usage which may be lower (around 5-10%).

To see true CUDA/GPU usage during training (in Windows 10), go into Task Manager -> Performance -> Select GPU -> Change one of the 4 smaller graphs to CUDA.

If you are using different version of Windows - download external monitoring software such as HWmonitor or GPU-Z or look at the VRAM usage which should be close to maximum during training.
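
On NVIDIA cards you can also read utilization and VRAM usage from the command line with nvidia-smi, which is installed with the driver. The sketch below wraps it in Python. The query options used are standard nvidia-smi ones, but the function names and the sample line are made up for the example, and the parser assumes every field is a plain number (some cards report [N/A] for certain fields, which this sketch does not handle).

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

def parse_gpu_line(line):
    """Parse one CSV line such as '97, 10240, 11264' into a dict."""
    util, used, total = (float(v) for v in line.split(","))
    return {
        "util_percent": util,
        "vram_used_mib": used,
        "vram_used_percent": round(100.0 * used / total, 1),
    }

def read_gpu_stats():
    """Run nvidia-smi and return one dict per GPU (NVIDIA cards only)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]

# Made-up sample line, only to show what the parser returns.
sample = parse_gpu_line("97, 10240, 11264")
print(sample)
```

Call read_gpu_stats() while training is running. VRAM usage close to the maximum is what you would expect to see during training.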

35. Q: Training freezes on RTX 3090, RTX cards training crashing, model not training with AdaBelief, model stops training after 12.16 16.12 update.

A: Make sure to enable Hardware-accelerated GPU scheduling in Windows 10.

36. Q: How to train with AdaBelief enabled?

A: Same as you would do without it enabled except without using learning rate dropout which is not needed with AB enabled.

Whether you're starting to pretrain a new model, starting training on a pretrained model or want to carry on training with old model but use AdaBelief to improve quality of your fakes enable it and remember to never turn it off once it's enabled.

Don't use LRD, this option should disable itself but just to be sure run your model once, disable LRD, save and then start it again, select Y to enable AdaBelief optimizer and pretrain or train as usual.

There is no need to enable LRD before GAN, once you've disabled RW and are ready to start GAN simply enable it, no need to run LRD and wait for the model to reach lower loss, AB will ensure the model trains more accurately and naturally reaches lower loss values.

If it's a pretrained model or one that was heavily trained before and you want to use it with AB make sure to enable it and RW, let the model relearn everything, if you instead just enable AB and carry on with RW disabled or some other options enabled it may not improve or it will get worse.

37. Q: When do I delete inter_ab and inter_b when using/reusing LIAE models?

A: If you are trying to create RTM models please refer to the RTM training step in the guide post above.

Delete inter_ab when reusing LIAE model to create a new RTM model (new SRC, random DST)
Delete inter_b when you want to use RTM LIAE model as regular model (same SRC, new DST of one person).
Do not delete either inter_ab or inter_b if you want to do extra training on target DST with RTM model, just replace DST and start training (iperov way: RW disabled, LRD enabled, UY enabled, after some time enable GAN, TMBDF way: RW disabled, LRD disabled, UY disabled, after some time run EMP for a bit, then disable EMP, run UY for a bit, then enable LRD, then GAN).

TIPS:

1. Generating preview images into the model folder like on Colab (tip from Lazierav)

You can generate preview images as a file in your model folder, as when training on Colab, by modifying a few lines of code. This is useful if you are training in virtual environments that run in the browser and do not allow new windows, so you can still check how faces are looking during training.

To do this you have to open the following file:

<DeepFaceLab folder>\_internal\DeepFaceLab\core\interact\interact.py

And in lines 21 and 22, where it says:

except:

is_colab = False

Change "False" to "True".

XSeg FAQ

1.Q: Can XSeg models be reused?

A: Yes.

2.Q: How to train a generic XSeg model.

A: Simply collect enough different, labeled faces that cover as many different angles, expressions and lighting conditions as possible and feature a wide range of differently shaped obstructions, put them inside your SRC and DST "aligned" folders, then simply start training your model.

You can start with someone else's XSeg model and add your own faces. You can mix your own faces with faces labeled by others as long as they are using the same face type or approach to creating those labels (commonly called masks, but masks are generated by XSeg for training and merging; the user creates labels that tell the model how to generate masks, and it learns to make them from the labels/polygons you draw).

You can also pretrain your XSeg model with the built-in pretrain option and then do regular training as explained above.

3.Q: Is there a limit to diversity and/or amount of marked/masked faces that we feed into XSeg training?

A: In theory no but in reality too many faces may create too much variation for model to sufficiently learn, you can get pretty good results with 1000 faces but if you want really good model that handles many obstructions you should train it with about 1500-3000 faces.

4.Q: What should I train first? SAEHD/AMP or XSeg model?

A: It depends what kind of model you are training.

For full face (FF) you don't need XSeg for training but you still need it for the merging process to exclude obstructions. Having XSeg even with FF improves the results, since the model has a more precise definition of what is face and what is background. XSeg is required for WF and HEAD face types, and if you want to use face and background style power or blur out mask you must apply masks to your datasets before training your face swapping model, so in that case you do need to train an XSeg model. If you are using WF face type and just starting out I recommend using the generic XSeg model DFL comes with, otherwise you can use user shared models or prepare your own.

5.Q: Is XSeg available for Quick96 model too?

A: No, it's only for SAEHD model.

6.Q: XSeg doesn't let me change face type on startup.

A: Just like with SAEHD and AMP models there are some options you can only select when training a model for the first time, face type can't be changed, only batch size.

7.Q: Does XSeg model/model files have to be in the same folder as regular model/model files?

A: Yes, XSeg model files are created in the same model folder as regular model files, they have XSeg in the name so they are easy to distinguish from regular model files.

8.Q: After running XSeg labeling tool I see UI but no frames.

A: Make sure your faceset isn't packed into .pak file, if it is unpack it using 4.2) data_src(dst) util faceset unpack.

9.Q: Is XSeg just for whole face or also for head face type?

A: XSeg works with all face types, including full face, whole face and head face_types.

10.Q: What's the difference between 3 polygon color schemes in the XSeg labeling/marking/masking tool?

A: There is no difference between them, there are 3 options so that you can use one that suits you the most by being the most visible on the face you're labeling.

11.Q: I'm getting errors, XSeg won't start training, I'm getting an OOM error when training XSeg.

A: Make sure you are running the correct version of DFL for your GPU and that it's up-to-date and has no known issues with XSeg training (check user reports on the forum and the DFL github page). Make sure your GPU is compatible, that you have the newest drivers for your GPU, that Windows is up to date, that you have no other issues with your PC and that your datasets are valid.

OOM error - means batch size is too high so reduce it.

If the model stopped training after DFL update - report bug on github, if after driver update - revert to previous driver (same with Windows updates)

If you get non DFL image error - your dataset is bad, probably missing landmarks (metadata) after upscaling faces, editing them with image editing software (read about metadata, save and restore).

Other errors - google them, search for them on the forum or report on github (but only after you check all other solutions).

12.Q: How to make some part of the face remain unchanged, how to keep parts of the DST face visible, how to not train certain parts of the face.

A: Treat those parts as obstructions and simply do not include them in the main mask or exclude them.

13.Q: Can you use FF XSeg model on WF dataset or WF XSeg model on FF dataset? Can you use FF XSeg for WF SAEHD/AMP model or WF XSeg for FF SAEHD/AMP model?

A: Yes you can, however remember that a WF XSeg model on an FF dataset (or a WF dataset fed into an FF SAEHD/AMP model) will create straight lines that will be harder to make invisible during merging, because the FF dataset or model has smaller coverage and thus can't generate a face bigger than the coverage of the lower face type. On the other hand, a lower face type XSeg on a higher face type dataset or SAEHD/AMP model is completely fine, however the masked area will only be as big as the XSeg allows.

14.Q: I have holes in my masks.

A: Locate faces with the holes and label them. If they are already correctly labeled and the holes don't disappear despite additional training, check other labels, see if there aren't any unwanted or imprecisely made exclusions and either remove them or fix them. If the issue continues to occur on some faces, train the model from scratch and consider pretraining it first, or use an existing, trained model that was trained on similarly labeled faces.

iperov · Apr 13, 2020

My advice, translated using deepl.com

SAEHD model options.

Random_flip

Flips the image horizontally at random. Allows for better generalization of faces. Slows down training slightly until a clear face is achieved. If both src and dst face sets are quite diverse, this option is not useful. You can turn it off after some training.

Batch_size

Improves facial generalization, especially useful at an early stage. But it increases the time until a clear face is achieved. Increases memory usage. In terms of quality of the final fake, the higher the value, the better. It's not worth putting it below 4.

Resolution.

At first glance, the more the better. However, if the face in the frame is small, there is no point in choosing a large resolution. By increasing the resolution, the training time increases. For face_type=wf, more resolution is required, because the coverage of the face is larger, thus the details of the face are reduced. For wf it makes no sense to choose less than 224.

Face_type.

Face coverage in training. The more facial area is covered, the more plausible the result will be.

The whole_face allows covering the area below the chin and the forehead. However, there is no automatic removal of the mask with the forehead, so the merge requires XSeg or manual masking in Davinci Resolve or Adobe After Effects.

Archi.

Liae morphs the face more towards dst, but the src face in it will still be recognizable.

Df allows you to make the most believable face, but requires more manual work to collect a good variety of src faces and a final color matching.

The effectiveness of hd architectures has not been proven at this time. The Hd architectures were designed to better smooth the subpixel transition of the face at micro displacements, but the micro shake is also eliminated at df, see below.

Ae_dims.

Dimensions of the main brain of the network, which is responsible for generating facial expressions created in the encoder and for supplying a variety of code to the decoder.

E_dims.

The dimensions of the encoder network that are responsible for face detection and further recognition. When these dimensions are not enough, and the facial features are too diverse, then we have to sacrifice non-standard cases, those that differ the most from the general cases, thus reducing their quality.

D_dims.

The network dimensions of the decoder, which are responsible for generating the image from the code obtained from the brain of the network. When these dimensions are not enough, and the output faces are too different in color, lighting, etc., you have to sacrifice the maximum allowed sharpness.

D_mask_dims.

Dimensions of the mask decoder network, which are responsible for forming the mask image.

16-22 is the normal value for a fake without an edited mask in XSeg editor.

At the moment there is no experimentally proven data that would indicate which values are better. All we know is that if you put really low values, the error curve will reach the plateau quickly enough and the face will not reach clarity.

Masked_training. (only for whole_face).

Enabled (default) - trains only the area inside the face mask; anything outside that area is ignored. This lets the network focus on the face only, which speeds up training of the face and its expressions.

Once the face is sufficiently trained, you can disable this option. Everything outside the face (the forehead, part of the hair, the background) will then be trained as well.
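To illustrate what "trains only the area inside the face mask" means, here is a minimal sketch of a masked reconstruction error. It is not DeepFaceLab's actual loss; the function name `masked_mean_abs_error` and the tiny 2x2 "images" are hypothetical and exist only for illustration.

```python
def masked_mean_abs_error(pred, target, mask):
    """Mean absolute error counted only over pixels where mask is 1."""
    total = 0.0
    count = 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:  # pixels outside the mask contribute nothing
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


# Hypothetical 2x2 grayscale "images": left column is face, right is background.
pred = [[0.2, 0.9], [0.5, 0.1]]
target = [[0.0, 0.4], [0.5, 0.8]]
face_mask = [[1, 0], [1, 0]]   # masked training: background ignored
full_mask = [[1, 1], [1, 1]]   # masked training disabled: everything counts

masked_error = masked_mean_abs_error(pred, target, face_mask)
full_error = masked_mean_abs_error(pred, target, full_mask)
```

With the face mask, the large background errors in the right column do not influence the result, which is why the network can concentrate on the face first.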

Eyes_prio.

Sets a higher priority for image reconstruction in the eye area, which improves the generalization and matching of the eyes of the two faces. Increases iteration time.

Lr_dropout.

Enable only when the face is already sufficiently trained. Enhances facial detail and improves subpixel facial transitions to reduce shake.

Uses more video memory, so account for this option when choosing a network configuration for your graphics card.

Random_warp.

Turn it off only when the face is already sufficiently trained. Disabling it improves facial detail and the subpixel transitions of facial features, reducing shake.

GAN_power.

Allows for improved facial detail. Enable only when the face is already sufficiently trained. Requires more memory and greatly increases iteration time.

It works on the generative adversarial principle. At first, you will see artifacts in areas that do not match the clarity of the target image, such as teeth, eye edges, etc. So train long enough.

True_face_power.

Experimental option. You don't have to turn it on. Pushes the predicted face toward src in the "hardest" way. Artifacts and incorrect light transfer from dst may appear.

Face_style_power.

Adjusts the color distribution of the predicted face in the area inside the mask to dst. Artifacts may appear. The face may become more like dst. The model may collapse.

Start at 0.0001, watch the changes in preview_history, and turn on hourly backups.

Bg_style_power.

Trains the area of the predicted face outside the face mask to match the same area in the dst face. As a result, the predicted face resembles a morph into the dst face, with less recognizable src features.

Face_style_power and Bg_style_power must be used together so that the complexion is fitted to dst and the background is taken from dst. The morph gets rid of many color and face-matching problems, but at the expense of how recognizable the src face is.

ct_mode.

Fits the average color distribution of the src faceset to dst. Unlike Face_style_power, it is a safer method, but there is no guarantee that you will get an identical color transfer. Try each mode, check in the preview history which one comes closest to dst, and train with it.

Clipgrad.

Reduces the chance of model collapse to almost zero. Model collapse shows up as artifacts, or as the preview windows of the predicted faces being filled with a single color. It can occur when using certain options or when the dst faceset lacks variety.

Therefore, it is best to use autobackup every 2-4 hours and, if collapse occurs, roll back and turn on clipgrad.
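For readers unfamiliar with gradient clipping, the following is a generic sketch of clipping by global norm, the usual technique behind options of this kind. It is an assumption that clipgrad behaves along these lines; the function `clip_by_global_norm` and the threshold are hypothetical and not taken from DeepFaceLab's code.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0 or norm <= max_norm:
        return list(grads)          # small gradients pass through unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]


# A gradient "spike" of norm 5 is scaled down to norm 1, keeping its direction.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
# A normal-sized gradient is left alone.
unchanged = clip_by_global_norm([0.3, 0.4], max_norm=1.0)
```

Limiting the size of any single update is what keeps one bad batch from throwing the weights into a collapsed state.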

Pretrain.

Enables model pre-training, which is performed on a set of faces of 24 thousand people prepared in advance. Using a pre-trained model accelerates the training of any fake.

It is recommended to train as long as possible. 1-2 days is good. 2 weeks is perfect. At the end of the pre-training, save the model files for later use. Switch off the option and train as usual.

You can and should share your pre-trained model in the community.

Size of src and dst face set.

The problem with a very large src faceset is that it contains many repetitive faces, which contribute little. As a result, faces at rare angles are trained less often, which hurts quality. For that reason, 3000-4000 faces is optimal for the src faceset. If you have more than 5000 faces, use sort by best to reduce them to a smaller number. The sort selects an optimal mix of angles and color variety.

The same logic applies to dst. However, dst consists of frames from the video, and every one of them must be trained well enough for the neural network to identify it when merging. So if dst has too many faces, 3000 or more, the best approach is to back them up, use sort by best to reduce them to 3000, train the network for, say, 100,000 iterations, then restore the full set of dst faces and continue training until you reach the optimal result.

How to get lighting similar to dst face?

This is about lighting, not color matching. It simply comes down to collecting a more diverse src faceset.

How to suppress color flickering in DF model?

If the src faceset contains a variety of make-up, it can lead to color flickering in a DF model. One option: at the end of your training, leave at least 150 faces with the same make-up and train for several more hours.

How else can you adjust the color of the predicted face to dst?

If none of the automatic methods fit, composite the faces in a video editor. A video editor gives you much more freedom to adjust colors.

How to make a face look more like src?

1. Use DF architecture.

2. Use a similar face shape in dst.

[align=left]3. It is known that a large color variety in the src faceset decreases facial resemblance, because a neural network essentially interpolates the face from what it has seen.

For example, suppose your src faceset comes from 7 scenes with different colors and contains only 1500 faces in total. Each dst scene will then draw on only about 1500 / 7 faces, which is 7 times poorer than using 1500 faces from a single scene. As a result, the predicted face will be very different from src.
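The arithmetic in this example can be checked with a short sketch. It assumes the faces are spread evenly across scenes, which the text implies but real facesets rarely satisfy; the helper `faces_per_scene` is hypothetical.

```python
def faces_per_scene(total_faces, scene_count):
    """Approximate number of src faces available per color scene,
    assuming faces are spread evenly across scenes."""
    return total_faces // scene_count


mixed = faces_per_scene(1500, 7)    # seven different color scenes
single = faces_per_scene(1500, 1)   # one consistent scene
print(mixed, single)                # 214 1500
```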

Micro-shake of the predicted face in the final video.

The higher the resolution of the model, the longer it needs to be trained to suppress the micro-shake.

You should also enable lr_dropout and disable random_warp after 200-300k iterations at batch_size 8.

Micro-shake often appears when the dst video is too sharp. It is difficult for a neural network to extract unambiguous information about a face when it is flooded with pixel-level noise. Therefore, after extracting frames from the dst video and before extracting faces, you can run the frames through the noise filter denoise data_dst images.bat. This filter removes temporal noise.

Increasing ae_dims may also suppress the micro-shake.

Use a quick model to check the generalization of facial features.

If you're thinking of a higher resolution fake, start by running at least a few hours at resolution 96. This will help identify facial generalization problems and correct facial sets.

Examples of such problems:

1. Eyes/mouth that do not close - no closed eyes/mouth in src.

2. Wrong face rotation - not enough faces at different angles in both the src and dst facesets.

[/align]

Training algorithm for achieving high definition.

1. Use a -ud model.

2. Train to, say, 300k iterations.

3. Enable learning rate dropout for 100k.

4. Disable random warp for 50k.

5. Enable GAN.
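The steps above can be written down as a simple schedule. This is only a sketch: DeepFaceLab asks for these options interactively when training starts, so the `STAGES` table, the dictionary keys, and `options_at` are hypothetical. It also assumes each step keeps the previous settings, which the list does not state explicitly.

```python
# (start_iteration, options) pairs; each stage keeps the earlier changes (assumption).
STAGES = [
    (0,       {"lr_dropout": False, "random_warp": True,  "gan": False}),
    (300_000, {"lr_dropout": True,  "random_warp": True,  "gan": False}),
    (400_000, {"lr_dropout": True,  "random_warp": False, "gan": False}),
    (450_000, {"lr_dropout": True,  "random_warp": False, "gan": True}),
]


def options_at(iteration):
    """Return the option set that applies at a given iteration count."""
    current = STAGES[0][1]
    for start, opts in STAGES:
        if iteration >= start:
            current = opts
    return dict(current)


print(options_at(350_000))  # lr_dropout on, random_warp still on, gan off
```

The iteration counts are the guide's rough figures ("say, up to 300k"), not fixed thresholds; adjust them to what the preview shows.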

Do not use training GPU for video output.

This can reduce performance, reduce the amount of free GPU video memory, and in some cases lead to OOM errors.

Buy a cheap second video card, such as a GT 730 or similar, and use it for video output.

There is also an option to use the built-in GPU in Intel processors. To do this, activate it in BIOS, install drivers, connect the monitor to the motherboard.

Using Multi-GPU.

Multi-GPU can improve the quality of the fake. In some cases, it can also accelerate training.

Choose identical GPU models; otherwise the faster GPU will wait for the slower one and you will not get any acceleration.

Working principle: batch_size is divided among the GPUs. Accordingly, you either get acceleration because less work is allocated to each GPU, or you multiply batch_size by the number of GPUs, which increases the quality of the fake.
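The two ways of using several GPUs can be sketched as follows. The helper names are hypothetical, and the sketch assumes the batch divides evenly across the GPUs; how DeepFaceLab handles an uneven split is not covered here.

```python
def per_gpu_batch(batch_size, gpu_count):
    """Option 1: keep batch_size fixed, so each GPU processes a smaller share."""
    if batch_size % gpu_count != 0:
        raise ValueError("batch_size should divide evenly across GPUs")
    return batch_size // gpu_count


def scaled_batch(batch_per_gpu, gpu_count):
    """Option 2: keep the per-GPU load fixed and grow the total batch."""
    return batch_per_gpu * gpu_count


faster = per_gpu_batch(8, 2)   # each GPU handles 4 samples per iteration
bigger = scaled_batch(8, 2)    # total batch_size becomes 16
print(faster, bigger)          # 4 16
```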

In some cases, disabling models_opt_on_gpu can speed up training when using 4 or more GPUs.

As the number of samples increases, the load on the CPU to generate samples increases. Therefore it is recommended to use the latest generation CPU and memory.

NVLink and SLI do not work and are not used. Moreover, having SLI enabled may cause errors.

Factors that reduce the success of a fake.

1. Big face in the frame.

2. Side lighting. Lighting transitions. Colored lighting.

3. A dst faceset that lacks variety.

For example, you train a fake in which the whole dst faceset is a head turned in one direction. Face generation in this case can be bad. The solution: extract additional faces of the same actor, train on them well enough, then leave only the target faces in dst.

Factors that increase the success of a fake.

1. Variety of src faces: different angles including side faces. Variety of lighting.

Other.

In 2018, when fakes first appeared, people liked fakes of any quality, however lousy, where the face flickered and barely resembled the target celebrity. Now, even a technically perfect replacement using an impersonator who resembles the target celebrity may have no viral effect at all. Popular YouTube channels that specialize in deepfakes are constantly inventing something new to keep their audience. If you have watched a lot of movies and know all the meme videos, you can probably come up with great ideas for a deepfake. A good idea is 50% of the success. The technical quality can be improved through practice.

Not every pair of celebrities works well for a deepfake. If the skull sizes differ significantly, the similarity of the result will be extremely low. With experience, a deepfaker learns which fakes will turn out well and which will not.

Deepfake tutorial XSeg + Whole Face:

[align=left][video=youtube]

TMBDF · Apr 14, 2020

- reserved for future use -

Jessica2020 · Apr 18, 2020

Thank you so very much for the updated guide and tutorial - now i understand a lot more and have been able to alter my training accordingly.

androsk · Apr 19, 2020

is there any other method for download beside MEGA?

TMBDF · Apr 20, 2020

androsk said:
is there any other method for download beside MEGA?

No, mega is the only way right now.

Putin_v · Apr 23, 2020

I am having trouble with pretraining on DFL 2.0. I enable pretraining in SAE and let it run. I noticed before that it would stop after a certain amount of iterations, but now it keeps going way past 100,000. I have reinstalled the program multiple times but nothing seems to help. Am I missing something?

Groggy4 · Apr 23, 2020

Putin_v said:
I am having trouble with pretraining on DFL 2.0. I enable pretraining in SAE and let it run. I noticed before that it would stop after a certain amount of iterations, but now it keeps going way past 100,000. I have reinstalled the program multiple times but nothing seems to help. Am I missing something?

It won't stop unless you set a targeted iteration limit.

Putin_v · Apr 24, 2020

Groggy4 said:
It won't stop unless you set a targeted iteration limit.

I tried that. I set a limit, and when the limit was reached I turned pretraining off. All of the iterations were gone. It started back from 0.

Groggy4 · Apr 25, 2020

Putin_v said:
I tried that. I set a limit, and when the limit was reached I turned pretraining off. All of the iterations were gone. It started back from 0.

It's supposed to work like that. The training data is still there, but to avoid a morphing effect from the previous faces, it resets some aspects.

[deleted] · Apr 27, 2020

.

TMBDF · Apr 27, 2020

You don't download the DFL base .bat files from GitHub; that is for getting updated files. To get the actual DFL with all the .bat files, download it from the mega.nz link.

positiveraisin2 · Apr 28, 2020

very helpful

Weapon2057 · Apr 28, 2020

For the DST, do the aligned and aligned_debug folders work together? Can I delete photos of the DST face that I don't like (blurry, cut off, etc.), or will that affect aligned_debug and leave me missing a face for every frame that blurry face was associated with?

TMBDF · Apr 29, 2020

Weapon2057 said:
For the DST, do the aligned and aligned_debug folders work together? Can I delete photos of the DST face that I don't like (blurry, cut off, etc.), or will that affect aligned_debug and leave me missing a face for every frame that blurry face was associated with?

Read the guide again and then do this one: //deep.whitecatchel.ru/literotica/forums/thre...set-creation-how-to-create-celebrity-facesets
aligned_debug is for checking landmarks only; it isn't used in training. Keep the blurry ones (in aligned) or else those frames won't be swapped; the same goes for cut-off faces. Use aligned_debug to see whether they are correctly aligned. If not, delete that frame from aligned_debug and run 5) data_dst faceset MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG to re-extract it manually.

JohnNotStamos · Apr 30, 2020

For merging, since there are two options, interactive and non-interactive, what settings are best for non-interactive? I'm asking because I spent 10 hours across 2 days using interactive, so I'd prefer an approach that doesn't involve doing it manually for hours.
