Microsoft demonstrates facial analysis in the wild using just synthetic data
The paper’s abstract argues that the domain gap between real and synthetic data has remained a problem, particularly for human faces.
To circumvent this issue, researchers have traditionally employed a combination of data mixing, domain adaptation, and domain-adversarial training.
According to Microsoft’s new research, it is possible to synthesize data with minimal domain gap so that facial analysis models trained on synthetic data alone can actually be deployed in the wild.
The new process combines a procedurally-generated parametric 3D face model with a comprehensive library of hand-crafted assets, designed to render training images with high realism and diversity.
“With synthetic data, you can guarantee perfect labels without annotation noise, generate rich labels that are otherwise impossible to label by hand, and have full control over variation and diversity in a data set,” the researchers explained in the paper.
The procedurally constructed synthetic faces are realistic and expressive. Each one starts from an initial face template, which is then varied with random expressions and textures.
“Attach random hair and clothing and render the face in a random environment,” the paper reads.
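The randomization steps described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Microsoft's pipeline: the asset lists, the `FaceSpec` record, and the `sample_face_spec` and `sample_dataset` helpers are all hypothetical names, and the sketch stops at choosing parameters rather than rendering anything.

```python
import random
from dataclasses import dataclass

# Hypothetical asset libraries. These names are placeholders for illustration,
# not the hand-crafted assets used in the paper.
EXPRESSIONS = ["neutral", "smile", "frown", "surprise"]
TEXTURES = ["texture_a", "texture_b", "texture_c"]
HAIR = ["short", "long", "curly", "none"]
CLOTHING = ["t_shirt", "hoodie", "jacket"]
ENVIRONMENTS = ["office", "street", "forest"]


@dataclass(frozen=True)
class FaceSpec:
    """Parameters a renderer would need for one synthetic face (assumed layout)."""
    expression: str
    texture: str
    hair: str
    clothing: str
    environment: str


def sample_face_spec(rng: random.Random) -> FaceSpec:
    """Start from the template and pick each attribute at random."""
    return FaceSpec(
        expression=rng.choice(EXPRESSIONS),
        texture=rng.choice(TEXTURES),
        hair=rng.choice(HAIR),
        clothing=rng.choice(CLOTHING),
        environment=rng.choice(ENVIRONMENTS),
    )


def sample_dataset(n: int, seed: int = 0) -> list:
    """Draw n face specifications reproducibly from a seeded generator."""
    rng = random.Random(seed)
    return [sample_face_spec(rng) for _ in range(n)]


if __name__ == "__main__":
    for spec in sample_dataset(3):
        print(spec)
```

Because every attribute is chosen by the generator, the label for each image is known exactly at creation time, which is the property the researchers highlight when they describe perfect labels without annotation noise.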
The researchers rendered a training dataset of 100,000 synthetic face images, then evaluated the synthetic data on face analysis tasks, including face parsing and landmark localization.
“The networks we train never saw a single real image,” they explained. “We use[d] label adaptation to minimize human-annotated labels.”
According to the Microsoft team, the main difficulty in the process was to convert the models’ 3D projected jawline into a 2D face outline.
Possible applications for the technique include training models for non-biometric areas of facial analysis.
“Eye tracking can be a key feature for virtual or augmented reality, but training data is difficult to acquire,” the researchers explained.
The synthetic faces, however, look sufficiently realistic close-up to make it relatively easy for the team to set up a synthetic eye-tracking camera and render training images, the researchers say.
Microsoft confirmed the novel dataset will soon be released, complete with 2D landmark and per-pixel segmentation labels, for non-commercial research purposes.