Nicolas Cruz (Universidad de Chile)*; Javier Ruiz-del-Solar (Universidad de Chile)
2020-11-16, 11:10 - 11:40 PST | PheedLoop Session
This paper presents FeatureGan, a methodology to train image translators (generators) using an unpaired image training set. FeatureGan is based on the use of Generative Adversarial Networks (GAN) and has three main novel components: (i) the use of a feature loss to ensure alignment between the input and the generated image, (ii) the use of a feature pyramid discriminator, which uses a tensor composed of features at different levels of abstraction generated by a pre-trained network, and (iii) the introduction of a per class loss to improve the results in the simulation-to-reality task. The main advantage of the proposed methodology when compared to classical approaches is a more stable training process, which includes a higher resilience to common GAN problems such as mode collapse, as well as better and more consistent results. FeatureGan is also fast to train, easy to replicate, and especially suited to be used in simulation-to-reality applications where the generated realistic images allow to close the visual simulation-to-reality gap. As a proof of concept, we show the application of the proposed methodology in soccer robotics, where realistic images are generated in a soccer robotics simulator, and robot and ball detectors are trained using these images and then tested in reality. The same methodology is used to generate realistic images from images rendered in a video game. The realistic images are then used to train a semantic segmentation network.