With the prevalence of Advanced Driver Assistance Systems (ADAS) and the surge of interest in autonomous vehicles, it has become important that the computer vision modules that make up these systems understand their natural surroundings and react appropriately to changes. A key aspect of understanding such natural scenes is identifying the locations and bounds of the objects present in the scene. Semantic segmentation is one way to approach this problem. With the rise of deep learning techniques, there has been tremendous progress in semantic segmentation, with great improvements in quality and performance. However, one downside of most deep learning methods is their requirement for a large set of annotated data. This becomes very cumbersome for segmentation problems, since they require pixel-level annotations. Another issue is the domain gap introduced when a model is deployed on data that differs from the data it was trained on. In this thesis we tackle the first issue by leveraging a practically unlimited source of annotated data in the form of game engines and virtual environments. We then transform the data thus derived to have a more photo-realistic look matching its real-world counterparts, thereby aiming to solve the second issue. We describe the process we have employed to transform the synthetic-looking images to look as close to the real-world images as possible, and we show that there are significant gains to be had by adopting such a method.