Robots still struggle with objects that are not lying where they usually are. A team of researchers at Nvidia has now found a way to make it easier to train robots to find such objects, ZDNet reports.

To find and pick up an object in the real world, even when it is not in its usual place, a robot needs a computer vision algorithm that can identify the object's 3D position and orientation. Together these are called the 6-DoF (six degrees of freedom) pose.
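To illustrate what a 6-DoF pose contains, here is a minimal Python sketch. The `Pose6DoF` class, its field names and the roll/pitch/yaw convention are assumptions made for illustration only; the article does not say how Nvidia's network represents poses.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6DoF:
    """Hypothetical container: three translation and three rotation values."""
    x: float      # position along the camera x axis, metres
    y: float      # position along the camera y axis, metres
    z: float      # position along the camera z axis, metres
    roll: float   # rotation about the x axis, radians
    pitch: float  # rotation about the y axis, radians
    yaw: float    # rotation about the z axis, radians

    def rotation_matrix(self):
        """Return the 3x3 matrix R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr],
        ]

    def transform(self, point):
        """Map a point from the object's own frame into the camera frame."""
        r = self.rotation_matrix()
        t = (self.x, self.y, self.z)
        return tuple(
            sum(r[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
        )
```

With an estimated pose in hand, a robot could, for example, call `transform` on a grasp point defined on the object model to learn where that point sits relative to the camera.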


Nvidia’s new algorithm should make this easier. It was trained on synthetic images, bypassing the complex and labour-intensive process of preparing photos for training. By using a unique combination of synthetic images, the team was able to train an algorithm that performs better than a network trained on real images. This is the first time that has been achieved.

“With synthetic data, we can generate an almost infinite number of images with labels that are basically free,” says Stan Birchfield, a robotics researcher at Nvidia. “What we are ultimately trying to do is to make it possible for a robot to learn a new task in a short space of time.” The goal is for robots to be able to help people in a variety of places, for example in factories, hospitals or at home.

The team can now mount a standard RGB camera on a robot and use the algorithm to enable it to see, pick up and move objects. The network was trained using Nvidia Tesla V100 GPUs on a DGX Station with cuDNN-accelerated PyTorch. The team also used a plug-in for the Unreal Engine, developed by Nvidia, to generate the synthetic data.

Synthetic data

In the past, synthetic data wasn’t good enough to train a computer vision algorithm, because the generated images didn’t look real enough. “Until recently, the trend was to try to produce images that looked more and more realistic,” says Birchfield.

“The problem that researchers discovered was that if they wanted to make images more realistic, they would have to hire artists and spend a lot of time making scenes that looked like the real world. As a result, there was less variety. The more variety there is, the better the algorithm is trained.”
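The idea of getting variety and "basically free" labels from a generator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, scene parameters and value ranges below are hypothetical, and the sketch does not reflect Nvidia's Unreal Engine plug-in or its actual settings.

```python
import random


def sample_scene(rng):
    """Draw one hypothetical scene description with randomized parameters."""
    pose = {
        # Object position relative to the camera, in metres (assumed ranges).
        "position_m": [
            rng.uniform(-0.5, 0.5),
            rng.uniform(-0.5, 0.5),
            rng.uniform(0.3, 1.5),
        ],
        # Object orientation as three angles in degrees (assumed ranges).
        "rotation_deg": [rng.uniform(-180, 180) for _ in range(3)],
    }
    scene = {
        "object_pose": pose,
        "light_intensity": rng.uniform(0.2, 2.0),
        "background_id": rng.randrange(1000),
        "num_distractors": rng.randint(0, 10),
    }
    # The label is simply the pose that was sampled, so nobody has to
    # annotate the rendered image by hand.
    return scene, pose


def generate_dataset(n, seed=0):
    """Return n (scene, label) pairs; a renderer would turn scenes into images."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]
```

Because every parameter is drawn at random, each call produces a different scene, and the ground-truth pose comes along at no extra cost. A real pipeline would pass each scene description to a renderer to produce the training image.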

This news article was automatically translated from Dutch to give Techzine.eu a head start. All news articles after September 1, 2019 are written in native English and NOT translated. All our background stories are written in native English as well. For more information read our launch article.