DeepMind, Google’s artificial intelligence (AI) division, is busy developing the technology. But after years of research into how AI can be made more robust against attacks and less prone to failure, it is still not clear what ‘failure’ actually means for AI. DeepMind is now asking that question.

The problem is that there is too little human input into the boundary conditions of how neural networks should function, according to ZDNet. In other words, it is not clear what an AI has to do in order to work properly, and what the conditions for failure are. In other areas of engineering this is much clearer. From a programmer’s perspective, for example, a bug is any behaviour that does not correspond to the intended functionality (the specification) of a system.


But what does it actually mean for a neural network to follow the “specification” developed for it? This specification is not always clear, according to the researchers at DeepMind: “Specifications that include the ‘correct’ behaviour of AI systems are often difficult to define specifically.”

It may also be that there is not one specification, but three. First there is the “ideal” specification, which is what the creators of the system imagine the AI should do. The “design” specification includes the “objective function”, which is what the neural network is explicitly optimized for. And the “revealed” specification is the way the system actually behaves. These three specifications can differ considerably from one another.

On this view, an AI fails when there is a mismatch between the “ideal” specification and the “revealed” specification: the AI system does not do what the user wants. Designing neural networks can then be seen as a way of bridging the gap between wish, design and actual behaviour.
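The three specifications can be made concrete with a small toy example. The sketch below is purely illustrative and does not come from DeepMind: the cleaning-robot scenario, the numbers and every name in it (`ideal_ok`, `naive_objective`, `revised_objective`, `revealed_behaviour`) are assumptions chosen for the illustration. The ideal specification is a check on what the designer really wants, the design specification is the objective function being maximised, and the revealed specification is whatever the system ends up doing.

```python
# Toy illustration of ideal, design and revealed specifications.
# All names and numbers are hypothetical.

ACTIONS = ["clean_room", "hide_mess", "do_nothing"]

# Mess a camera can see, mess really left behind, and effort spent per action.
VISIBLE_MESS = {"clean_room": 0, "hide_mess": 0, "do_nothing": 5}
ACTUAL_MESS = {"clean_room": 0, "hide_mess": 5, "do_nothing": 5}
EFFORT = {"clean_room": 3, "hide_mess": 1, "do_nothing": 0}


def ideal_ok(action):
    """Ideal specification: the designer wants no mess left at all."""
    return ACTUAL_MESS[action] == 0


def naive_objective(action):
    """Design specification (first attempt): penalise only the mess that is visible."""
    return -VISIBLE_MESS[action] - 0.1 * EFFORT[action]


def revised_objective(action):
    """Design specification (second attempt): penalise the mess that really remains."""
    return -ACTUAL_MESS[action] - 0.1 * EFFORT[action]


def revealed_behaviour(objective):
    """Revealed specification: what the system does when it maximises its objective."""
    return max(ACTIONS, key=objective)


def is_failure(objective):
    """Failure in the article's sense: revealed behaviour violates the ideal specification."""
    return not ideal_ok(revealed_behaviour(objective))


for name, objective in [("naive", naive_objective), ("revised", revised_objective)]:
    print(name, revealed_behaviour(objective), "failure:", is_failure(objective))
```

With the first objective the system does exactly what it was optimized for and still fails, because hiding the mess scores better than cleaning it up. The gap lies between the ideal and the design specification, not in the optimization itself.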

Reinforcement learning systems

The researchers at DeepMind therefore propose various routes for testing and training neural networks so that they are more robust against errors and meet their specifications better. One way is to use AI itself to find out what an AI struggles with. This amounts to using a reinforcement learning system, the kind of technology used in Google’s AlphaGo, to find the worst ways in which another reinforcement learning system can fail.

DeepMind did just that in a paper from December last year. “We learn an adversarial value function that predicts from experience which situations are most likely to cause failures for the agent.” The agent here is a reinforcement learning agent. “We then use this learned function for optimization to focus the evaluation on the most problematic inputs.”

According to the researchers, the method leads to “major improvements over random testing” of reinforcement learning systems.
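The idea of learning from experience where failures are likely, and then concentrating the evaluation there, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the method from the DeepMind paper: the one-dimensional input, the two hypothetical agents (`agent_fails`, `weak_agent_fails`) and the simple per-bin frequency estimate that stands in for a learned adversarial value function are all invented for the illustration.

```python
import random


def agent_fails(x):
    """Hypothetical agent under test: fails only in a narrow band of starting conditions."""
    return 0.90 <= x <= 0.92


def weak_agent_fails(x):
    """Hypothetical earlier, less robust agent: fails in a much wider band."""
    return 0.80 <= x <= 1.0


def fit_failure_predictor(xs, failed, bins=20):
    """Estimate P(failure | x) per bin from logged experience.

    A stand-in for a learned failure predictor; inputs are assumed to lie in [0, 1).
    """
    counts = [0] * bins
    fails = [0] * bins
    for x, f in zip(xs, failed):
        b = min(int(x * bins), bins - 1)
        counts[b] += 1
        fails[b] += int(f)

    def predict(x):
        b = min(int(x * bins), bins - 1)
        return fails[b] / counts[b] if counts[b] else 0.0

    return predict


def prioritise(candidates, predict, budget):
    """Keep the `budget` candidates that the predictor rates as most failure-prone."""
    return sorted(candidates, key=predict, reverse=True)[:budget]


def count_failures(inputs):
    """Run the agent under test on each input and count the failures found."""
    return sum(agent_fails(x) for x in inputs)


rng = random.Random(0)

# Experience: failures logged from the weaker agent on random inputs.
experience = [rng.random() for _ in range(2000)]
predict = fit_failure_predictor(experience, [weak_agent_fails(x) for x in experience])

# Same evaluation budget for both strategies.
pool = [rng.random() for _ in range(1000)]
budget = 50
print("random testing found:     ", count_failures(pool[:budget]))
print("prioritised testing found:", count_failures(prioritise(pool, predict, budget)))
```

In this toy setting the predictor is only a rough guide, since it was fitted on a different agent, but it still steers the limited evaluation budget towards the region where failures are concentrated instead of spreading it evenly over all inputs.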
