A group of artificial intelligence (AI) research organizations has launched a benchmarking platform that measures natural language processing (NLP) capabilities. The group consists of Facebook AI, New York University, DeepMind and the University of Washington.

The platform – SuperGLUE – builds on an older benchmark called GLUE. According to Facebook AI, the new platform is intended to be a more comprehensive benchmark with human baselines. According to ZDNet, it is designed to measure how well AI systems can understand and interpret language.

SuperGLUE was developed because AI systems had reached a “ceiling” on existing benchmarks: they needed harder challenges to drive further improvement in their NLP capabilities.

SuperGLUE

The SuperGLUE benchmark introduces new tests for a range of difficult NLP tasks. These tasks focus on innovations in several key areas of machine learning, including sample-efficient, transfer, multitask and self-supervised learning.

SuperGLUE uses Google’s BERT as a performance baseline. The benchmark itself consists of eight tasks, including the Choice of Plausible Alternatives (COPA) test, which measures causal reasoning. In this test, the system is given a premise sentence and must determine its cause or effect from two possible choices.
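To make the task format concrete, here is a minimal sketch in Python of what a COPA-style item and its accuracy scoring might look like. The example items and the trivial "always pick the first choice" baseline are hypothetical illustrations, not the actual SuperGLUE data or evaluation code; real systems such as BERT score the two candidate sentences with a trained model.

```python
from dataclasses import dataclass

@dataclass
class CopaItem:
    """A hypothetical COPA-style instance: premise, two candidate
    sentences, whether we ask for the cause or the effect, and the
    index (0 or 1) of the correct choice."""
    premise: str
    choice1: str
    choice2: str
    question: str  # "cause" or "effect"
    label: int     # 0 -> choice1 is correct, 1 -> choice2 is correct

# Two invented toy items for illustration.
items = [
    CopaItem("The man lost his balance on the ladder.",
             "He fell off the ladder.",
             "He climbed up the ladder.",
             "effect", 0),
    CopaItem("The woman was in a bad mood.",
             "She engaged in small talk with her friend.",
             "She told her friend to leave her alone.",
             "effect", 1),
]

def accuracy(predictions, items):
    """Fraction of items where the predicted choice matches the label."""
    correct = sum(p == it.label for p, it in zip(predictions, items))
    return correct / len(items)

# A trivial baseline that always picks the first choice.
preds = [0 for _ in items]
print(accuracy(preds, items))  # 0.5 on this toy set
```

A model's COPA score is simply this accuracy over the test set, which is how the human (100 percent) and BERT (74 percent) figures below are expressed.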

After running the benchmark, SuperGLUE summarizes an AI system's performance across the various NLP tasks in a single score. According to Facebook AI, humans achieve 100 percent accuracy on the COPA test, while Google’s BERT reaches only 74 percent. So there is still room for improvement.

This news article was automatically translated from Dutch to give Techzine.eu a head start. All news articles after September 1, 2019 are written in native English and NOT translated. All our background stories are written in native English as well. For more information read our launch article.