A group of artificial intelligence (AI) research organizations has launched a benchmarking platform that measures natural language processing (NLP) capabilities. The group consists of Facebook AI, New York University, DeepMind and the University of Washington.

The platform – SuperGLUE – builds on an older benchmark called GLUE. According to Facebook AI, the new platform is intended to be a more comprehensive benchmark with human baselines. According to ZDNet, it is designed to measure how well AI systems can understand and interpret language.

SuperGLUE was developed because AI systems had reached a “ceiling” on existing benchmarks: they needed harder challenges to drive further improvement in their NLP capabilities.

SuperGLUE

The SuperGLUE benchmark introduces new tests for a range of difficult NLP tasks. These tasks focus on innovations in several key areas of machine learning, including sample-efficient, transfer, multitask and self-supervised learning.

SuperGLUE uses Google’s BERT as a performance baseline. The benchmark itself consists of eight tasks, including the Choice of Plausible Alternatives (COPA) test, which measures causal reasoning. In this test, the system is given a premise sentence and must determine its cause or effect from two possible choices.
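To make the task format concrete, here is a minimal sketch in Python of what a COPA-style item and its accuracy scoring might look like. The example items and the trivial "always pick the first choice" baseline are hypothetical illustrations, not the actual SuperGLUE data or evaluation code; real systems such as BERT score the two candidate sentences with a trained model.

```python
from dataclasses import dataclass

@dataclass
class CopaItem:
    """A hypothetical COPA-style instance: premise, two candidate
    sentences, whether we ask for the cause or the effect, and the
    index (0 or 1) of the correct choice."""
    premise: str
    choice1: str
    choice2: str
    question: str  # "cause" or "effect"
    label: int     # 0 -> choice1 is correct, 1 -> choice2 is correct

# Two invented toy items for illustration.
items = [
    CopaItem("The man lost his balance on the ladder.",
             "He fell off the ladder.",
             "He climbed up the ladder.",
             "effect", 0),
    CopaItem("The woman was in a bad mood.",
             "She engaged in small talk with her friend.",
             "She told her friend to leave her alone.",
             "effect", 1),
]

def accuracy(predictions, items):
    """Fraction of items where the predicted choice matches the label."""
    correct = sum(p == it.label for p, it in zip(predictions, items))
    return correct / len(items)

# A trivial baseline that always picks the first choice.
preds = [0 for _ in items]
print(accuracy(preds, items))  # 0.5 on this toy set
```

A model's COPA score is simply this accuracy over the test set, which is how the human (100 percent) and BERT (74 percent) figures below are expressed.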

After running the benchmark, SuperGLUE summarizes an AI system's performance across the various NLP tasks in a single score. According to Facebook AI, humans achieve 100 percent accuracy on the COPA test, while Google’s BERT reaches only 74 percent. So there is still room for improvement.

This news article was automatically translated from Dutch to give Techzine.eu a head start. All news articles after September 1, 2019 are written in native English and NOT translated. All our background stories are written in native English as well. For more information read our launch article.