Google has open-sourced BERT, a technique for pre-training natural language processing (NLP) models. It should allow developers to train an NLP model on a single Cloud TPU in about 30 minutes, or on a single GPU within a few hours.

NLP is a branch of Artificial Intelligence (AI) that covers translation, sentiment analysis, semantic search and other language-related tasks. Training NLP models requires large datasets, however, which makes it a challenge for researchers. A popular solution is pre-training: a general language model is first trained on unlabeled text and then refined to perform specific tasks.


Google does this itself with its Bidirectional Encoder Representations from Transformers, also called BERT. This technique has now been made open source and is available on GitHub. The release includes pre-trained language representation models for English and source code built on top of the TensorFlow machine learning framework.

According to Google, BERT is unique because it is bidirectional: it has access to context from both directions, the words that come before a given word and the words that come after it. It can also learn from data that has no classification or label. Conventional NLP models, by contrast, generate a single context-free mathematical representation for each word in their vocabulary.
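To make the term "context-free" concrete, here is a minimal sketch of a lookup-table representation. The vocabulary, the vectors and the function name `embed_context_free` are invented for illustration; they do not come from BERT or from any real model.

```python
# Toy context-free lookup table: one fixed vector per vocabulary word.
# The vectors are made-up numbers for illustration, not real embeddings.
EMBEDDINGS = {
    "bank": [0.2, 0.7, 0.1],
    "river": [0.9, 0.1, 0.3],
    "account": [0.1, 0.8, 0.6],
    "the": [0.0, 0.0, 0.1],
}


def embed_context_free(sentence):
    """Return one vector per known word, ignoring the surrounding words."""
    return [EMBEDDINGS[w] for w in sentence.lower().split() if w in EMBEDDINGS]


river_sentence = embed_context_free("the river bank")
money_sentence = embed_context_free("the bank account")

# "bank" sits at index 2 in the first sentence and index 1 in the second.
same_vector = river_sentence[2] == money_sentence[1]
print(same_vector)  # True: the neighbouring words had no influence
```

In this sketch "bank" receives the same vector in both sentences even though its meaning differs. A bidirectional model such as BERT instead computes the representation of each word from the words on both sides of it.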

BERT learns to model relationships between sentences by being pre-trained on a task that can be generated from any corpus. It builds on Google’s Transformer, an open source neural network architecture based on a self-attention mechanism optimized for NLP.
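The BERT paper calls this sentence-level task next-sentence prediction. The sketch below shows, under simplifying assumptions, how such training pairs could be generated from any ordered list of sentences without manual labels. The function name `make_sentence_pairs`, the 50/50 split and the sample corpus are illustrative choices, not Google's implementation.

```python
import random


def make_sentence_pairs(sentences, rng):
    """Build (first, second, is_next) examples from an ordered list of sentences.

    Roughly half of the pairs keep the true following sentence (label True);
    the rest substitute a randomly chosen other sentence (label False).
    Assumes at least three distinct sentences.
    """
    pairs = []
    for i in range(len(sentences) - 1):
        first, true_next = sentences[i], sentences[i + 1]
        if rng.random() < 0.5:
            pairs.append((first, true_next, True))
        else:
            # Any sentence other than `first` and its real successor.
            candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
            pairs.append((first, rng.choice(candidates), False))
    return pairs


# Hypothetical four-sentence corpus, used only to exercise the function.
corpus = [
    "The man went to the store.",
    "He bought a gallon of milk.",
    "Penguins are flightless birds.",
    "They live mostly in the Southern Hemisphere.",
]

pairs = make_sentence_pairs(corpus, random.Random(0))
for first, second, is_next in pairs:
    print(is_next, "|", first, "->", second)
```

The labels come for free from the order of the text, which is why the task works on any corpus.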

Other tools

Earlier, Google released AdaNet, an open source tool for combining machine learning algorithms to obtain better predictive insights. ActiveQA, a research project investigating the use of reinforcement learning to train AI agents to answer questions, was also made available.

This news article was automatically translated from Dutch.