Google researchers have developed LaserTagger, an open-source model that predicts a sequence of text edits which convert a source text into a target text. Google claims that this approach to text generation is less error-prone than conventional methods, and that the model is also easier to train and faster to run.
The release of LaserTagger follows a number of other Google contributions in the field of language processing and language comprehension. This week, the tech giant showed Meena, a neural network with 2.6 billion parameters that can handle multi-turn dialog. Earlier this month, Google also published a paper describing Reformer, a model that can process entire novels for translation and text generation.
Overlap between input and output
LaserTagger takes advantage of the fact that in many text generation tasks the input and output overlap heavily. For example, when correcting grammatical errors or merging multiple sentences, most of the input text can remain unchanged and only a small fraction of the words needs to be modified. Instead of generating the output word by word, LaserTagger therefore predicts a sequence of edit operations on the input words: ‘keep’ (which copies the original word to the output), ‘delete’ (which removes the word), and ‘keep-addx’ or ‘delete-addx’ (which add a phrase x before the tagged word and, in the case of ‘delete-addx’, also delete the tagged word).
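To make the edit operations concrete, here is a minimal Python sketch of how a predicted tag sequence could be turned back into text. It is an illustration under stated assumptions, not LaserTagger's actual code: the function name `apply_edits`, the tag spelling (`KEEP`, `DELETE`, and `KEEP|phrase` or `DELETE|phrase` for the two ‘addx’ variants) and the example sentence are all chosen here for demonstration.

```python
from typing import List


def apply_edits(tokens: List[str], tags: List[str]) -> str:
    """Rebuild the output text from input tokens and one edit tag per token.

    Assumed (hypothetical) tag format:
      "KEEP"          copy the token to the output
      "DELETE"        drop the token
      "KEEP|phrase"   insert the phrase, then copy the token
      "DELETE|phrase" insert the phrase, then drop the token
    """
    if len(tokens) != len(tags):
        raise ValueError("expected exactly one tag per token")

    output: List[str] = []
    for token, tag in zip(tokens, tags):
        # Split "BASE|phrase" into its two parts; phrase is "" if absent.
        base, _, phrase = tag.partition("|")
        if phrase:
            output.append(phrase)  # added phrase goes before the tagged word
        if base == "KEEP":
            output.append(token)
        elif base != "DELETE":
            raise ValueError(f"unknown tag: {tag!r}")
    return " ".join(output)


# Illustrative sentence-merging example: two sentences become one.
tokens = "Turing was born in 1912 . Turing died in 1954 .".split()
tags = [
    "KEEP", "KEEP", "KEEP", "KEEP", "KEEP",  # Turing was born in 1912
    "DELETE",                                # drop the first full stop
    "DELETE|and",                            # replace second "Turing" with "and"
    "KEEP", "KEEP", "KEEP", "KEEP",          # died in 1954 .
]
fused = apply_edits(tokens, tags)
print(fused)
```

In this example nine of the eleven input words are simply copied, which is the overlap the approach relies on: the model only has to decide where to delete and which short phrase to insert.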
The added phrases come from a relatively limited vocabulary, which is itself the result of an optimisation: the vocabulary is kept as small as possible while maximising the number of training examples whose target text can be produced using only phrases from that vocabulary. Restricting additions in this way prevents the model from inserting arbitrary words, which reduces the problem of ‘hallucination’ (producing output that is not supported by the input text). In addition, LaserTagger predicts the edit operations in parallel and with high accuracy, which speeds up the entire process compared with models that make their predictions sequentially.
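The trade-off between vocabulary size and the number of usable training examples can be sketched in a few lines of Python. The code below uses a simple frequency heuristic (keep the most common phrases) and reports what fraction of examples remain fully reconstructible; this is an assumption made for illustration and not necessarily the optimisation procedure Google used. The function name `select_phrase_vocabulary` and the toy data are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple


def select_phrase_vocabulary(
    needed_phrases: List[Set[str]], max_size: int
) -> Tuple[List[str], float]:
    """Pick up to max_size phrases and measure training-example coverage.

    needed_phrases[i] is the set of phrases that must be added to turn
    the i-th source text into its target text (empty if nothing is added).
    """
    # Count in how many examples each phrase is needed.
    counts = Counter(p for phrases in needed_phrases for p in phrases)
    vocab = [phrase for phrase, _ in counts.most_common(max_size)]

    # An example is usable only if every phrase it needs is in the vocabulary.
    vocab_set = set(vocab)
    covered = sum(1 for phrases in needed_phrases if phrases <= vocab_set)
    coverage = covered / len(needed_phrases) if needed_phrases else 0.0
    return vocab, coverage


# Toy data: five training examples and the phrases each one requires.
examples = [{"and"}, {"and"}, {"but"}, {"and", "however"}, set()]
vocab, coverage = select_phrase_vocabulary(examples, max_size=2)
print(vocab, coverage)
```

Increasing `max_size` raises coverage but gives the model more phrases it could insert, so in practice one would look for the smallest vocabulary that still covers most of the training data.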