Belgian Textgain develops LLM to detect hate speech

In Belgium, Textgain is building the first AI model capable of detecting hate speech online in all official European languages. The finalised model is expected in the summer of 2025.

Textgain develops AI that identifies social problems and helps to solve them. Its latest project is CaLICO, an LLM that will detect online hate speech in all official European languages.

No reuse of known LLMs

The language model will be developed in Belgium over the next 12 months. Textgain is a spin-off of the University of Antwerp.

According to Textgain CEO Guy De Pauw, this affiliation is why the company can develop technologies to address such social problems. “Large language models, certainly the commercial ones, refuse to process toxic language. This makes it virtually impossible to use them to process hate speech, for example. We are now building our own language model from scratch, one that can process this kind of content but not produce it.” This is also why Textgain is developing its own LLM rather than building products on, for example, a model from OpenAI or Google.

Guy De Pauw, CEO of Textgain.

Hate speech as an online problem

According to De Pauw, commercial companies are not interested in fighting hate speech, but there is good reason to fight it smarter and better. Under the Digital Services Act, online platforms in Europe are required to stop the spread of hate speech as quickly as possible. A model such as CaLICO can help these platforms comply with that legislation.