The new programming language aims to make it easier to deploy neural networks.
OpenAI this week released Triton, an open-source programming language for writing highly efficient GPU code for AI workloads.
The company claims that Triton makes it possible to reach peak hardware performance with relatively little effort. In as few as 25 lines, it says, developers can write code on par with what an expert could produce.
Philippe Tillet presented the first version of Triton two years ago in an academic paper. As part of today’s launch, OpenAI released a significantly upgraded edition dubbed Triton 1.0. The new version boasts optimizations that lend themselves to enterprise machine learning projects.
Neural networks becoming central to AI development
Deep neural networks have recently emerged as an important type of AI model. They are capable of achieving state-of-the-art performance across natural language processing, computer vision, and other domains. The strength of these models lies in their hierarchical structure, which generates a large amount of highly parallelizable work well suited to multicore hardware such as GPUs.
Tillet, who now works for OpenAI, detailed the Triton release in a company blog post this week. “Novel research ideas in the field of Deep Learning are generally implemented using a combination of native framework operators,” he writes.
“While convenient, this approach often requires the creation (and/or movement) of many temporary tensors, which can hurt the performance of neural networks at scale. These issues can be mitigated by writing specialized GPU kernels, but doing so can be surprisingly difficult due to the many intricacies of GPU programming,” he adds.
“And, although a variety of systems have recently emerged to make this process easier, we have found them to be either too verbose, lack flexibility or generate code noticeably slower than our hand-tuned baselines.
“This has led us to extend and improve Triton,” Tillet says.
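Tillet's point about temporary tensors can be illustrated with a minimal sketch. The example below is plain Python, not Triton or GPU code, and its function names are hypothetical. It assumes a simple computation, relu(x * w + b), and compares running it one operator at a time, which allocates two intermediate buffers, with a single fused pass that allocates none. A specialized GPU kernel applies the same idea, though at much larger scale and with hardware details this sketch ignores.

```python
# Illustrative sketch only: plain Python lists stand in for GPU tensors.
# The function names below are hypothetical and are not part of Triton.


def scale_shift_relu_unfused(x, w, b):
    """Compute relu(x * w + b) one operator at a time, like chained framework ops."""
    scaled = [xi * w for xi in x]          # temporary buffer 1
    shifted = [si + b for si in scaled]    # temporary buffer 2
    return [max(v, 0.0) for v in shifted]  # final result


def scale_shift_relu_fused(x, w, b):
    """Compute the same result in a single pass with no intermediate buffers."""
    return [max(xi * w + b, 0.0) for xi in x]


x = [-2.0, -0.5, 0.0, 1.5, 3.0]
unfused = scale_shift_relu_unfused(x, w=2.0, b=1.0)
fused = scale_shift_relu_fused(x, w=2.0, b=1.0)
print(unfused == fused)  # both versions produce the same values
```

Both functions return the same values. The difference is how much intermediate data is created and moved along the way, which is the cost Tillet describes.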