Nvidia’s TensorRT 8 is here to boost AI inference

Nvidia is accelerating artificial intelligence with the launch of the next generation of its TensorRT software. On Tuesday, the company released the eighth version of the popular AI software, which is used for high-performance deep learning inference.

TensorRT 8 combines a deep learning optimizer with a runtime that delivers low-latency, high-throughput inference for many AI applications.

In AI, ‘inference’ is the stage at which a model actually produces results. Where training refers to developing the algorithm’s ability to understand datasets, inference is about its ability to act on that information by inferring answers to specific questions.
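The distinction between training and inference can be sketched with a toy model. The example below is only an illustration in plain Python and has nothing to do with the TensorRT API: the `train` and `infer` functions and the sample data are hypothetical. Training fits a straight line to known data, and inference applies the fitted line to a new input.

```python
def train(xs, ys):
    """Training: fit a slope and intercept to known data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def infer(model, x):
    """Inference: use the already-trained model to answer a new question."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: the output is always twice the input.
model = train([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(infer(model, 5.0))  # 10.0
```

Training happens once and is costly. Inference runs every time a user asks a question, which is why software such as TensorRT concentrates on making that step fast.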

Inference needs

AI adoption is on an upward trajectory, and the demand for inference is rising with it. With ever-growing amounts of data, AI needs to work faster, and Nvidia promises that TensorRT 8 can deliver that speed. The company announced in a blog post that inference time with the new version will be half the current average.

That means it can be used to develop high-performance search engines, ad recommendation systems, and chatbots deployed in the cloud or at the network edge.

Some transformer optimizations in TensorRT 8 will, according to Nvidia, deliver record-setting speed for language applications.

Exponential complexity

AI models are growing increasingly complex just as worldwide demand is surging for real-time apps that use AI.

TensorRT 8 is timely because it brings new capabilities, including the ability to run BERT-Large, one of the most widely used transformer-based models, in 1.2 milliseconds.
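As a rough, back-of-the-envelope illustration of what a 1.2 millisecond latency means, the sketch below converts latency into requests per second. It assumes requests are handled strictly one after another, with no batching, parallelism, or network overhead. That is a simplification for illustration, not how Nvidia measured its figure, and the function name `sequential_throughput` is hypothetical.

```python
def sequential_throughput(latency_ms: float) -> float:
    """Requests per second if each request takes latency_ms and runs alone."""
    return 1000.0 / latency_ms


bert_large_latency_ms = 1.2  # figure quoted in the article
print(round(sequential_throughput(bert_large_latency_ms)))  # 833

# Under the same assumption, halving latency doubles sequential throughput.
# The 2.4 ms starting point is made up purely for the arithmetic.
print(sequential_throughput(1.2) / sequential_throughput(2.4))  # 2.0
```

Real deployments usually batch requests and run them concurrently, so actual throughput can differ considerably from this single-stream estimate.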

Nvidia said that TensorRT 8 is now generally available and free to all members of the Nvidia Developer Program. The latest versions of its plug-ins, parsers, and samples are available under an open-source license from the TensorRT GitHub repository.
