
Nvidia is accelerating artificial intelligence with the next generation of its TensorRT software. On Tuesday, the company released the eighth iteration of the popular software, which is used for high-performance deep learning inference.

TensorRT 8 pairs a deep learning optimizer with a runtime that delivers low-latency, high-throughput inference for a wide range of AI applications.
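To make that split concrete, here is a minimal sketch of the typical TensorRT workflow in Python: the builder (optimizer) turns a trained network into an optimized engine, and the runtime then executes it. The model.onnx and model.engine file names are illustrative assumptions, not part of Nvidia's announcement.

```python
# Minimal sketch of TensorRT's build-then-run workflow (file names are hypothetical).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

# Optimizer step: parse an ONNX model and build an optimized engine.
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError("Failed to parse ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # reduced precision for lower latency

# TensorRT 8 returns a serialized engine that can be saved and reused.
serialized_engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(serialized_engine)

# Runtime step: deserialize the engine and create an execution context for inference.
runtime = trt.Runtime(logger)
engine = runtime.deserialize_cuda_engine(serialized_engine)
context = engine.create_execution_context()
```

Because the expensive optimization happens once at build time, the deployed application only pays the cost of the lightweight runtime step.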

In AI, ‘inference’ is the step that produces results. Where training develops a model’s ability to learn from datasets, inference is its ability to act on new data by inferring answers to specific questions.

Inference needs

Adoption of AI is on an upward trajectory, and with it the demand for inference. With ever-growing amounts of data, AI needs to work faster, and TensorRT 8 promises to deliver: Nvidia said in a blog post that inference time with the new release will be half the current average.

That means it can be used to build high-performance search engines, ad recommendation systems, and chatbots deployed in the cloud or at the network edge.

According to Nvidia, new transformer optimizations in TensorRT 8 deliver record-setting speed for language applications.

Exponential complexity

The challenge is that AI models are growing increasingly complex just as worldwide demand surges for real-time applications that use them.

TensorRT 8 is therefore timely: among its new capabilities is the ability to run BERT-Large, one of the most widely used transformer-based models, in 1.2 milliseconds.
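For a rough sense of how such latency figures are measured, the trtexec benchmarking tool that ships with TensorRT can time an ONNX model with reduced precision enabled; the bert_large.onnx file name below is a hypothetical placeholder, and the exact configuration behind Nvidia's 1.2 ms figure is not detailed here.

```python
# Sketch of invoking the trtexec CLI from Python to benchmark a (hypothetical) BERT-Large ONNX model.
import subprocess

subprocess.run(
    [
        "trtexec",
        "--onnx=bert_large.onnx",      # input model (placeholder path)
        "--fp16",                      # enable reduced-precision optimization
        "--saveEngine=bert_large.engine",  # keep the built engine for reuse
    ],
    check=True,
)
# trtexec prints throughput and latency statistics (mean, median, percentiles) on completion.
```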

Nvidia said that TensorRT 8 is now generally available and free to all members of the Nvidia Developer Program. New versions of its plug-ins, parsers, and samples are available under an open-source license via the TensorRT GitHub repository.