Red Hat launches AI Enterprise for hybrid AI deployments

Red Hat introduces Red Hat AI Enterprise, an integrated platform for deploying and managing models, agents, and applications in the hybrid cloud. At the same time, version 3.3 of Red Hat AI is now available, with expanded model support and improved hardware integration. The complete offering is designed to help organizations move from fragmented experiments to operational AI.

The new platform aims to bridge the gap between infrastructure and innovation. Red Hat AI Enterprise combines tuning and agentic capabilities with the foundation of Red Hat Enterprise Linux and Red Hat OpenShift. “We are providing the complete stack – from the GPU-accelerated hardware to the models and agents that drive business logic,” says Joe Fernandes, vice president and general manager of Red Hat's AI Business Unit.

The enterprise AI sector is rapidly evolving from simple chat interfaces to autonomous agentic workflows. But according to Red Hat, many organizations remain stuck in the pilot phase due to fragmented tools and inconsistent infrastructure. The company has systematically expanded its AI offerings over the past few months to address these obstacles.

From inferencing to model catalog

Red Hat AI Enterprise integrates high-performance AI inferencing, model tuning, and agent deployment capabilities. The platform uses the vLLM inference engine and the llm-d distributed inference framework for optimized generative AI implementations. It runs on Red Hat OpenShift, which should ensure scalability and consistency across hybrid hardware environments.

Red Hat is collaborating with Nvidia on the new platform. Together, they developed Red Hat AI Factory with Nvidia, which combines Red Hat AI Enterprise with Nvidia AI Enterprise. This should accelerate enterprise AI development and make it scalable. The platform offers flexibility for “any model, any hardware, any environment.”

Version 3.3 expands model ecosystem

Version 3.3 of the Red Hat AI portfolio brings significant updates. The model ecosystem is growing with validated, production-ready compressed versions of Mistral-Large-3, Nemotron-Nano, and Apertus-8B-Instruct via the OpenShift AI Catalog. In addition, users can deploy the Ministral 3 and DeepSeek-V3.2 models with sparse attention.

There are multimodal improvements as well: 3x faster Whisper processing, geospatial support, and improved EAGLE speculative decoding for agentic workflows. A technology preview of Models-as-a-Service (MaaS) gives IT teams self-service access to privately hosted models via an API gateway. This centralizes AI access internally and lays the groundwork for private, scalable AI adoption within enterprise environments.
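To illustrate the idea behind speculative decoding (of which EAGLE is an advanced variant), here is a minimal, greedy toy sketch in Python. It is not Red Hat's or EAGLE's implementation; `target_next` and `draft_next` are hypothetical stand-ins for a large target model and a cheap draft model, each returning the next token for a given context.

```python
def greedy_decode(target_next, prompt, max_new):
    """Baseline: the target model decodes one token at a time."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target_next(out))
    return out[len(prompt):]

def speculative_decode(target_next, draft_next, prompt, max_new, k=4):
    """A cheap draft model proposes k tokens; the target verifies them,
    keeps the longest prefix it agrees with, and corrects the first
    disagreement with its own token."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # 1. Draft proposes k tokens autoregressively (cheap).
        draft, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2. Target verifies the proposals (in practice, one batched pass).
        accepted = 0
        for i, t in enumerate(draft):
            if target_next(out + draft[:i]) == t:
                accepted += 1
            else:
                break
        out += draft[:accepted]
        # 3. On disagreement, the target supplies the correct token.
        if accepted < k:
            out.append(target_next(out))
    return out[len(prompt):][:max_new]
```

The key property: the output is identical to the target model decoding alone; draft quality only affects how many tokens are verified per target pass, i.e., speed, never the result.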

On the hardware side, support for generative AI on CPUs is coming, starting with Intel processors for cost-effective small-language-model inference. Certifications for Nvidia Blackwell Ultra and AMD MI325X accelerators have also been added. Red Hat stresses that this freedom of choice among hardware platforms matters for organizations still developing their AI strategy.

From data to production

The new Red Hat AI Python Index plays a central role. This trusted repository provides hardened, enterprise-grade versions of tools such as Docling, SDG Hub, and Training Hub. Teams can use it to move from fragmented experiments to security-focused production pipelines, according to the company.

This is complemented by extensive observability and safety features. Real-time telemetry provides insight into the health, performance, and behavior of models, covering llm-d deployments as well as MaaS cluster and model usage. A technology preview of integrated NeMo Guardrails enables developers to enforce operational safety and alignment.

Finally, organizations gain on-demand access to GPU resources through intelligent orchestration and pooled hardware access. Automatic checkpointing saves the state of long-running training jobs, preventing work loss and making compute costs more predictable in dynamic environments.
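The checkpoint/resume pattern behind that claim can be sketched in a few lines of Python. This is a generic illustration, not Red Hat's mechanism; `run_job` and the JSON checkpoint format are hypothetical, and the loop body stands in for real training work.

```python
import json
import os

def run_job(total_steps, ckpt_path, every=10):
    """Simulated long-running job with automatic checkpointing.
    If interrupted and restarted, it resumes from the last checkpoint
    instead of redoing completed work."""
    step, metric = 0, 0
    # Resume from an existing checkpoint, if any.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
        step, metric = state["step"], state["metric"]
    while step < total_steps:
        metric += step  # stand-in for one unit of training work
        step += 1
        if step % every == 0 or step == total_steps:
            # Write atomically: temp file + rename, so a crash mid-write
            # never leaves a corrupt checkpoint behind.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"step": step, "metric": metric}, f)
            os.replace(tmp, ckpt_path)
    return metric
```

A restart simply calls `run_job` again with the same checkpoint path: all steps completed before the interruption are skipped, which is what keeps compute costs predictable when jobs can be preempted.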