inference Archives

Nvidia is working on a chip for AI inferencing with Groq technology

In addition to GPUs that handle the lion's share of AI training, Nvidia wants to introduce a chip for running...

Erik van Klinken March 2, 2026

OpenAI releases GPT-5.3-Codex-Spark, a smaller AI coding model that generates over 1,000 tokens per second ...

Berry Zwets February 13, 2026

OpenAI is dissatisfied with the speed of Nvidia's AI chips for inference tasks and has been looking for alter...

Berry Zwets February 3, 2026

US AI startup Baseten has raised $300 million in growth capital at a valuation of $5 billion. The investment ...

Mels Dees January 21, 2026

During CES, Nvidia unveiled the Rubin platform, a new generation of AI infrastructure comprising six chips. T...

Berry Zwets January 6, 2026

During Google Cloud Next in Las Vegas, Google unveiled its latest Tensor Processing Unit (TPU): Ironwood. Thi...

Coen van Eenbergen April 9, 2025

Rapt AI and AMD have announced a strategic partnership to optimize AI workloads on AMD Instinct GPUs. This al...

Sander Almekinders March 28, 2025

Nvidia is in the process of acquiring the young cloud startup Lepton AI. This would be the second acquisition in a short t...

Floris Hulshoff Pol March 28, 2025