Nvidia Took Groq's Founder and President. Groq Just Raised $650M

After Nvidia paid $20B for a license and poached Groq's founder and president, Groq rebounded: $650M raised, new leadership, 5M+ devs, trillions of tokens weekly.

Published 2026-06-22 · 4 min read

Evgenii Arsentev · PhD

Groq confirmed a $650 million funding round on June 22, and the context matters as much as the number. In December 2025, Nvidia paid roughly $20 billion for a non-exclusive license to Groq's inference technology — and in the same move recruited Groq's founder Jonathan Ross (formerly of Google) and president Sunny Madra. Nvidia then used that license to launch its own inference product, the Nvidia Groq 3 LPX, at its March 2026 GTC conference. Six months later, Groq has raised $650 million, hired a new executive team, and is growing.

The new leadership includes Alan Rice as COO (previously at xAI and Meta), Sinclair Schuller as CTO (co-founder of Nuvalence), and Rakesh Malhotra as CPO (a Microsoft cloud veteran). The company is leaning into what it calls its neocloud business: a cloud inference service running 13 data centers across North America, Europe, the Middle East and Asia-Pacific. The numbers: more than 5 million developers, thousands of AI companies, and trillions of tokens processed each week.

What makes Groq different from other AI cloud providers

Groq's Language Processing Units (LPUs) were designed from scratch for one purpose: running a trained AI model as fast as the hardware allows. Standard cloud AI runs inference on GPUs originally built for training — powerful, but not purpose-built for the serve path. LPUs flip that. The result is that Groq's API has historically returned responses measurably faster than GPU-based alternatives. For most casual use cases the difference is minor. For latency-sensitive tasks — voice assistants, real-time analysis, AI agents that need to call a model mid-task without making a user wait — the gap is real and noticeable.

What the raise signals

The clearest takeaway is that Groq's cloud inference business is working on its own. The company lost its founder and a key executive to the biggest chip company in the world, watched Nvidia build a competing product using its own technology, and still grew its developer base and weekly token volume to where investors put in $650 million. That's not a startup propped up by hype — it's a provider with real usage.

The new leadership team's background — Meta, xAI, Microsoft cloud — points toward a push for broader enterprise access rather than the original hardware sales model. For developers and builders, that means Groq's API is likely to stay available and expand, not wind down.

What I'd actually do

If you've never tested Groq's API for speed-sensitive work, now is a reasonable time to try. The API is compatible with the OpenAI format in most cases, so switching usually takes just a few lines. Run the same prompt through your current provider and through Groq and actually measure the latency difference. For voice tools, real-time pipelines, or anything where a model has to respond mid-task, that difference often matters more than which model you use.
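A minimal sketch of that side-by-side measurement, in Python. The two provider functions here are hypothetical stand-ins that only sleep; they are not real API calls, and you would replace each body with a request through whatever client you already use. The names `measure_latency`, `summarize`, `current_provider`, and `groq_provider` are my own for illustration.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(call: Callable[[str], str], prompt: str, runs: int = 5) -> List[float]:
    """Return wall-clock seconds for each of `runs` calls to `call(prompt)`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call(prompt)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Median and worst-case latency, converted to milliseconds."""
    return {
        "median_ms": statistics.median(timings) * 1000,
        "max_ms": max(timings) * 1000,
    }


# Hypothetical stand-ins (assumption): each one just sleeps to simulate a
# response. Swap the body for a real request to the provider you are testing.
def current_provider(prompt: str) -> str:
    time.sleep(0.05)
    return "stub response"


def groq_provider(prompt: str) -> str:
    time.sleep(0.01)
    return "stub response"


if __name__ == "__main__":
    prompt = "Summarize this sentence in five words."
    for name, call in [("current", current_provider), ("groq", groq_provider)]:
        stats = summarize(measure_latency(call, prompt))
        print(f"{name}: median {stats['median_ms']:.0f} ms, max {stats['max_ms']:.0f} ms")
```

Use the same prompt, the same model size where possible, and several runs per provider, since a single request can be skewed by network jitter or a cold start. The median is usually the fairer number to compare; the max shows how bad the slow requests get.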

#Groq #Nvidia #AI chips #inference

Author

Evgenii Arsentev

PhD · Chief Product Officer at a tech company


Source: techcrunch.com