Nvidia's Robots Train Themselves With Coding Agents

Nvidia, CMU and UC Berkeley built ENPIRE, where AI coding agents teach robots to grasp objects almost unsupervised — a fleet of 8 hit up to 99% success on hard tasks.

Published 2026-06-17 · 5 min read

Evgenii Arsentev · PhD

Researchers at Nvidia, Carnegie Mellon University and UC Berkeley built a system called ENPIRE in which robots teach themselves to grasp and manipulate objects with almost no human supervision — and a coordinated fleet of eight reached up to 99% success on demanding tasks. The twist that makes it interesting: the thing doing the teaching isn't a human engineer tuning parameters, it's an AI coding agent that writes and rewrites the robots' training code on its own.

It works in two phases. First, an agent sets up the workspace — safety boundaries, automatic reset between attempts, and automated success checking. Normally a human has to sit there and judge whether each attempt worked; instead, the agent writes its own reward function to tell success from failure, learning what 'good' looks like from only minutes of example video. Second, the agent reads research papers, forms hypotheses, and edits the training code directly, choosing between behavior cloning and reinforcement learning based on what's actually happening on the hardware.
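To make the first phase concrete, here is a minimal Python sketch of what a supervised trial loop with a safety boundary, an automatic reset and an automated success check might look like. Everything in it is an assumption for illustration: the names `Workspace`, `is_success`, `run_episode` and `halfway_policy`, the box dimensions and the 1 cm tolerance are hypothetical, and the hand-written distance check merely stands in for the reward function that, per the article, the agent writes itself from example video.

```python
from dataclasses import dataclass


@dataclass
class Workspace:
    # Hypothetical axis-aligned safety box in meters (not from the paper).
    lo: tuple = (-0.3, -0.3, 0.0)
    hi: tuple = (0.3, 0.3, 0.4)

    def is_safe(self, pos):
        # A position is safe if every coordinate lies inside the box.
        return all(l <= p <= h for l, p, h in zip(self.lo, pos, self.hi))


def is_success(obj_pos, goal_pos, tol=0.01):
    # Hand-written stand-in for the agent-written success check:
    # success means the object is within `tol` meters of the goal.
    dist = sum((a - b) ** 2 for a, b in zip(obj_pos, goal_pos)) ** 0.5
    return dist <= tol


def run_episode(policy, ws, start, goal, max_steps=50):
    pos = start  # automatic reset: every attempt begins from `start`
    for _ in range(max_steps):
        nxt = policy(pos, goal)
        if not ws.is_safe(nxt):
            return "unsafe"  # abort before leaving the safety boundary
        pos = nxt
        if is_success(pos, goal):
            return "success"
    return "timeout"


def success_rate(policy, ws, start, goal, episodes=10):
    # Fraction of attempts that the automated check labels a success.
    outcomes = [run_episode(policy, ws, start, goal) for _ in range(episodes)]
    return outcomes.count("success") / episodes


def halfway_policy(pos, goal):
    # Toy policy: move half of the remaining distance toward the goal.
    return tuple(p + 0.5 * (g - p) for p, g in zip(pos, goal))


ws = Workspace()
start, goal = (0.0, 0.0, 0.1), (0.2, 0.1, 0.1)
print(run_episode(halfway_policy, ws, start, goal))
print(success_rate(halfway_policy, ws, start, goal))
```

The point of the sketch is that once reset, safety and success labeling are all automated, nobody has to watch each attempt, and a success rate becomes a number that later code changes can be judged against.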

Eight robots that share notes over Git

The setup uses eight dual-arm YAM robot stations that coordinate through Git version control, the same tool software teams use to manage code. Each station tests hypotheses simultaneously and shares successful training recipes across the fleet, so a trick discovered by one robot propagates to the others.

The numbers back up the approach. On the 'Push-T' task (sliding a T-shaped block into position), eight agents cut completion time from roughly five hours to two. On pin insertion, the system converged more than twice as fast as human-in-the-loop methods, dropping from over 90 minutes to about 40. It also handled cable-tie cutting and inserting a GPU into a motherboard slot: fiddly, real-world tasks, not just simulator demos.
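The fleet-sharing idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ENPIRE's actual mechanism: `SharedRepo` is an in-memory stand-in for the shared Git repository, `Recipe` and `sync` are hypothetical names, and the station names, methods, learning rates and success rates are made-up values rather than results from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Recipe:
    station: str          # which station produced this recipe
    method: str           # e.g. "behavior_cloning" or "reinforcement_learning"
    learning_rate: float  # one example hyperparameter
    success_rate: float   # measured on that station's hardware


@dataclass
class SharedRepo:
    """In-memory stand-in for the Git repository the stations share."""
    log: list = field(default_factory=list)

    def commit(self, recipe):
        # Publish a recipe so that other stations can see it.
        self.log.append(recipe)

    def best(self):
        # Highest-scoring recipe published so far, or None if the log is empty.
        return max(self.log, key=lambda r: r.success_rate, default=None)


def sync(local, repo):
    # Adopt the fleet's best recipe only if it beats the local one.
    top = repo.best()
    if top is not None and top.success_rate > local.success_rate:
        return top
    return local


repo = SharedRepo()
# Illustrative values only.
a = Recipe("station-1", "behavior_cloning", 1e-4, 0.62)
b = Recipe("station-2", "reinforcement_learning", 3e-4, 0.81)
repo.commit(a)
repo.commit(b)

print(sync(a, repo).station)  # station-1 picks up station-2's recipe
print(sync(b, repo).station)  # station-2 keeps its own
```

In a real setup the commit and pull steps would be ordinary Git operations on a shared remote; the design choice the sketch highlights is that each station only switches when another station's measured result is strictly better than its own.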

Why this matters

We've watched coding agents quietly take over more of software work this past year. ENPIRE is an early sign of the same agents reaching into the physical world — writing the code that teaches machines to do hands-on tasks. The honest caveats are right there in the paper, which I appreciate: real-world performance still trails simulation badly because of unpredictable robot dynamics and friction, the robots sit idle a lot while the agent writes code and summarizes results, and token costs scale faster than the performance gains as the fleet grows. So this is a research result, not a product. But the loop it demonstrates — an AI reading papers, hypothesizing, coding, testing on real hardware, and sharing what worked — is exactly the kind of self-improving setup that gets cheaper and faster every year. Robotics has always been bottlenecked on collecting real-world training data; handing that grind to tireless agents is a plausible way past it.

What I'd actually do

Don't read this as 'robots are about to take over the warehouse' — the gap between 99% in a lab and reliable in the wild is enormous. What I'd actually track is the pattern, not the robots: an AI agent that reads the literature, writes its own success criteria, and improves its own code without a human babysitting each step. That loop is the real story, and it's going to show up far beyond robotics. If you build anything, watch for where you can let an agent close its own feedback loop — that's where the compounding gains live.

#ai #nvidia #robotics #agents #research

Author

Evgenii Arsentev

PhD · Chief Product Officer at a tech company

Source: the-decoder.com