Gemini's New Default API: Agents, Background Tasks, Media

Google's Interactions API is now Gemini's default. All new agent features — Linux sandboxes, background execution, media generation — ship here only.

4 min readEAEvgenii ArsentevEvgenii Arsentev · PhD

Google DeepMind has made the Interactions API the default interface for Gemini models and agents, moving it from beta — where it first appeared in December 2025 — to general availability in June 2026. It is now the default in Google AI Studio and across Google's developer documentation. The previous standard, the generateContent API, continues to work for existing code and integrations. But Google has drawn a clear line: all new agent capabilities going forward will ship exclusively through the Interactions API. If you are building anything that uses Gemini's newer features, staying on the old API means staying frozen.

The structural difference between the two APIs comes down to how conversations and actions are organized. The generateContent approach used a role-based structure: messages were assigned to 'user' or 'model,' and the back-and-forth was built around those two buckets. The Interactions API uses typed steps instead — every action, whether it is user input, a tool call, a function result, or a model response, is its own discrete step with a defined type. For straightforward chat, the difference is minor. For agent workflows with multiple tools running in sequence, the typed step structure makes the logic significantly easier to describe and extend without tangling.

What the new API unlocks

The capabilities gated behind the Interactions API are not incremental. Managed Agents with Linux sandbox environments give agents access to a real shell — not a simulated tool call, but actual command execution in an isolated environment. Background execution lets long-running tasks keep working after the API call returns: you kick off a job, the agent continues for hours, and you pick up the result when it is done. Tool chaining with Google Search and Maps is built in as native steps rather than an external integration you wire yourself. And media generation — images, music, and speech — is available as a native output type within the same response schema, rather than a separate service call.

There is also a pricing dimension worth knowing. The Interactions API introduces two operating modes. Flex mode runs at 50 percent of the standard compute cost but on lower-priority infrastructure — suitable for background jobs and batch work where latency is not critical. Priority mode runs at full price with maximum throughput and speed. For any agent work that runs unattended overnight, Flex mode could cut compute costs in half with no change in output quality. For real-time agent interactions where a slow response breaks the experience, Priority mode is the intended path.

What I'd actually do

If you are building on Gemini today, the migration from generateContent to the Interactions API is worth doing now. Google published a migration guide in the AI Studio documentation. The two features that matter most for practical builder use — background execution for tasks that run without you babysitting them, and managed agents with real shell access — simply do not exist in the old API. If those are on your roadmap, the new API is the only path. And check the Flex versus Priority pricing split before you deploy: for anything running in batch, the cost difference is real.

#google#gemini#api#developers#agents

Related guides

EAEvgenii Arsentev

Author

Evgenii Arsentev

PhD · Chief Product Officer at a tech company

Want to actually build this?

Guides explain. The free course transforms — personalized, gamified, and built to get you shipping fast.

◉ Start the free course

Source: the-decoder.com