Gemini 3.5 Flash Gets Computer Use Built In

Google added computer use to Gemini 3.5 Flash: the AI can now see your screen and control browsers, desktop apps, and mobile interfaces natively through the API.

5 min read · Evgenii Arsentev, PhD

Gemini 3.5 Flash, Google's fast, efficient model that developers already use for text, vision, and function-calling tasks, can now control computers. It sees what's on your screen and takes action: navigating browsers, clicking through mobile UIs, operating desktop software. Until now, this capability lived in a separate Gemini 2.5 model. It's now built directly into Flash, so you don't have to switch models to get it.

The real test of any computer-use AI is whether it can handle multi-step tasks that unfold over time, not just single clicks. Google's release shows Gemini 3.5 Flash doing exactly that — running long, continuous workflows across multiple interfaces: opening a browser, filling a form, switching to a spreadsheet, checking a result, looping back. Enterprise automation tasks that currently require dedicated robotic process automation tools now have a direct AI path.
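A multi-step workflow like this usually comes down to an observe-act loop: capture the screen, ask the model for the next action, execute it, repeat until the model says it is done or a step budget runs out. The sketch below is only an illustration of that loop under stated assumptions. It does not call the Gemini API; `FakeEnvironment`, `Action`, `run_agent`, and `scripted_model` are hypothetical names, and the scripted function stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str         # hypothetical action kinds: "click", "type", "done"
    target: str = ""  # free-text description of what the action touches


@dataclass
class FakeEnvironment:
    """Stand-in for a browser or desktop: records actions instead of running them."""
    log: List[Action] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real environment would return image bytes; a string is enough here.
        return f"screen after {len(self.log)} actions"

    def execute(self, action: Action) -> None:
        self.log.append(action)


def run_agent(
    goal: str,
    env: FakeEnvironment,
    propose_action: Callable[[str, str, List[Action]], Action],
    max_steps: int = 10,
) -> bool:
    """Observe, ask for the next action, execute; stop on "done" or at the step budget."""
    for _ in range(max_steps):
        observation = env.screenshot()
        action = propose_action(goal, observation, env.log)
        if action.kind == "done":
            return True
        env.execute(action)
    return False  # budget exhausted before the task finished


def scripted_model(goal: str, observation: str, history: List[Action]) -> Action:
    """Placeholder for the model: replays a fixed plan based on how many steps ran."""
    plan = [
        Action("click", "open browser"),
        Action("type", "fill form"),
        Action("click", "open spreadsheet"),
        Action("done"),
    ]
    return plan[len(history)]


if __name__ == "__main__":
    env = FakeEnvironment()
    finished = run_agent("file the report", env, scripted_model)
    print(finished, [a.target for a in env.log])
```

The step budget is the important design choice: a long-running agent needs a hard stop so a confused loop cannot run indefinitely.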

Already running in production

Three companies — Browserbase, Browser Use, and UiPath — have adopted the technology for real applications. Use cases include continuous software testing (clicking through your app the way a user would, automatically), knowledge work automation (reading live web pages, not cached ones), and document-heavy enterprise workflows. This isn't a capability sitting in a research preview; it's shipping into products that people pay for.

How Google handles the safety problem

A computer-controlling AI creates an obvious risk: a webpage or document could inject instructions and redirect the agent to do something the user never intended. Google's defenses work at two levels. The model was trained adversarially to recognize and resist injection attempts — and when it detects one, it stops the task automatically rather than continuing. For sensitive actions like submitting forms or sending messages, the system can require explicit user confirmation before proceeding.

Google also recommends that developers layer additional safeguards on top: run the agent in a sandboxed environment with limited access to the real system, keep a human in the loop for consequential steps, and restrict what the agent is allowed to do via strict permissions. These aren't built in automatically — they're on the developer to implement — but the documentation makes clear they're expected parts of a production setup.
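Two of those safeguards, strict permissions and a human in the loop, can be combined in a small gate that every proposed action passes through before it is executed. This is a minimal sketch, not Google's implementation or any part of the Gemini API: `Policy`, `gate`, `POLICY`, and the action names are hypothetical, and sandboxing is assumed to be handled separately by the environment.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Policy:
    allowed: FrozenSet[str]             # action kinds the agent may perform at all
    needs_confirmation: FrozenSet[str]  # allowed kinds that a human must approve


def gate(action_kind: str, policy: Policy, confirm: Callable[[str], bool]) -> str:
    """Return "blocked", "declined", or "allowed" for a proposed action."""
    if action_kind not in policy.allowed:
        return "blocked"   # outside the permission set: never executed
    if action_kind in policy.needs_confirmation and not confirm(action_kind):
        return "declined"  # a human said no to a consequential step
    return "allowed"


# Hypothetical policy: read-style actions run freely, form submission needs approval.
POLICY = Policy(
    allowed=frozenset({"click", "scroll", "type", "submit_form"}),
    needs_confirmation=frozenset({"submit_form"}),
)

if __name__ == "__main__":
    ask_user = lambda kind: input(f"Allow '{kind}'? [y/N] ").strip().lower() == "y"
    for kind in ("click", "submit_form", "delete_file"):
        print(kind, "->", gate(kind, POLICY, ask_user))
```

Keeping the allowlist as a default-deny set means any action kind you did not anticipate is blocked rather than silently permitted.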

Why this matters for builders

Computer use has been the 'last mile' gap in AI workflows. You could generate text, analyze data, and write code, but getting an AI to actually open a browser and do something with the result required stitching in separate automation tools. Closing that gap inside a model you might already be using is a practical win: a customer support agent can pull up account details in your CRM on its own, a research tool can read live pages, and a QA system can interact with your UI the way a real user would.

Access is available through the Gemini API and the Gemini Enterprise Agent Platform. There's also a demo environment hosted by Browserbase for anyone who wants to try it before integrating.

What I'd actually do

If there's a step in any workflow you're building that currently needs a person to 'just click this,' now is a good time to prototype replacing it with Gemini computer use. Start small: isolate one specific sub-task, run the agent in a sandboxed browser with defined permissions, and keep confirmation prompts on for anything that writes data. The capability is real — the boundaries you set around it are what determine whether it's useful or chaotic.

#gemini #google #computer-use #agents #ai-tools

Author

Evgenii Arsentev

PhD · Chief Product Officer at a tech company

Source: deepmind.google