Sakana's Marlin Agent Runs 8 Hours to Write Reports

Sakana AI launched Marlin, an agent that works unattended for up to eight hours to produce 60–100-page research reports with cited sources and slides.

5 min read · Evgenii Arsentev · PhD

Sakana AI has launched Marlin, an enterprise research agent that works unattended for up to about eight hours per task and returns reports of 60 to 100 pages backed by 60 to 80 cited sources, plus presentation slides generated by image AI. The Tokyo lab pitches it as a kind of virtual chief strategy officer: in a single run it fires off hundreds to thousands of model queries while it plans hypotheses, browses sources, and checks its own findings. The pitch is that this compresses what would normally be weeks of strategy work into one long session.

Marlin went on sale in June 2026 after a closed beta in April with roughly 300 professionals. Pricing is metered in credits: one run costs 100 credits at a pay-as-you-go rate of ¥98 per credit, or ¥9,800 per run. There is also a Pro tier at ¥150,000 a month (2,000 credits), a Team tier at ¥400,000 a month (6,000 credits), and custom enterprise pricing on top.
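To make the credit pricing concrete, here is a minimal Python sketch of the arithmetic. The prices and credit counts are the ones quoted above. Everything else is an assumption for illustration: that every run costs exactly 100 credits, that tier credits are spent within the month, and the names `Plan`, `payg_cost` and `break_even_runs`, which are not part of any Sakana API.

```python
from dataclasses import dataclass

CREDITS_PER_RUN = 100      # from the article: one run costs 100 credits
PAYG_YEN_PER_CREDIT = 98   # from the article: pay-as-you-go rate


@dataclass(frozen=True)
class Plan:
    """A monthly tier (hypothetical helper type, not a Sakana API object)."""
    name: str
    monthly_yen: int
    monthly_credits: int

    @property
    def runs_per_month(self) -> int:
        # Whole runs covered by the included credits.
        return self.monthly_credits // CREDITS_PER_RUN

    @property
    def yen_per_run(self) -> float:
        # Effective price per run if every included run is used.
        return self.monthly_yen / self.runs_per_month


PLANS = [Plan("Pro", 150_000, 2_000), Plan("Team", 400_000, 6_000)]


def payg_cost(runs: int) -> int:
    """Pay-as-you-go cost in yen for a given number of runs."""
    return runs * CREDITS_PER_RUN * PAYG_YEN_PER_CREDIT


def break_even_runs(plan: Plan) -> int:
    """Smallest monthly run count at which pay-as-you-go costs at least the plan fee."""
    per_run = payg_cost(1)
    return -(-plan.monthly_yen // per_run)  # ceiling division


if __name__ == "__main__":
    print(f"Pay-as-you-go: ¥{payg_cost(1):,} per run")
    for plan in PLANS:
        print(
            f"{plan.name}: {plan.runs_per_month} runs/month, "
            f"¥{plan.yen_per_run:,.0f} per run, "
            f"break-even at {break_even_runs(plan)} runs"
        )
```

Under these assumptions, Pro covers 20 runs a month at ¥7,500 each and beats pay-as-you-go from the 16th run; Team covers 60 runs at roughly ¥6,667 each and beats pay-as-you-go from the 41st.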

Why it can run for hours, not seconds

The engine is a search method called AB-MCTS — Adaptive Branching Monte Carlo Tree Search. Instead of answering once, it treats reasoning as a tree: at each step it decides whether to go wider by generating a fresh candidate answer, or deeper by refining a promising one it already has. A multi-model variant routes different steps to different LLMs — Sakana cites o4-mini, Gemini 2.5 Pro and DeepSeek-R1 — and reports solving about 27.5% of the hard ARC-AGI-2 benchmark versus 23% for o4-mini alone. The search algorithm itself is open source as TreeQuest under the Apache 2.0 license, so you don't need Marlin to use the idea.
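The wider-or-deeper decision can be illustrated with a toy Python sketch. This is not Sakana's implementation and does not use the TreeQuest API. It is a simplified stand-in under stated assumptions: the "task" is guessing a hidden number, `generate` replaces an LLM call, `score` replaces an answer evaluator, and the choice between arms is made by Thompson sampling over simple Beta posteriors. All names here are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

TARGET = 0.73  # hidden optimum of the toy task (hypothetical)


def score(answer: float) -> float:
    """Toy evaluator: 1.0 at the target, falling off linearly, never below 0."""
    return max(0.0, 1.0 - abs(answer - TARGET))


def generate(parent_answer: Optional[float], rng: random.Random) -> float:
    """Stand-in for an LLM call: a fresh guess at the root, a small refinement otherwise."""
    if parent_answer is None:
        return rng.random()
    return min(1.0, max(0.0, parent_answer + rng.gauss(0.0, 0.05)))


@dataclass
class Node:
    answer: Optional[float]  # None only for the root
    score: float = 0.0
    children: List["Node"] = field(default_factory=list)
    gen_obs: List[float] = field(default_factory=list)          # scores of children generated here
    child_obs: List[List[float]] = field(default_factory=list)  # scores seen in each child's subtree


def thompson(obs: List[float], rng: random.Random) -> float:
    """Draw from a Beta posterior (uniform prior), treating scores as fractional successes."""
    a = 1.0 + sum(obs)
    b = 1.0 + sum(1.0 - s for s in obs)
    return rng.betavariate(a, b)


def step(root: Node, rng: random.Random) -> Node:
    """One search step: walk down the tree, then add exactly one new candidate."""
    path: List[Tuple[Node, int]] = []
    node = root
    while True:
        # Arm 0 is "generate a new child here" (wider); arm i+1 is "descend into child i" (deeper).
        draws = [thompson(node.gen_obs, rng)] + [thompson(o, rng) for o in node.child_obs]
        best = max(range(len(draws)), key=draws.__getitem__)
        if best == 0:
            break
        path.append((node, best - 1))
        node = node.children[best - 1]
    answer = generate(node.answer, rng)
    child = Node(answer=answer, score=score(answer))
    node.children.append(child)
    node.child_obs.append([child.score])
    node.gen_obs.append(child.score)
    # Back-propagate the new score to every "descend" arm chosen on the way down.
    for parent, idx in path:
        parent.child_obs[idx].append(child.score)
    return child


def search(budget: int, seed: int = 0) -> Node:
    """Run `budget` steps and return the best-scoring candidate found."""
    rng = random.Random(seed)
    root = Node(answer=None)
    best = step(root, rng)
    for _ in range(budget - 1):
        new = step(root, rng)
        if new.score > best.score:
            best = new
    return best


if __name__ == "__main__":
    result = search(budget=200)
    print(f"best answer {result.answer:.3f}, score {result.score:.3f}")
```

Each step spends one generation call, so the budget plays the role of the hundreds to thousands of model queries in a single run. In this sketch, a multi-model variant would add a second choice at generation time: which model to call.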

Why this matters to you

Most AI tools you've used answer in seconds and expect you to steer every turn. Marlin is part of a different category that's arriving fast: agents you hand a goal and walk away from for hours. That changes the unit of work from a chat message to a deliverable, and it changes the bottleneck from 'can the model write?' to 'can you trust eighty pages you didn't watch it produce?' The honest answer today is: not blindly. We've already seen a major firm pull an AI-written report after the facts turned out to be invented.

My take after living with long-running agents: the eight-hour run is real and genuinely useful for a first draft, but the value lands only if you treat the output as raw material. The 60–80 cited sources are the gift here — they're the part you can actually verify. Skip that step and you've just automated the production of confident, well-formatted mistakes.

What I'd actually do

Use a long-running agent for the legwork, not the verdict. Before anyone acts on a Marlin-style report, click through the cited sources for every claim you'd stake a decision on, and never forward a document you haven't read end to end just because it looks finished.



Author: Evgenii Arsentev, PhD · Chief Product Officer at a tech company


Source: marktechpost.com