Claude Code Fast Mode: What /fast Really Costs
Claude Code fast mode makes Opus up to 2.5x faster — for a price. How the /fast command works, what it costs, and when the speed is worth paying for.
Fast mode is a switch in Claude Code that makes the Opus model respond up to 2.5 times faster — same model, same quality, higher price per token. You toggle it with the /fast command when latency matters more than cost, and toggle it off when cost matters more. That's the whole feature; the interesting part is when it's worth it.
A heads-up before we start: fast mode is a research preview, which the docs translate as 'the feature, pricing, and availability may change'. Everything below is accurate as of June 2026.
What is fast mode — a different, dumber model?
No — and this is the most common misconception. Fast mode is not a different model. It's the same Claude Opus running on an API configuration that prioritizes speed over cost efficiency. You get identical quality and capabilities, just faster. The analogy I use: same chef, same dish, but you paid for the express kitchen lane. It works on Opus 4.8, 4.7 and 4.6 — not on Sonnet or Haiku.
How do you turn fast mode on?
One command. If you're on a different model when you enable it, Claude Code automatically switches you to Opus. While it's active, a small lightning icon (↯) sits next to your prompt so you always know the meter is running.
/fast
Type /fast and press Tab to toggle on or off. Run it again anytime to check the current state. You can also set fastMode to true in your settings file to have it persist. Note: fast mode lives in the CLI — the VS Code extension doesn't support it.
you ▸ /fast claude ▸ Fast mode ON ↯ ready — Opus responses now up to 2.5x faster
How much does fast mode cost?
Per million tokens, fast mode runs $10 input / $50 output on Opus 4.8, and $30 / $150 on Opus 4.7 and 4.6 — noticeably above standard Opus rates. Two fine-print items matter more than the headline numbers. First: on subscription plans (Pro/Max/Team/Enterprise), fast mode is billed through usage credits only — it does not draw from your plan's included usage, even if you have plenty left, so you need usage credits turned on in billing settings. Second: the first time you enable it in a conversation, you pay the full fast-mode input price for the entire conversation context accumulated so far. The deeper into a chat you are, the bigger that one-time bill.
Because that first toggle re-bills your whole conversation context at fast-mode rates, switching it on two hours into a session is the most expensive way to use it. If you know the session will be interactive and time-sensitive, turn /fast on with your very first message. Toggling off and on again later doesn't repeat the charge within the same conversation.
Two more facts from the docs worth knowing. Fast mode requires Claude Code v2.1.36 or later — check yours with claude --version. And on Team and Enterprise plans it's disabled by default until an organization admin turns it on; if /fast replies that fast mode has been disabled by your organization, that's policy, not a bug.
How else can you speed up Claude Code?
Fast mode buys raw response speed, but it's not the only dial. A lower effort level (/effort) makes the model think less — faster and cheaper, with some risk on complex tasks. Simple chores run quicker on Haiku (/model haiku) than on any accelerated Opus. And a clean context window — /clear between unrelated tasks — speeds everything up for free. The docs even bless the combo: fast mode plus low effort is maximum speed for straightforward work.
Quick decision checklist
- 1Live debugging, rapid back-and-forth iteration, deadline tonight → /fast on, from the first message.
- 2Long autonomous task you'll check on later → fast mode off; you won't notice the latency anyway.
- 3Routine small stuff → don't accelerate Opus, just downshift to Haiku.
- 4Watching costs this month → leave it off; standard Opus is the same brain at a calmer price.
- 5Hit the fast-mode rate limit? It falls back to standard speed automatically — the ↯ icon turns gray, then re-enables itself after the cooldown.
So is fast mode worth it?
When you're in a tight loop with the AI — try, look, correct, try again — latency is the tax on every iteration, and fast mode genuinely changes how the session feels. When Claude is grinding through a big task on its own, you're paying a premium for speed nobody is watching. My rule: fast mode is for conversations, standard mode is for assignments. Next time you and Claude are debugging something live and you catch yourself drumming fingers on the desk — that's the moment /fast was built for. Try it once at the start of such a session and decide with your own wallet.

Author
Evgeny Arsentyev
PhD · Chief Product Officer at a healthtech company
▌ Reading is the blue pill
Want to actually build this?
Guides explain. The free course transforms — personalized, gamified, and built to get you shipping fast.
◉ Take the red pill →