Spend Less, Build More

A few free habits that make your plan go further — and what that 'cache' email actually means.

By the end of this lesson

You'll know the handful of habits that cut your AI spend without cutting quality.

By now you build real things, and a fair, grown-up question shows up: am I overpaying? For most people on a normal subscription the honest answer is no — but there are a few habits, all free, that make your plan last noticeably longer, and that cut the bill a LOT if you ever move to paying per use. None of them lowers quality. Several actually improve it.

ℹFirst, the two ways you pay

There are two separate doors. (1) A SUBSCRIPTION (Claude Pro/Max) — one flat monthly fee, with usage that refills every few hours; this is what the course set you up with and where most people happily stay. (2) The API / 'Console' — pay-as-you-go, billed per token, the door you open only for automation or tools that call Claude on their own (Module 8's 'second bill'). The habits below help both, but it's the per-token door where they turn into real money.

Here are the levers, in order of how much they matter. Notice none of them is a hidden setting — they're just how you work.

Lever 1 — one full request beats twenty little ones

Every time you press Enter, the AI re-reads the whole conversation so far before it answers. So twenty tiny back-and-forths re-read that growing pile twenty times; one clear, complete request reads it once. Picture a taxi meter that restarts the trip from your house on every message — fewer, fuller trips cost less and arrive sooner. (Same reason the install lesson warned you off drip-feeding twenty little messages.)

Lever 2 — keep each conversation focused

A long, wandering chat drags all its old baggage into every new answer — slower, pricier, and more easily confused. When you move to a different task, start clean: type /clear (or /exit, then run claude again) and begin fresh. A focused conversation is cheaper AND smarter. You met this as a fix-it move; it's a money move too.

Lever 3 — point it straight at the work. 'Fix the delete button in budget.html' sends Claude right there; 'something's broken somewhere' makes it wander your whole project, opening files it never needed. Being specific isn't just manners — every needless step it takes is work you pay for. (A short CLAUDE.md note describing your project, from the Skills lesson, saves it re-discovering the basics on every run.)

✕So what was that 'your prompt cache hit rate is low' email?

Some people get a note from Anthropic: “your prompt cache hit rate is low — caching repeated content like system prompts could save up to ~24%.” In plain words, 'caching' means the system keeps the stable part of your context warm and reuses it instead of re-reading it from scratch — which is cheaper. Two calm facts. First: that email is about the per-token API/Console door, not your Pro subscription — if you're on a subscription you can simply ignore it. Second, the part people miss: Claude Code already turns caching ON for you, automatically. There is no switch to flip and nothing to configure.

✓How you actually 'use caching' — for free

You don't set it up; you feed it. Caching pays off when the stable part of your context stays the same between messages — which is exactly what Levers 2 and 3 give it: focused sessions and a steady CLAUDE.md instead of starting cold every time. So the habits that make you faster and clearer are the very same ones that make caching work. Nice when the lazy way and the cheap way are the same way.

!If you automate (the per-token crowd)

Once you own a tool that calls Claude on a schedule — a daily summary, a folder watcher — two rules cut the bill hard, at zero cost to quality. (1) Don't call the model when there's nothing to do: check cheaply first ('is there actually new input?'), and only then spend a real call. (2) Keep the instructions you send IDENTICAL across runs — same wording, same order — so the cache can recognise and reuse them. Re-writing the prompt every time is what makes that 'hit rate' low in the first place.

Quick check

You're on a Pro subscription and get the 'prompt cache hit rate is low' email. What should you do?

checkpoint

Your spend, in one sentence

Say it: “Fewer, fuller requests; one clean conversation per task; point it straight at the work — and caching takes care of itself.” That's the whole lesson. Now click complete.

Author

Evgenii Arsentev

PhD · Chief Product Officer at a healthtech company

About the author →