How Do LLMs Work? Explained Without a Single Equation
How do LLMs work? Tokens, context windows, and next-chunk prediction: three ideas that explain most of what an LLM does, with zero math required.
A large language model predicts the most fitting next chunk of text, having learned patterns from a vast slice of human writing. That's the whole engine. Its power emerged from scale, not from a secret spark.
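To make "predict the next chunk" concrete, here is a deliberately tiny sketch. It is not how a real LLM is built: it simply counts which word follows which in a few made-up sentences and always picks the most common follower. The corpus and the names `train`, `predict_next`, and `generate` are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy "training data" (an assumption for this sketch); a real model
# learns from a vast amount of text, not three short sentences.
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat on the sofa ."


def train(text):
    """Count which chunk tends to follow each chunk."""
    follows = defaultdict(Counter)
    chunks = text.split()
    for current, nxt in zip(chunks, chunks[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, chunk):
    """Return the most frequently seen follower of `chunk`, or None if unseen."""
    if chunk not in follows:
        return None
    return follows[chunk].most_common(1)[0][0]


def generate(follows, start, max_chunks=5):
    """Build text by repeatedly predicting the next chunk from the last one."""
    out = [start]
    for _ in range(max_chunks):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


model = train(corpus)
print(generate(model, "the", max_chunks=4))  # prints "the cat sat on the"
```

The toy only looks one word back and has seen almost nothing, so its output is dull. The loop is the same idea, though: predict a chunk, append it, predict again.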
What are tokens and context windows?
Two mechanics explain an LLM's quirks: it reads in tokens (chunks, not letters), and it can only reason over what's in its current context window (its working desk). New chat, wiped desk.
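Both mechanics fit in a few lines of code. The sketch below is a simplification under stated assumptions: `VOCAB` is a made-up mini-vocabulary, and greedy longest-match splitting stands in for the learned tokenizers that real models use. The function names are hypothetical.

```python
# Hypothetical mini-vocabulary; real tokenizers learn tens of thousands of chunks.
VOCAB = ["token", "un", "believ", "able", "s", " "]


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; unknown characters become one-character tokens."""
    by_length = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary chunk that matches at position i,
        # falling back to the single character if nothing matches.
        match = next((v for v in by_length if text.startswith(v, i)), text[i])
        tokens.append(match)
        i += len(match)
    return tokens


def fit_to_window(tokens, window_size):
    """Keep only the most recent tokens that fit on the 'desk'."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]


tokens = tokenize("unbelievable tokens")
print(tokens)                    # ['un', 'believ', 'able', ' ', 'token', 's']
print(fit_to_window(tokens, 3))  # [' ', 'token', 's']
```

Notice that the model never sees the letters of "unbelievable", only three chunks, and that with a window of three tokens the first word is already off the desk.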
How do LLMs work in ChatGPT and Claude?
ChatGPT and Claude are friendly interfaces wrapped around an LLM. Each time you send a message, the whole visible conversation is packed into the context window and the model predicts a reply, chunk by chunk. That's why long chats drift: once the conversation outgrows the window, old details fall off the desk. It's also why opening a fresh chat is like meeting the model for the first time, every time.
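Here is a rough sketch of that packing step and of why old details drop out. It rests on assumptions: `count_tokens` is a crude word count rather than a real tokenizer, the budget of 12 is arbitrary, and dropping the oldest messages first is just one simple policy. Actual products may handle overflow differently.

```python
def count_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def pack_context(messages, budget):
    """Keep the newest messages whose combined size fits within the budget."""
    kept = []
    used = 0
    for role, text in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(text)
        if used + cost > budget:
            break  # everything older than this falls off the desk
        kept.append((role, text))
        used += cost
    return list(reversed(kept))  # restore chronological order


conversation = [
    ("user", "My dog is named Biscuit"),
    ("assistant", "Nice to meet Biscuit"),
    ("user", "Suggest a toy for a large dog"),
    ("assistant", "Try a sturdy rope toy"),
    ("user", "What is my dog called"),
]

visible = pack_context(conversation, budget=12)
for role, text in visible:
    print(f"{role}: {text}")
```

With this small budget only the last two messages fit, so the message that named the dog is no longer visible when the model is asked for the name. An empty list (a fresh chat) gives the model nothing to go on at all.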

Author
Evgeny Arsentyev
PhD · Chief Product Officer at a healthtech company