How Do LLMs Work? Explained Without a Single Equation

How do LLMs work? Tokens, context windows, and next-chunk prediction: three ideas that explain most of what an AI chatbot does, with zero math required.

8 min read · Updated 2026-06-12 · Evgeny Arsentyev, PhD

A large language model predicts the most fitting next chunk of text, having learned patterns from a vast slice of human writing. That's the whole engine. Power emerged from scale, not from a secret spark.
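To make "predict the next chunk" concrete, here is a deliberately tiny sketch. It is not how a real LLM is built: it only counts which word follows which in a short, made-up corpus and then picks the most frequent follower. The function names (`train_bigrams`, `predict_next`, `generate`) and the corpus are invented for illustration. A real model learns far richer patterns from far more text, but the loop of "look at what came before, pick a likely continuation, repeat" has the same shape.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


def generate(follows, start, max_words=5):
    """Extend `start` one predicted word at a time."""
    out = [start]
    for _ in range(max_words):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Toy "training data" (an assumption for this sketch, not real model data).
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigrams(corpus)

print(predict_next(model, "the"))            # cat
print(generate(model, "sat", max_words=3))   # sat on the cat
```

"cat" wins after "the" only because it appeared there most often in the corpus. Nothing in the code understands cats; it has patterns and counts, which is the point of the "scale, not a secret spark" framing.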

What are tokens and context windows?

Two mechanics explain most of an LLM's quirks: it reads in tokens (chunks of text, not individual letters), and it can only reason over what's in its current context window (its working desk). Start a new chat and the desk is wiped.
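Both mechanics can be sketched in a few lines. The tokenizer below is a stand-in that splits on words and punctuation; real tokenizers use learned sub-word pieces, so their chunks will look different. `toy_tokenize`, `fit_to_window`, and the window size of 6 are all assumptions made up for this illustration.

```python
import re


def toy_tokenize(text):
    """Split text into chunks: words, plus each punctuation mark on its own.

    Real tokenizers use learned sub-word pieces; this is only a stand-in.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def fit_to_window(tokens, window_size):
    """Keep only the most recent `window_size` tokens; older ones fall off."""
    return tokens[-window_size:] if window_size > 0 else []


tokens = toy_tokenize("My dog is named Rex. What is my dog's name?")
print(len(tokens))   # 14
print(tokens)

# With a desk that holds only 6 tokens, the name "Rex" is no longer visible.
desk = fit_to_window(tokens, 6)
print(desk)
print("Rex" in desk)  # False
```

The question about the dog's name is still on the desk, but the answer has already fallen off. The model never "forgot" anything; the text simply is not in front of it anymore.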

How do LLMs work in ChatGPT and Claude?

ChatGPT and Claude are friendly interfaces wrapped around an LLM. Each time you send a message, the whole visible conversation is packed into the context window and the model predicts a reply, chunk by chunk. That's why long chats drift: once the conversation outgrows the window, the oldest details fall off the desk. It's also why every fresh chat is like meeting the model for the first time.
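The loop a chat interface runs can be sketched roughly as follows. This is not the actual code behind ChatGPT or Claude: `fake_model` is a hypothetical stand-in for the LLM, the `Chat` class is invented for illustration, and the window is measured in messages rather than tokens to keep the example short.

```python
def fake_model(prompt_messages):
    """Hypothetical stand-in for an LLM: reports how much it was shown."""
    return f"(reply based on {len(prompt_messages)} visible messages)"


class Chat:
    """A toy chat session that resends the visible history on every turn."""

    def __init__(self, max_messages=4):
        self.history = []              # a new chat starts with an empty desk
        self.max_messages = max_messages
        self.last_prompt = []          # what the model saw on the latest turn

    def visible(self):
        """Only the most recent messages fit in the window."""
        return self.history[-self.max_messages:]

    def send(self, user_text):
        self.history.append(("user", user_text))
        self.last_prompt = self.visible()      # packed into the window
        reply = fake_model(self.last_prompt)
        self.history.append(("assistant", reply))
        return reply


chat = Chat(max_messages=3)
print(chat.send("My name is Ana."))
print(chat.send("I like chess."))
print(chat.send("What is my name?"))

# By the third turn the first message is no longer in what the model sees.
print(("user", "My name is Ana.") in chat.last_prompt)  # False

# A fresh chat shares nothing with the old one.
print(Chat().history)  # []
```

The model itself keeps no memory between calls in this sketch. Any continuity comes from the interface resending the history, so anything trimmed from that history is invisible on the next turn.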

#llm #fundamentals #beginner
Author

Evgeny Arsentyev

PhD · Chief Product Officer at a healthtech company
