How Do LLMs Work? Explained Without a Single Equation

How do LLMs work? Tokens, context windows, and next-chunk prediction: three ideas that explain most of what chatbots like ChatGPT and Claude do, with zero math required.

8 min read · Updated 2026-06-12 · Published 2026-05-12

Evgeny Arsentyev · PhD

A large language model predicts the most fitting next chunk of text, having learned patterns from a vast slice of human writing. That's the whole engine. Power emerged from scale, not from a secret spark.
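To make "predict the most fitting next chunk" concrete, here is a minimal sketch of the idea at toy scale. It is not how a real LLM is built: it only counts which word follows which in a tiny made-up corpus, whereas real models use neural networks trained on vastly more text. The corpus and the function names (`train`, `predict_next`, `generate`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Tiny stand-in for "a vast slice of human writing" (illustrative only).
CORPUS = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def train(text):
    """Count which word follows which: the 'patterns' of this toy model."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


def generate(follows, start, length=5):
    """Extend `start` one predicted word at a time."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


model = train(CORPUS)
print(predict_next(model, "sat"))        # on
print(generate(model, "dog", length=3))  # dog sat on the
```

The toy model has no understanding of cats or dogs. It only repeats the patterns it counted. Scaling that same predict-the-next-chunk loop up is what the paragraph above means by power coming from scale.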

What are tokens and context windows?

Two mechanics explain an LLM's quirks: it reads in tokens (chunks of text, not letters), and it can only reason over what is in its current context window (its working desk). Start a new chat and the desk is wiped.
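Both mechanics can be sketched in a few lines. The vocabulary below is hypothetical and tiny; real tokenizers learn tens of thousands of chunks from data and use more sophisticated splitting rules. The names `VOCAB`, `tokenize`, and `fit_to_window` are assumptions for this example, not part of any real library.

```python
# Hypothetical mini-vocabulary; real models learn theirs from data.
VOCAB = {"un", "believ", "able", "token", "s", " "}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split into known chunks, else single characters."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest possible chunk first, then shrink.
        for end in range(len(text), i, -1):
            if text[i:end] in vocab:
                tokens.append(text[i:end])
                i = end
                break
        else:
            # No known chunk starts here: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


def fit_to_window(tokens, window_size):
    """Keep only the most recent tokens that fit on the 'desk'."""
    return tokens[-window_size:] if window_size > 0 else []


tokens = tokenize("unbelievable tokens")
print(tokens)                    # ['un', 'believ', 'able', ' ', 'token', 's']
print(fit_to_window(tokens, 3))  # [' ', 'token', 's']
```

The model never sees the word "unbelievable" as letters. It sees three chunks. With a window of three tokens, everything before the final space is simply not on the desk.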

How do LLMs work in ChatGPT and Claude?

ChatGPT and Claude are friendly interfaces wrapped around an LLM. Each time you send a message, the whole visible conversation is packed into the context window and the model predicts a reply, chunk by chunk. That's why long chats drift: once the conversation outgrows the window, old details fall off the desk. It is also why a fresh chat is like meeting the model for the first time, every time.
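The drift described above can be sketched as a packing step that runs before every reply. This is a simplified assumption, not the actual logic of ChatGPT or Claude: it counts one token per word and drops the oldest messages first, while real products may count, trim, or summarize differently. The names `count_tokens` and `pack_context` and the sample conversation are made up for illustration.

```python
def count_tokens(text):
    """Rough stand-in: one token per whitespace-separated word."""
    return len(text.split())


def pack_context(history, budget):
    """Walk backward from the newest message, keeping what fits in `budget`."""
    kept, used = [], 0
    for role, text in reversed(history):
        cost = count_tokens(text)
        if used + cost > budget:
            break  # this message and everything older falls off the desk
        kept.append((role, text))
        used += cost
    return list(reversed(kept))


history = [
    ("user", "My dog is named Biscuit"),
    ("assistant", "Nice to meet Biscuit"),
    ("user", "Suggest a toy for a small dog"),
    ("assistant", "A rope tug works well"),
    ("user", "What is my dog called"),
]

context = pack_context(history, budget=18)
for role, text in context:
    print(f"{role}: {text}")
# user: Suggest a toy for a small dog
# assistant: A rope tug works well
# user: What is my dog called

# A fresh chat starts with an empty history, so nothing carries over.
print(pack_context([], budget=18))  # []
```

With a budget of 18 tokens, the two messages that mention Biscuit no longer fit, so the model is asked for the dog's name without ever seeing it.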

#llm #fundamentals #beginner

Author

Evgeny Arsentyev

PhD · Chief Product Officer at a healthtech company


