What is a context window? Definition for large language models
A context window is the maximum number of tokens a large language model can attend to in one request — the upper bound on how much input, conversation history, and supporting material the model can read at once.
The number matters because anything past the window is invisible to the model. A 200,000-token codebase fed to a model with a 32,000-token window leaves 168,000 tokens the model never sees, so it cannot answer questions about them.
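The arithmetic behind that example can be sketched in a few lines. This is a minimal illustration, not any particular provider's behavior: the function name `tokens_out_of_view` is hypothetical, and it assumes the token counts are already known (real counts come from the model's own tokenizer).

```python
def tokens_out_of_view(input_tokens: int, window_tokens: int) -> int:
    """Return how many input tokens fall outside the context window.

    Hypothetical helper for illustration; both arguments are token counts
    assumed to have been measured with the model's tokenizer.
    """
    # Anything beyond the window cannot be read; never report a negative overflow.
    return max(0, input_tokens - window_tokens)


# The codebase example from the text: 200,000 tokens against a 32,000-token window.
print(tokens_out_of_view(200_000, 32_000))  # 168000

# An input that fits leaves nothing out of view.
print(tokens_out_of_view(5_000, 32_000))  # 0
```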
Why size is not the whole story
Models often attend unevenly across a long window — strong at the start and end, weaker in the middle. A focused 5,000-token input can outperform a sprawling 500,000-token one stuffed with peripheral material. Bigger windows enable more use cases; they do not automatically improve quality on the use cases that already fit.
The practical consequence for product feedback: keep the input lean. A markdown document with the cropped screenshots, the source URLs, and the specific notes is more useful to the agent than the same document plus a hundred pages of background. Curation is part of the prompt.
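One way to treat curation as part of the prompt is to rank material by importance and stop adding once a self-imposed token budget is spent. The sketch below is only an illustration under stated assumptions: the `Snippet` type, the `priority` scale, and the `curate` function are hypothetical, and `estimate_tokens` uses a crude four-characters-per-token guess rather than a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    label: str
    text: str
    priority: int  # hypothetical scale: lower number means more important


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # Swap in the target model's tokenizer for real counts.
    return max(1, len(text) // 4)


def curate(snippets: list[Snippet], budget_tokens: int) -> tuple[list[Snippet], int]:
    """Keep the most important snippets that fit within the budget."""
    chosen: list[Snippet] = []
    used = 0
    # Visit snippets from most to least important; skip any that would overflow.
    for snippet in sorted(snippets, key=lambda s: s.priority):
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget_tokens:
            chosen.append(snippet)
            used += cost
    return chosen, used


material = [
    Snippet("notes", "a" * 400, priority=0),
    Snippet("background", "b" * 40_000, priority=2),
    Snippet("source URLs", "c" * 200, priority=1),
]
kept, used = curate(material, budget_tokens=1_000)
print([s.label for s in kept], used)
```

With a 1,000-token budget, the notes and source URLs are kept and the long background section is left out, which mirrors the lean document described above. The budget here is a deliberate choice well below the model's window, not the window itself.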
Frequently asked questions
How big is a typical context window?
It varies by model. Modern frontier models offer hundreds of thousands of tokens, sometimes more than a million. Smaller and older models top out in the tens of thousands.
Does a bigger window always mean better answers?
No. Models often attend unevenly across a large window, and material in the middle tends to get the least attention. A focused, smaller input frequently outperforms a kitchen-sink one stuffed into a million-token window.