Context Window
The maximum amount of text (in tokens) an LLM can consider in a single call.
Definition
The context window is the hard limit on how many tokens a model can hold at once, counting the input and the generated output together. Larger windows allow longer documents and richer agent state, but they cost more and can dilute attention, so retrieval and summarization remain essential.
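Because input and output share the same limit, a request only fits if the prompt tokens plus the tokens reserved for the reply stay within the window. A minimal sketch of that check follows. The names `estimate_tokens` and `fits_in_window` are hypothetical, and the 4-characters-per-token ratio is a rough assumption rather than a real tokenizer; actual counts depend on the model's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate, assuming about 4 characters per token.

    This is an illustrative heuristic, not a real tokenizer.
    """
    return (len(text) + 3) // 4  # ceiling division; empty text gives 0


def fits_in_window(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    return estimate_tokens(prompt) + max_output_tokens <= context_window


if __name__ == "__main__":
    short_prompt = "a" * 4_000    # about 1,000 estimated tokens
    long_prompt = "a" * 32_000    # about 8,000 estimated tokens
    print(fits_in_window(short_prompt, 1_000, 8_192))
    print(fits_in_window(long_prompt, 1_000, 8_192))
```

Reserving the output budget up front matters: a prompt that fills the window by itself leaves the model no room to answer.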
Example
A 1M-token model could read an entire codebase at once, but a focused 8K-token RAG pipeline answering one question is often cheaper, faster and more accurate.
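One way a focused pipeline stays inside a small budget is to keep only the highest-scoring retrieved chunks that fit. The sketch below illustrates that idea with a simple greedy selection. `pack_chunks` and `estimate_tokens` are hypothetical names, the relevance scores are assumed to come from some retriever, and the 4-characters-per-token estimate is an assumption standing in for a real tokenizer.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough token estimate, assuming about 4 characters per token."""
    return (len(text) + 3) // 4  # ceiling division; empty text gives 0


def pack_chunks(
    scored_chunks: List[Tuple[float, str]], budget_tokens: int
) -> Tuple[List[str], int]:
    """Greedily keep the most relevant chunks that fit within the budget.

    scored_chunks: (relevance_score, chunk_text) pairs from a retriever.
    Returns the selected chunks (most relevant first) and the tokens used.
    """
    selected: List[str] = []
    used = 0
    # Visit chunks from most to least relevant.
    for _score, chunk in sorted(scored_chunks, key=lambda pair: pair[0], reverse=True):
        cost = estimate_tokens(chunk)
        if used + cost <= budget_tokens:
            selected.append(chunk)
            used += cost
    return selected, used


if __name__ == "__main__":
    chunks = [(0.9, "a" * 40), (0.5, "b" * 400), (0.7, "c" * 40)]
    picked, used = pack_chunks(chunks, budget_tokens=25)
    print(len(picked), used)
```

Greedy selection is only one possible policy. It skips a chunk that does not fit and keeps trying smaller ones, which favors relevance over document order.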