Context Window Calculator
Paste any text and instantly see how much of your selected context window it fills. Warns at 70%, 85%, and 100%, before you hit the API limit.
Select context window
Models: GPT-4o, GPT-4 Turbo, Llama 3.3 70B
Est. Tokens
0
~4 chars/token
Context Limit
128,000
128K window
Remaining Tokens
128,000
Remaining Chars
~512,000
estimated
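The readout above can be approximated in a few lines. This is a minimal sketch, assuming the ~4 chars/token heuristic and the 70% / 85% / 100% warning thresholds stated on this page. The function names and level labels are hypothetical, and a real tokenizer will give different counts.

```python
import math

CHARS_PER_TOKEN = 4  # rough heuristic from this page, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count at ~4 characters per token, rounding up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def usage_report(text: str, context_limit: int = 128_000) -> dict:
    """Return estimated usage of the context window plus a warning level."""
    tokens = estimate_tokens(text)
    fill = tokens / context_limit

    # Thresholds mirror the 70% / 85% / 100% warnings; labels are made up.
    if fill >= 1.0:
        level = "limit reached"
    elif fill >= 0.85:
        level = "critical"
    elif fill >= 0.70:
        level = "warning"
    else:
        level = "ok"

    remaining = max(context_limit - tokens, 0)
    return {
        "tokens": tokens,
        "fill_percent": round(fill * 100, 1),
        "remaining_tokens": remaining,
        "remaining_chars": remaining * CHARS_PER_TOKEN,
        "level": level,
    }
```

With empty input and the default 128K window, this reports 128,000 remaining tokens and roughly 512,000 remaining characters, matching the default readout.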
Understanding Context Windows
What is a context window?
Think of the context window as the model's working memory: the total amount of text it can "see" at once. Every token in your conversation counts toward this limit: system prompt, chat history, attached documents, and the model's own replies.
When you exceed this limit, the API either rejects your request with an error, or (in older APIs) silently drops the oldest messages, which can cause the model to forget earlier context and give inconsistent answers.
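One way to avoid both failure modes is to trim the history yourself, so you control what gets dropped. The sketch below is illustrative only: it assumes messages are plain strings, uses the ~4 chars/token estimate rather than a real tokenizer, and `trim_history` is a hypothetical helper, not part of any API.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate at ~4 characters per token (ceiling division)."""
    return -(-len(text) // 4)


def trim_history(system_prompt: str, messages: list[str],
                 context_limit: int, reserved_output: int) -> list[str]:
    """Keep the newest messages that fit; drop the oldest ones first.

    The system prompt and the space reserved for the reply are
    subtracted from the window before any history is admitted.
    """
    budget = context_limit - reserved_output - estimate_tokens(system_prompt)
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so recent context survives.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Trimming explicitly makes the loss of early context a visible decision, for example one you can log or replace with a summary, instead of a silent side effect.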
Why prompts fail with long documents
// Total tokens sent to API
total = system_prompt_tokens
+ document_tokens
+ user_message_tokens
+ expected_output_tokens
// If total > context_limit → Error!
Output tokens count too, so always leave room for the model's response (a common rule: use at most 80% of the window for input).
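The sum above, together with the 80% input rule, can be checked before a request is sent. This is a minimal sketch: the function name, the returned keys, and the default 128K limit are assumptions chosen for illustration, and the token counts are expected to come from your own estimate or tokenizer.

```python
def check_budget(system_prompt_tokens: int, document_tokens: int,
                 user_message_tokens: int, expected_output_tokens: int,
                 context_limit: int = 128_000,
                 input_share_percent: int = 80) -> dict:
    """Check a request against the context limit and the input-share rule."""
    input_tokens = (system_prompt_tokens + document_tokens
                    + user_message_tokens)
    total = input_tokens + expected_output_tokens
    # Integer arithmetic avoids floating-point surprises at the boundary.
    input_budget = context_limit * input_share_percent // 100
    return {
        "total": total,
        "fits": total <= context_limit,
        "within_input_budget": input_tokens <= input_budget,
        "headroom": context_limit - total,  # negative when over the limit
    }


# Example: a 100,000-token document in a 128K window.
report = check_budget(1_000, 100_000, 400, 4_000)
print(report)
```

In the example, the input is 101,400 tokens against an input budget of 102,400, so the request fits with 22,600 tokens of headroom.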
Context window reference (2026)
| Window | Tokens | Example models |
|---|---|---|
| 8K | 8,192 | GPT-3.5 Turbo, GPT-4 (8K) |
| 32K | 32,768 | GPT-4 (32K), Claude Instant |
| 128K | 128,000 | GPT-4o, GPT-5, Llama 3.3 70B |
| 200K | 200,000 | Claude Opus 4.8, Claude Sonnet 4.6, Haiku 4.5 |
| 1M | 1,000,000 | GPT-4.1, Gemini 2.5 Flash, Llama 4 Maverick |
| 2M | 2,000,000 | Gemini 2.5 Pro, Gemini 3.1 Pro |
Pages estimated at ~250 words/page · 1 token ≈ 0.75 words
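The page estimate follows directly from those two ratios. A small sketch, assuming exactly 0.75 words per token and 250 words per page as stated above (both are rough averages for English text, and the function name is hypothetical):

```python
WORDS_PER_TOKEN = 0.75  # approximation from the note above
WORDS_PER_PAGE = 250    # approximation from the note above


def tokens_to_pages(tokens: int) -> float:
    """Convert a token count to an approximate page count."""
    words = tokens * WORDS_PER_TOKEN
    return words / WORDS_PER_PAGE


for window in (8_192, 128_000, 200_000, 1_000_000):
    print(f"{window:>9,} tokens ~ {tokens_to_pages(window):,.0f} pages")
```

Under these assumptions a 128,000-token window holds about 96,000 words, or roughly 384 pages.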