Upload any file — .txt, .pdf, .docx, .json, .py, .md and more — and instantly see how many tokens it uses across GPT-4o, Claude 3.5, Llama 3, and Gemini 1.5. Includes context window usage bars and estimated API costs.
Open LLM Token Counter → free, no sign-in

Every call to a language model API is billed in tokens, not characters or words. Before you send a large document to GPT-4o, Claude, or Gemini, it's useful to know exactly how many tokens it contains — so you can estimate the cost, check whether it fits within the model's context window, and compare how different models would handle the same file.
The LLM Token Counter lets you drop any file and see the token breakdown across the major model families instantly. It extracts text from PDFs and Word documents, runs an accurate BPE-approximation tokenizer in your browser, and shows per-model counts alongside context window usage bars and input cost estimates at May 2025 pricing.
- Developers building LLM-powered features who need to know whether a document fits in a context window before deciding on a chunking strategy, or whether a prompt template has grown too expensive.
- AI researchers and prompt engineers comparing how efficiently different models tokenize the same corpus — useful when switching providers or optimizing for cost.
- Anyone curious about API costs before submitting a large document to a paid LLM endpoint. Drop the file, see the cost estimate, decide whether to trim it first.
Drop or select a file. The tool extracts raw text — reading code and plain-text files directly, using PDF.js for PDFs, and Mammoth for .docx files. The extracted text is then run through an in-browser BPE tokenization approximation that matches GPT-4's cl100k_base tokenizer within roughly ±5% for English prose.
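The approximation step can be sketched as a simple character-and-word heuristic. This is an illustrative function, not the tool's actual code; the blend of two classic rules of thumb (about 4 characters or about 4/3 tokens per word for English) lands in the same ±5% neighbourhood for typical prose.

```javascript
// Hypothetical sketch of a character/word-based token estimate.
// Assumption: ~4 chars per token and ~4/3 tokens per word for English prose;
// averaging the two heuristics smooths out very long or very short words.
function estimateTokens(text) {
  if (!text) return 0;
  const words = text.split(/\s+/).filter(Boolean).length;
  const chars = text.length;
  const byChars = chars / 4;          // chars-per-token heuristic
  const byWords = (words * 4) / 3;    // tokens-per-word heuristic
  return Math.round((byChars + byWords) / 2);
}
```

A real BPE tokenizer also splits on punctuation and merges common subwords, which is why this estimate drifts further from the true count on code, symbols, and non-Latin text.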
Per-model multipliers adjust the base count: Claude's tokenizer is slightly less efficient (+5%), Llama 3 even more so (+10%), while Gemini 1.5 is marginally more efficient (-5%). Each model family gets its own card showing the token count, a colour-coded context window bar (green under 50%, yellow 50–80%, red over 80%), and the estimated input cost at current pricing.
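The multipliers and colour thresholds described above reduce to a few lines. This is a sketch of the logic, not the tool's source; the model keys and function names are hypothetical, but the percentages and colour bands mirror the article.

```javascript
// Per-model efficiency multipliers relative to the cl100k_base baseline.
const MODEL_MULTIPLIERS = {
  "gpt-4o": 1.0,       // baseline (cl100k_base)
  "claude-3.5": 1.05,  // ~5% less efficient
  "llama-3": 1.1,      // ~10% less efficient
  "gemini-1.5": 0.95,  // ~5% more efficient
};

function adjustedCount(baseTokens, model) {
  return Math.round(baseTokens * MODEL_MULTIPLIERS[model]);
}

// Context window bar colour: green under 50%, yellow 50–80%, red over 80%.
function barColor(tokens, contextWindow) {
  const usage = tokens / contextWindow;
  if (usage < 0.5) return "green";
  if (usage <= 0.8) return "yellow";
  return "red";
}
```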
The "Copy report" button copies a plain-text summary covering all models — convenient for pasting into tickets, documentation, or cost spreadsheets.
| Model | Context window | Input $/1M tokens |
|---|---|---|
| GPT-4o | 128K | $2.50 |
| GPT-4o-mini | 128K | $0.15 |
| GPT-3.5 Turbo | 16K | $0.50 |
| Claude 3.5 Sonnet | 200K | $3.00 |
| Claude 3 Haiku | 200K | $0.25 |
| Llama 3 70B | 8K | $0.59 |
| Gemini 1.5 Pro | 1M | $1.25 |
| Gemini 1.5 Flash | 1M | $0.075 |
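The cost estimate is just tokens divided by one million, times the rate in the table. A worked example for a 50,000-token document (rates copied from the table above; the lookup map is illustrative):

```javascript
// Input price per 1M tokens, from the table above.
const PRICE_PER_M = {
  "GPT-4o": 2.5,
  "GPT-4o-mini": 0.15,
  "Claude 3.5 Sonnet": 3.0,
  "Gemini 1.5 Flash": 0.075,
};

function inputCost(tokens, model) {
  return (tokens / 1_000_000) * PRICE_PER_M[model];
}

// 50,000 tokens on GPT-4o: 0.05M * $2.50 = $0.125
// The same document on Gemini 1.5 Flash: 0.05M * $0.075 = $0.00375
```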
What is a token in the context of LLMs?
A token is the basic unit of text that language models process. Tokens are roughly 3–4 characters of English text — so 100 tokens ≈ 75 words. Punctuation, symbols, and non-Latin characters can each cost multiple tokens.
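Those rules of thumb make quick mental conversions easy. A minimal sketch (helper names are hypothetical):

```javascript
// ~0.75 words per token and ~4 characters per token for English text.
const tokensFromWords = (words) => Math.round(words / 0.75);
const tokensFromChars = (chars) => Math.round(chars / 4);

// A 1,500-word article is roughly 2,000 tokens;
// 8,000 characters is likewise roughly 2,000 tokens.
```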
Why do token counts differ between models?
Each model family uses a different BPE vocabulary. GPT-4o and GPT-3.5 share the cl100k_base tokenizer. Claude, Llama 3, and Gemini 1.5 each have their own vocabularies with slightly different efficiencies for English text.
Can I upload PDF and Word files?
Yes — .pdf files are processed with PDF.js and .docx files with Mammoth, both running entirely in-browser. Code, JSON, CSV, Markdown, and plain text are handled natively.
Is my file uploaded anywhere?
No — everything runs locally in your browser. Your file never leaves your device.
How accurate are the token counts?
The BPE-approximation matches the GPT-4 tokenizer within ±5% for typical English prose. For exact production counts, use tiktoken (OpenAI) or the official tokenizer libraries for each model.
- Count words, characters, sentences, and reading time
- Format, validate, and minify JSON — useful before tokenizing API payloads
- Convert .md files to HTML — then count tokens on the output
Drop your file and see the token breakdown across all major LLM providers in seconds.
Open LLM Token Counter →

Build your own tool with AI