Core concepts
Tokens
A token averages roughly 4 characters of English text. LLM providers charge per token consumed, for both input (your prompt) and output (the response). Tokenistt uses the same tokenizer as the model being called, so counts are exact, not estimated.
# Token breakdown for a typical agent call
system prompt:       420 tk  ($0.00126)
user message:        180 tk  ($0.00054)
tool definitions:    860 tk  ($0.00258)  ← common waste vector
─────────────────────────────────────
total input:       1,460 tk  ($0.00438)
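The dollar figures in the breakdown above are consistent with a flat input rate of $3 per million tokens (for example, 420 tk × $0.000003 = $0.00126). The Python sketch below reproduces that arithmetic. The rate is inferred from the figures rather than stated by Tokenistt, and the names `INPUT_RATE_PER_MTOK` and `input_cost` are illustrative, not part of any Tokenistt API.

```python
from decimal import Decimal

# Assumption: a flat input rate of $3 per million tokens, inferred from the
# figures in the breakdown (not a rate published by Tokenistt).
INPUT_RATE_PER_MTOK = Decimal("3.00")


def input_cost(tokens: int) -> Decimal:
    """Return the dollar cost of `tokens` input tokens at the assumed rate."""
    return INPUT_RATE_PER_MTOK * tokens / Decimal(1_000_000)


# Token counts taken from the breakdown above.
breakdown = {
    "system prompt": 420,
    "user message": 180,
    "tool definitions": 860,
}

for part, tokens in breakdown.items():
    print(f"{part}: {tokens} tk (${input_cost(tokens):.5f})")

total_tokens = sum(breakdown.values())
print(f"total input: {total_tokens:,} tk (${input_cost(total_tokens):.5f})")
```

Using `Decimal` rather than floats keeps sub-cent amounts exact, which matters once many small per-request costs are summed.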
Workspaces
A workspace is a logical grouping of LLM traffic — typically a service, team, or product feature. Tokenistt auto-detects workspaces from MCP client metadata but supports manual tagging via the config file.
Spend attribution
Every request is stamped with a workspace, model, and timestamp. This lets you answer questions such as "Which team is burning the most tokens?" and "Which function call is the most expensive?"
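As a rough illustration of how stamped requests can answer the first question, the Python sketch below sums tokens per workspace. The `RequestRecord` shape, its field names, and the sample data are hypothetical, made up for this example. They do not reflect Tokenistt's actual storage format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RequestRecord:
    # Hypothetical record shape; field names are illustrative only.
    workspace: str
    model: str
    timestamp: datetime
    tokens: int


def tokens_by_workspace(records):
    """Sum token usage per workspace, largest consumer first."""
    totals = defaultdict(int)
    for record in records:
        totals[record.workspace] += record.tokens
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample traffic for demonstration purposes.
stamp = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    RequestRecord("search", "model-a", stamp, 1460),
    RequestRecord("billing", "model-a", stamp, 300),
    RequestRecord("search", "model-b", stamp, 540),
]

ranking = tokens_by_workspace(records)
print(ranking)
```

The same grouping could be keyed on `model` or on a time bucket derived from `timestamp` to slice spend along the other stamped dimensions.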
Cache prefix fingerprinting
Anthropic charges less for tokens that hit the prompt cache. Tokenistt hashes the first N tokens of every prompt and tracks reuse frequency, surfacing prompts that would benefit most from caching.
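The idea of hashing a prompt prefix and counting reuse can be sketched in a few lines of Python. This is a minimal illustration, not Tokenistt's implementation: the choice of SHA-256, the value of `PREFIX_TOKENS`, the function name `prefix_fingerprint`, and the sample token IDs are all assumptions.

```python
import hashlib
from collections import Counter

# Assumption: a stand-in for "N"; the real prefix length is not specified.
PREFIX_TOKENS = 4


def prefix_fingerprint(token_ids, n=PREFIX_TOKENS):
    """Hash the first n token IDs into a stable hex fingerprint."""
    prefix = ",".join(str(token_id) for token_id in token_ids[:n])
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


# Made-up token ID sequences; the first two share the same 4-token prefix.
prompts = [
    [101, 7, 7, 42, 900],
    [101, 7, 7, 42, 15, 16],
    [55, 3, 8, 1, 2],
]

# Count how often each prefix fingerprint recurs across prompts.
reuse = Counter(prefix_fingerprint(token_ids) for token_ids in prompts)

top_fingerprint, top_count = reuse.most_common(1)[0]
print(top_fingerprint[:12], top_count)
```

Prefixes with a high reuse count are the natural candidates for caching, since every repeat of an identical prefix is a potential cache hit.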
Optimization score
Each prompt receives a 0–100 score. A score below 60 triggers a suggestion. Scores account for token count relative to task complexity, redundancy ratio, and cache eligibility.