How it works
Tokenistt sits between your editor and the Anthropic API as a transparent MCP proxy. Every prompt passes through it before being forwarded, with no perceptible added latency.
Request flow, in order:
- Editor / Host: Claude Desktop · Cursor · VS Code · Windsurf
- Tokenistt MCP: parse · tokenize · score · cache check
- Optimizer (optional): rewrite · shorten · model route
- Anthropic API: POST /v1/messages
- Response + Ledger: token delta · cost entry · webhook
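The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tokenistt's actual code: `handle_request`, `LedgerEntry`, `estimate_tokens`, and the `forward` callback are hypothetical names, the token estimate is a crude word count rather than a real tokenizer, and the cache check is a simple set lookup on the first 64 characters of the prompt.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple


@dataclass
class LedgerEntry:
    """Hypothetical ledger row: metadata only, no prompt content."""
    model: str
    tokens_before: int
    tokens_after: int
    cache_hit: bool

    @property
    def token_delta(self) -> int:
        # Tokens saved by the optional optimizer step.
        return self.tokens_before - self.tokens_after


def estimate_tokens(text: str) -> int:
    # Assumption: whitespace word count as a stand-in for a real tokenizer.
    return len(text.split())


def handle_request(
    prompt: str,
    model: str,
    forward: Callable[[str, str], str],
    seen_prefixes: Set[str],
    optimizer: Optional[Callable[[str], str]] = None,
) -> Tuple[str, LedgerEntry]:
    # Stage 1: analyze the incoming prompt (tokenize, cache check).
    tokens_before = estimate_tokens(prompt)
    prefix = prompt[:64]
    cache_hit = prefix in seen_prefixes
    seen_prefixes.add(prefix)

    # Stage 2: optionally rewrite or shorten the prompt.
    outgoing = optimizer(prompt) if optimizer else prompt

    # Stage 3: forward to the upstream API (injected, so no network here).
    response = forward(model, outgoing)

    # Stage 4: record the result in the ledger.
    entry = LedgerEntry(model, tokens_before, estimate_tokens(outgoing), cache_hit)
    return response, entry
```

Passing `forward` in as a callback keeps the sketch free of network calls. A real proxy would send the request to `POST /v1/messages` at that step.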
What gets analyzed
- Prompt token count (input + system + tools)
- Output token estimate and actual delta
- Cache hit/miss status and prefix fingerprint
- Model selection efficiency (over-provisioning detection)
- Prompt structure waste (redundant instructions, repeated context)
- Per-workspace cost attribution in real time
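Three of the checks listed above are easy to illustrate: the prefix fingerprint, repeated-instruction detection, and per-workspace cost attribution. The sketch below is one plausible way to compute them, not a description of Tokenistt's internals. All function names are hypothetical, the fingerprint is assumed to be a truncated SHA-256 over the system prompt and tool definitions, and the price per million tokens is a caller-supplied parameter rather than a real rate.

```python
import hashlib
import json
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def prefix_fingerprint(system: str, tools: List[dict]) -> str:
    # Hash the stable part of a request so identical prefixes map to the same ID.
    payload = json.dumps({"system": system, "tools": tools}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def repeated_lines(prompt: str) -> List[str]:
    # Flag non-empty lines that occur more than once (case-insensitive).
    counts = Counter(
        line.strip().lower() for line in prompt.splitlines() if line.strip()
    )
    return [line for line, n in counts.items() if n > 1]


def cost_by_workspace(
    usage: Iterable[Tuple[str, int]], usd_per_million_tokens: float
) -> Dict[str, float]:
    # Sum token usage per workspace and convert it to a cost figure.
    totals: Dict[str, float] = defaultdict(float)
    for workspace, tokens in usage:
        totals[workspace] += tokens / 1_000_000 * usd_per_million_tokens
    return dict(totals)
```

Two requests with the same system prompt and tools produce the same fingerprint, which is what makes a hit/miss check possible without storing the prompt text itself.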
What never leaves your machine
Tokenistt does not log prompt content by default. Only metadata (token counts, model IDs, latency, cost) is sent to the analytics backend. To capture full prompts for debugging, explicitly set content_trace: true.
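The metadata-only default can be pictured with a short sketch. Only the `content_trace` flag comes from the text above. The `Config` class, the `build_record` function, and the record field names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Config:
    # Off by default: prompt text is not included in analytics records.
    content_trace: bool = False


def build_record(
    prompt: str,
    model: str,
    input_tokens: int,
    latency_ms: float,
    cost_usd: float,
    config: Config,
) -> Dict[str, Any]:
    # Metadata that is always included (hypothetical field names).
    record: Dict[str, Any] = {
        "model": model,
        "input_tokens": input_tokens,
        "latency_ms": latency_ms,
        "cost_usd": cost_usd,
    }
    # Prompt content is attached only when tracing was explicitly enabled.
    if config.content_trace:
        record["prompt"] = prompt
    return record
```

With the default `Config()`, the returned record has no `prompt` key at all, so there is nothing to redact downstream.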