How it works

Tokenistt sits between your editor and the Anthropic API as a transparent MCP proxy. Every prompt passes through it before being forwarded, without adding perceptible latency to your response time.

Request flow, in order:

–Editor / Host: Claude Desktop · Cursor · VS Code · Windsurf
–Tokenistt MCP: parse · tokenize · score · cache check
–Optimizer (optional): rewrite · shorten · model route
–Anthropic API: POST /v1/messages
–Response + Ledger: token delta · cost entry · webhook
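
The stages above can be pictured as a small function pipeline. The following is a minimal sketch, not Tokenistt's actual implementation: `handle_prompt`, `LedgerEntry`, and `rough_token_count` are hypothetical names, the word-count tokenizer is a crude stand-in for a real one, and "token delta" is assumed here to mean tokens saved by the optional optimizer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LedgerEntry:
    """Hypothetical ledger record; field names are illustrative only."""
    model: str
    tokens_before: int  # count measured in the analysis stage
    tokens_after: int   # count after the optional optimizer ran
    token_delta: int    # assumed meaning: tokens saved by the optimizer


def rough_token_count(text: str) -> int:
    # Crude stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def handle_prompt(
    prompt: str,
    model: str,
    forward: Callable[[str, str], dict],
    optimize: Optional[Callable[[str], str]] = None,
) -> tuple[dict, LedgerEntry]:
    # Analysis stage: measure the prompt as it arrived from the editor.
    tokens_before = rough_token_count(prompt)

    # Optimizer stage (optional): rewrite or shorten the prompt.
    if optimize is not None:
        prompt = optimize(prompt)
    tokens_after = rough_token_count(prompt)

    # Forwarding stage: the caller supplies the function that talks to the API.
    response = forward(prompt, model)

    # Ledger stage: record metadata about the exchange, not the prompt itself.
    entry = LedgerEntry(
        model=model,
        tokens_before=tokens_before,
        tokens_after=tokens_after,
        token_delta=tokens_before - tokens_after,
    )
    return response, entry
```

Passing `forward` in as a parameter keeps the sketch independent of any particular HTTP client; in a real proxy it would wrap the `POST /v1/messages` call.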

What gets analyzed

–Prompt token count (input + system + tools)
–Output token estimate and actual delta
–Cache hit/miss status and prefix fingerprint
–Model selection efficiency (over-provisioning detection)
–Prompt structure waste (redundant instructions, repeated context)
–Per-workspace cost attribution in real time
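
Two of these checks are easy to illustrate. The sketch below shows one plausible way to fingerprint a prompt prefix and to flag repeated instructions. It is an assumption about the general technique, not a description of Tokenistt's internals: `prefix_fingerprint` and `repeated_lines` are hypothetical names, and the fingerprint is a local hash of the system prompt and tool definitions, unrelated to how the API itself decides cache hits.

```python
import hashlib
import json
from collections import Counter


def prefix_fingerprint(system: str, tools: list[dict]) -> str:
    # Serialize canonically so that dictionary key order does not change the hash.
    payload = json.dumps(
        {"system": system, "tools": tools},
        sort_keys=True,
        separators=(",", ":"),
    )
    # A truncated SHA-256 digest is enough to compare prefixes between requests.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def repeated_lines(prompt: str) -> list[str]:
    # Normalize case and whitespace, then report lines that occur more than once.
    normalized = [" ".join(line.lower().split()) for line in prompt.splitlines()]
    counts = Counter(line for line in normalized if line)
    return [line for line, n in counts.items() if n > 1]
```

Two requests with the same fingerprint share an identical system prompt and tool list, which makes them candidates for a cache hit. A non-empty result from `repeated_lines` points at instructions that could be stated once.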

What never leaves your machine

Tokenistt does not log prompt content by default. Only metadata — token counts, model IDs, latency, cost — is sent to the analytics backend. Enable content_trace: true explicitly to capture full prompts for debugging.
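
As an illustration of the metadata-only default, here is a minimal sketch of how an analytics event could be assembled. Only the `content_trace` flag comes from the documentation above; the function name `build_analytics_event` and the field names in `request`, `stats`, and the resulting event are assumptions made for this example.

```python
from typing import Any


def build_analytics_event(
    request: dict[str, Any],
    stats: dict[str, Any],
    content_trace: bool = False,
) -> dict[str, Any]:
    # Metadata only: counts, model ID, latency, and cost (field names are hypothetical).
    event: dict[str, Any] = {
        "model": request["model"],
        "input_tokens": stats["input_tokens"],
        "output_tokens": stats["output_tokens"],
        "latency_ms": stats["latency_ms"],
        "cost_usd": stats["cost_usd"],
    }
    # Prompt content is attached only when tracing was switched on explicitly.
    if content_trace:
        event["messages"] = request["messages"]
    return event
```

Building the event from an explicit list of fields, rather than copying the request and deleting keys, means a newly added request field cannot leak into analytics by accident.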