claude-code · 12 min read

Claude Code Token Metrics: Technical Appendix

Source code verification, data model reference, pricing tables, reproduction scripts, and the full format evolution timeline behind our token metrics investigation.

This is the technical companion to Understanding Claude Code Token Metrics. Everything below is the evidence and methodology behind the main post’s findings.

Data Model Reference

Directory structure

~/.claude/
  projects/
    {encoded-path}/                          # path with / replaced by -
      {session-uuid}.jsonl                   # main conversation log
      {session-uuid}/subagents/
        agent-{id}.jsonl                     # subagent conversations
  sessions/
    {pid}.json                               # pid -> sessionId, cwd, startedAt
  stats-cache.json                           # precomputed daily stats
  history.jsonl                              # input history (no token data)
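The `{encoded-path}` directory name can be derived from the project path. This is a sketch based only on the substitution noted above; the hypothetical helper `encode_project_dir` may not cover every character the real encoding handles:

```python
def encode_project_dir(cwd: str) -> str:
    """Sketch of the observed encoding: '/' becomes '-'.
    (Hypothetical helper; the real rule may encode more characters.)"""
    return cwd.replace("/", "-")

print(encode_project_dir("/home/user/myproject"))  # -home-user-myproject
```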

Message types in session JSONL

| Type | Purpose | Has usage data? |
|---|---|---|
| progress | Hook/plugin lifecycle events | No |
| user | User messages and tool results | No |
| assistant | Model responses (streaming chunks) | Yes |
| system | Turn duration, local commands | No |
| file-history-snapshot | File state for undo/restore | No |

The usage object (on type: "assistant" messages)

{
  "message": {
    "usage": {
      "input_tokens": 241,
      "output_tokens": 168,
      "cache_creation_input_tokens": 492,
      "cache_read_input_tokens": 49336,
      "cache_creation": {
        "ephemeral_5m_input_tokens": 0,
        "ephemeral_1h_input_tokens": 492
      },
      "service_tier": "standard",
      "speed": "standard",
      "inference_geo": "",
      "server_tool_use": {
        "web_search_requests": 0,
        "web_fetch_requests": 0
      }
    },
    "model": "claude-opus-4-6",
    "id": "msg_01...",
    "stop_reason": "end_turn"
  },
  "requestId": "req_...",
  "uuid": "unique-per-streaming-chunk",
  "isSidechain": false,
  "parentUuid": "previous-message-uuid",
  "sessionId": "session-uuid",
  "timestamp": "2026-03-22T...",
  "version": "2.1.76"
}

Field reference for tool builders

| Field | Purpose | Cost relevance |
|---|---|---|
| input_tokens | New context sent to model | Base input price |
| output_tokens | Generated response | ~5x input price |
| cache_creation_input_tokens | Context being cached | 1.25x or 2x input |
| cache_read_input_tokens | Reused cached context | 0.1x input price |
| ephemeral_5m_input_tokens | 5-minute cache creation | 1.25x input |
| ephemeral_1h_input_tokens | 1-hour cache creation | 2x input |
| service_tier | Standard vs enterprise | Different rate limits |
| speed | Standard vs fast | Fast = 6x (Opus 4.6) |
| inference_geo | Where inference ran | US-only = 1.1x |
| server_tool_use | Web search/fetch counts | $10 per 1k searches |
| uuid | Per-streaming-chunk ID | NOT a dedup key |
| requestId | Per-API-request ID | Correct dedup key |
| message.id | Per-message ID | 1:1 with requestId |
| isSidechain | Subagent vs main thread | Separate for attribution |
| stop_reason | null (intermediate) or "end_turn"/"tool_use" (final) | Final chunk has real output_tokens |

Streaming chunk behavior

A single API response writes 2-10+ JSONL lines, one per content block:

Line 1: thinking block   → uuid: "aaa", requestId: "req_123", output_tokens: 9,   stop_reason: null
Line 2: text block       → uuid: "bbb", requestId: "req_123", output_tokens: 10,  stop_reason: null
Line 3: tool_use block   → uuid: "ccc", requestId: "req_123", output_tokens: 269, stop_reason: "tool_use"
  • uuid is unique per line — deduplicating by uuid treats each chunk as a separate response
  • requestId is shared — deduplicating by requestId correctly groups them as one response
  • Input tokens and cache tokens are consistent across chunks
  • Output tokens differ: intermediate chunks have placeholder values (~1-11), only the final chunk (with stop_reason != null) has the real total
  • jq unique_by keeps the first occurrence, so it gets the placeholder output_tokens

Correct approach: group by requestId, keep the entry with stop_reason != null.
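Applied to the three chunks above, the difference is easy to demonstrate. This is a minimal sketch with a hypothetical `chunks` list mirroring the example lines:

```python
chunks = [
    {"uuid": "aaa", "requestId": "req_123", "output_tokens": 9,   "stop_reason": None},
    {"uuid": "bbb", "requestId": "req_123", "output_tokens": 10,  "stop_reason": None},
    {"uuid": "ccc", "requestId": "req_123", "output_tokens": 269, "stop_reason": "tool_use"},
]

# uuid dedup: every chunk has a unique uuid, so all three are counted.
uuid_total = sum(c["output_tokens"] for c in {c["uuid"]: c for c in chunks}.values())

# requestId dedup, keeping the chunk with a non-null stop_reason.
final = {}
for c in chunks:
    if c["requestId"] not in final or c["stop_reason"] is not None:
        final[c["requestId"]] = c
request_total = sum(c["output_tokens"] for c in final.values())

print(uuid_total, request_total)  # 288 vs 269 (the real total)
```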

Dataset statistics

| Metric | Value |
|---|---|
| Active days | 77 (of 77 calendar days with data) |
| Sessions | 298 |
| Projects | 10 |
| Main session JSONL files | 169 |
| Subagent JSONL files | 1,168 |
| Total JSONL files | 1,337 |
| Raw assistant JSONL lines | 87,684 |
| Unique API requests (requestId dedup) | 30,746 |
| Streaming chunk ratio | 2.85x |
| Models used | Opus 4.6, Haiku 4.5, Sonnet 4.6, Sonnet 4.5, Opus 4.5 |

Source Code Verification

ccusage v18.0.10 (TypeScript, 11.8k stars)

Dedup implementation: apps/ccusage/src/data-loader.ts, line 530:

export function createUniqueHash(data: UsageData): string | null {
    const messageId = data.message.id;
    const requestId = data.requestId;
    if (messageId == null || requestId == null) {
        return null;
    }
    return `${messageId}:${requestId}`;
}

A Set<string> called processedHashes tracks seen combinations. Entries with null message.id or requestId are never deduplicated (always counted).

| Behavior | Status | Impact |
|---|---|---|
| Dedup key: message.id:requestId | Correct | Prevents streaming chunk double-counting |
| First-seen-wins (keeps first chunk) | Wrong | Undercounts output tokens ~5x (#888, open) |
| Entries with null identifiers | Not deduped | Older JSONL entries always counted |
| Scans subagent directories | Yes (via `**/*.jsonl`) | Includes subagent usage |
| Filters isSidechain | No | Schema doesn't parse this field |
| 5m vs 1h cache writes | Not distinguished | ~19% cost underestimate (#899, open) |
| Fast mode pricing (6x) | Yes | Added March 2026 |
| Tiered pricing (>200k) | Yes | Supports above-200k rates |
| Session-level dedup | No | loadSessionUsageById has no dedup |

claudelytics v0.5.2 (Rust, 70 stars)

Token aggregation: src/models.rs, line 57:

pub fn total_tokens(&self) -> u64 {
    self.input_tokens + self.output_tokens +
    self.cache_creation_tokens + self.cache_read_tokens
}

| Behavior | Status | Impact |
|---|---|---|
| Dedup | None | UsageRecord doesn't parse uuid, requestId, or message.id |
| "Dedup" keyword search | Zero hits in Rust source | Confirmed structurally impossible |
| Total tokens formula | input + output + cache_read + cache_creation | Cache reads inflate number by orders of magnitude |
| Scans subagent directories | Yes (WalkDir, no depth limit) | Includes all nested files |
| Streaming chunk handling | None | Every JSONL line with usage gets counted |
| Type filtering | None | Schema doesn't include type field |
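The inflation from that formula is easy to quantify using the example usage object from the data model section above:

```python
# Usage values from the example assistant message earlier in this appendix.
usage = {
    "input_tokens": 241,
    "output_tokens": 168,
    "cache_creation_input_tokens": 492,
    "cache_read_input_tokens": 49336,
}

claudelytics_total = sum(usage.values())                 # all four fields summed
in_out = usage["input_tokens"] + usage["output_tokens"]  # what /stats counts

print(claudelytics_total, in_out)  # 50237 vs 409, roughly 123x inflation
```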

/stats (Claude Code built-in)

Verified from ~/.claude/stats-cache.json:

Total tokens = inputTokens + outputTokens

| Field | Value | In "Total tokens"? |
|---|---|---|
| inputTokens | 1,082,937 | Yes |
| outputTokens | 8,279,640 | Yes |
| Sum | 9,362,577 ≈ 9.4M | Yes (exact match) |
| cacheReadInputTokens | 5,046,513,967 | No |
| cacheCreationInputTokens | 234,307,960 | No |

stats-cache.json also contains dailyModelTokens with per-day, per-model breakdowns. Summing all tokensByModel entries = 9,362,577 — confirming the formula.
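The arithmetic is checkable directly from the table values:

```python
# Figures from ~/.claude/stats-cache.json as reported above.
input_tokens  = 1_082_937
output_tokens = 8_279_640
cache_read    = 5_046_513_967
cache_write   = 234_307_960

stats_total = input_tokens + output_tokens
print(f"{stats_total:,}")  # 9,362,577 (matches /stats exactly)

# Cache traffic excluded from the headline number:
print(f"{cache_read + cache_write:,} tokens never shown by /stats")
```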

ccost v0.2.0 (Rust, 6 stars, abandoned)

| Behavior | Status |
|---|---|
| Dedup key: message.id + requestId | Correct (hash-prefixed: "req:", "session:") |
| Fallback: message.id + sessionId | Yes (when requestId absent) |
| Which chunk kept | Unknown (predates thinking blocks) |
| 5m vs 1h cache writes | Not distinguished |
| Subagent scanning | No |
| isSidechain filtering | No |
| Last commit | June 21, 2025 |

Dedup Strategy Comparison

All strategies applied to the same dataset (1,337 files, 87,684 raw assistant lines):

| Strategy | Unique entries | Input tokens | Output tokens | In+Out |
|---|---|---|---|---|
| No dedup (raw lines) | 87,684 | n/a | n/a | massively inflated |
| uuid (jq script) | 34,806 | 564,560 | 5,656,160 | 6,220,720 |
| requestId (correct) | 17,698 | 175,022 | 3,731,324 | 3,906,346 |
| message.id | 17,737 | 175,022 | 3,731,324 | 3,906,346 |

Note: requestId and message.id produce identical results — they are 1:1 across all sampled data.

With subagent/sidechain separation (requestId dedup)

| Category | Requests | Input | Output | In+Out |
|---|---|---|---|---|
| All | 30,746 | 1,137,873 | 6,120,448 | 7,258,321 |
| Main thread only | 14,555 | 135,435 | 2,353,601 | 2,489,036 |
| Subagent only | 16,191 | 1,002,438 | 3,766,847 | 4,769,285 |

Per-model breakdown (requestId dedup, all files)

| Model | Requests | Input | Output | Cache Read | Cache Write |
|---|---|---|---|---|---|
| Opus 4.6 | 23,078 | 508,793 | 5,045,798 | 2,304,573,999 | 83,429,791 |
| Haiku 4.5 | 7,134 | 608,866 | 1,060,818 | 402,681,376 | 32,470,119 |
| Sonnet 4.6 | 534 | 20,214 | 13,832 | 23,919,789 | 2,982,040 |

Pricing Reference

Source: Anthropic Pricing (verified 2026-03-22)

Per-model rates ($ per million tokens)

| Model | Input | 5m Cache Write | 1h Cache Write | Cache Read | Output |
|---|---|---|---|---|---|
| Opus 4.6 | $5.00 | $6.25 | $10.00 | $0.50 | $25.00 |
| Opus 4.5 | $5.00 | $6.25 | $10.00 | $0.50 | $25.00 |
| Sonnet 4.6 | $3.00 | $3.75 | $6.00 | $0.30 | $15.00 |
| Sonnet 4.5 | $3.00 | $3.75 | $6.00 | $0.30 | $15.00 |
| Haiku 4.5 | $1.00 | $1.25 | $2.00 | $0.10 | $5.00 |

Pricing multipliers

| Factor | Multiplier | Notes |
|---|---|---|
| 5-minute cache write | 1.25x base input | Default cache tier |
| 1-hour cache write | 2x base input | Extended cache |
| Cache read (hit) | 0.1x base input | Massive discount |
| Fast mode (Opus 4.6) | 6x all rates | Beta |
| Data residency (US-only) | 1.1x all rates | Opus 4.6+ |
| Long context (>200k input) | 2x input, 1.5x output | Sonnet 4.5/4 only |
| Batch API | 0.5x all rates | Async processing |
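Putting the per-model rates and the 5m/1h cache split together, a cache-aware cost function looks roughly like this. It is a sketch, not a billing reference: the rate table is copied from above (two models shown), the usage shape comes from the data model section, and fast mode, geo, batch, and long-context multipliers are deliberately ignored:

```python
# $ per million tokens, from the pricing table above (subset of models).
RATES = {
    "claude-opus-4-6":  {"input": 5.00, "cache_5m": 6.25, "cache_1h": 10.00,
                         "cache_read": 0.50, "output": 25.00},
    "claude-haiku-4-5": {"input": 1.00, "cache_5m": 1.25, "cache_1h": 2.00,
                         "cache_read": 0.10, "output": 5.00},
}

def cost_usd(model: str, usage: dict) -> float:
    """Cache-aware cost of one usage object (ignores fast mode, geo, batch)."""
    r = RATES[model]
    cc = usage.get("cache_creation") or {}
    tokens_times_rate = (
        usage.get("input_tokens", 0)              * r["input"]
        + cc.get("ephemeral_5m_input_tokens", 0)  * r["cache_5m"]
        + cc.get("ephemeral_1h_input_tokens", 0)  * r["cache_1h"]
        + usage.get("cache_read_input_tokens", 0) * r["cache_read"]
        + usage.get("output_tokens", 0)           * r["output"]
    )
    return tokens_times_rate / 1_000_000

# The example usage object from the data model section:
usage = {"input_tokens": 241, "output_tokens": 168,
         "cache_read_input_tokens": 49336,
         "cache_creation": {"ephemeral_5m_input_tokens": 0,
                            "ephemeral_1h_input_tokens": 492}}
print(round(cost_usd("claude-opus-4-6", usage), 6))  # 0.034993
```

Note that the cache-read line dominates the token count (49,336 of 50,237 tokens) but contributes well under half the dollar cost, which is exactly why naive all-at-input-rate estimates overshoot.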

Cost analysis from our dataset

| Model | Real cost (cache-aware) | Naive cost (all @ input rate) |
|---|---|---|
| Opus 4.6 | $2,059 | $11,954 |
| Haiku 4.5 | $96 | $437 |
| Sonnet 4.6 | $19 | $70 |
| Others | $10 | $25 |
| Total | $2,184 | $12,487 (5.7x) |

Worst case (all tokens at output rate): $62,433 (29x overcounting).


Format Evolution Timeline

Phase 1: The Simple Era (pre-June 2025)

JSONL files had a costUSD field. One line = one message = one cost.

Phase 2: costUSD Removed (June 2025, v1.0.9)

ccusage #4: Tools pivoted to token-based cost calculation via LiteLLM.

Phase 3: Thinking Blocks (mid-2025)

One response became 3-6+ JSONL lines (thinking, text, tool_use). Each has unique uuid, shared requestId. Counting broke silently.

Phase 4: Subagents (mid-2025)

Agent tool introduced, writing to subagents/ directories. ccusage #313: tools didn’t scan these. Users hit limits while tools showed plenty of headroom.

Phase 5: Output Token Bug Surfaced (Feb 2026)

Claude Code #22686: intermediate chunks have placeholder output_tokens. Known since #10259 (mid-2025).

Phase 6: Cache Pricing Split (2025-2026)

Two tiers introduced: 5-minute (1.25x) and 1-hour (2x). JSONL includes both. No tool distinguishes them (ccusage #899).


Verification Results

Each claim was independently verified by a separate analysis agent against raw data:

| Claim | Verdict | Method |
|---|---|---|
| Streaming chunks share requestId | PASS | 3 session files, 2-6 chunks per requestId |
| Intermediate chunks have placeholder output_tokens | PASS | ~8-11 on non-final vs real values on final |
| uuid overcounts vs requestId (2-6x) | PASS | 6x overcounting demonstrated; 2.85x corpus average |
| requestId and message.id are 1:1 | PASS | All sampled data, no exceptions |
| Cache read = ~95.8% of total | PLAUSIBLE | 0% on first turn, 98% on subsequent; aggregate plausible |
| unique_by(.uuid) is wrong | PASS | 6x overcounting + wrong output_tokens |
| isSidechain clean split | PASS | 11 files (6 subagent, 5 main), no exceptions |
| /stats = input + output only | PASS | stats-cache.json math matches exactly |
| ccusage first-seen-wins bug | PASS | Source code confirmed, matches #888 |
| claudelytics zero dedup | PASS | Source code: no HashSet, no dedup logic |

Reproduction Scripts

Correct dedup count (Python)

import json, glob, os

files = glob.glob(os.path.expanduser('~/.claude/projects/**/*.jsonl'), recursive=True)
by_request = {}
for f in files:
    with open(f) as fh:
        for line in fh:
            try:
                d = json.loads(line)
            except json.JSONDecodeError:
                continue
            if d.get('type') != 'assistant':
                continue
            usage = d.get('message', {}).get('usage', {})
            rid = d.get('requestId', '')
            stop = d.get('message', {}).get('stop_reason')
            # One entry per requestId; prefer the final chunk
            # (stop_reason != null), which has the real output_tokens.
            if rid and (rid not in by_request or stop):
                by_request[rid] = usage

print(f"Unique requests: {len(by_request)}")
print(f"Input: {sum(u.get('input_tokens',0) for u in by_request.values()):,}")
print(f"Output: {sum(u.get('output_tokens',0) for u in by_request.values()):,}")
print(f"Cache read: {sum(u.get('cache_read_input_tokens',0) for u in by_request.values()):,}")
print(f"Cache write: {sum(u.get('cache_creation_input_tokens',0) for u in by_request.values()):,}")

Incorrect jq approach (for comparison)

cat ~/.claude/projects/*/*.jsonl |
jq -R 'fromjson? | .. | objects | select(.usage)
  | {uuid: (.uuid // .id), in: (.usage.input_tokens // 0), out: (.usage.output_tokens // 0)}' |
jq -s 'unique_by(.uuid) | {
  total_input: (map(.in) | add // 0),
  total_output: (map(.out) | add // 0),
  total_combined: ((map(.in) | add // 0) + (map(.out) | add // 0))
}'

(As originally circulated, the script also had backslash line continuations inside the single-quoted jq programs, which are syntax errors in jq; they are removed here so the comparison actually runs.)

Problems: overcounts ~1.6x (uuid is per-chunk), misses subagents (no recursive glob), gets wrong output_tokens (keeps first/placeholder value).


Remaining Open Questions

  • How do compacted sessions (/compact, /clear) affect JSONL structure and dedup?
  • What are <synthetic> model messages? (65 found in dataset)
  • How does context window overflow/truncation affect logged usage data?
  • Does Anthropic’s internal billing match the JSONL usage objects exactly?
