A GitHub issue documenting a controversial change to Claude Code's prompt caching behavior drew 468 points and 361 comments on Hacker News on April 12, 2026. According to the issue, Anthropic switched the default prompt cache time-to-live (TTL) from 1 hour to 5 minutes on March 6, 2026, sharply raising costs for developers whose cache now expired whenever a session paused for more than 5 minutes.
One developer analyzed 119,866 API calls across two machines and documented $949.08 in total overpayment representing 17.1% overall waste. The impact varied by month: January saw $41.45 overpayment (52.5% waste), February only $12.32 (1.1% waste) during the 1-hour TTL period, March spiked to $719.09 overpayment (25.9% waste), and April recorded $176.23 overpayment (14.8% waste). Subscription users hit their 5-hour quota limits for the first time in March.
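As a quick check on the figures above, the monthly overpayments can be summed and each month's share of the total computed. This is only a sketch over the four numbers quoted in the text; the variable names are illustrative and nothing here comes from the developer's own analysis scripts.

```python
# Monthly overpayment figures (USD) as quoted in the text above.
overpayment = {
    "January": 41.45,
    "February": 12.32,
    "March": 719.09,
    "April": 176.23,
}

total = round(sum(overpayment.values()), 2)

# Share of the total overpayment attributable to each month, in percent.
share = {month: round(100 * usd / total, 1) for month, usd in overpayment.items()}

print(f"sum of monthly figures: ${total:.2f}")
for month, pct in share.items():
    print(f"{month:>8}: ${overpayment[month]:>7.2f}  ({pct}% of total)")
```

The monthly figures sum to $949.09, one cent above the reported $949.08 total, which is consistent with rounding in the per-month values. March alone accounts for roughly three quarters of the documented overpayment.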
Token Distribution Reveals Clear Behavior Shift
The token data shows a sharp shift in cache-write distribution starting on March 6. On February 28, 2026, usage showed 0.00M 5-minute-create tokens and 16.15M 1-hour-create tokens (100% 1-hour). The last clean day, March 5, showed 0.00M 5-minute-create and 6.55M 1-hour-create tokens. On March 6, 5-minute tokens reappeared, with 0.29M 5-minute-create and 0.22M 1-hour-create. By March 8, the split had reached 16.86M 5-minute-create versus 3.44M 1-hour-create (83% 5-minute), and by March 21 it was 21.37M 5-minute-create versus 1.70M 1-hour-create (93% 5-minute).
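The percentages above follow directly from the daily token counts. The sketch below recomputes the 5-minute share for each quoted day; the dictionary layout and function name are illustrative assumptions, not the format of the original usage logs.

```python
# (5-minute-create, 1-hour-create) cache-write tokens in millions, from the text.
daily_tokens_m = {
    "2026-02-28": (0.00, 16.15),
    "2026-03-05": (0.00, 6.55),
    "2026-03-06": (0.29, 0.22),
    "2026-03-08": (16.86, 3.44),
    "2026-03-21": (21.37, 1.70),
}


def five_minute_share(five_min: float, one_hour: float) -> float:
    """Percentage of cache-creation tokens written with the 5-minute TTL."""
    total = five_min + one_hour
    return 100 * five_min / total if total else 0.0


shares = {day: five_minute_share(*tokens) for day, tokens in daily_tokens_m.items()}
for day, pct in shares.items():
    print(f"{day}: {pct:5.1f}% 5-minute")
```

On these inputs the share goes from 0% on February 28 and March 5 to about 83% on March 8 and about 93% on March 21, matching the figures in the text.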
With a 5-minute TTL, the cache expires whenever a session pauses for more than 5 minutes, so the next request must re-upload the context as an expensive 'cache_creation' write rather than a cheap 'cache_read', a 12.5× price difference. Developers complained about paying an "API tax just to stop and think."
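The 12.5× figure can be reproduced from price multipliers. The sketch below uses the 1.25× write multiplier for 5-minute caching quoted later in this article and a 0.1× read multiplier, which is what the 12.5× ratio implies. The 200k-token context and the $3.00 per million tokens base price are hypothetical values chosen only for illustration.

```python
# Multipliers relative to the base input-token price.
CACHE_WRITE_5M = 1.25  # 5-minute cache write, as quoted in the article
CACHE_READ = 0.10      # cache read; the value implied by the 12.5x ratio


def resume_cost(context_tokens: int, base_price_per_mtok: float, cache_alive: bool) -> float:
    """USD cost of re-sending a previously cached context on the next request."""
    multiplier = CACHE_READ if cache_alive else CACHE_WRITE_5M
    return context_tokens / 1_000_000 * base_price_per_mtok * multiplier


# Hypothetical session: 200k tokens of context at $3.00 per million input tokens.
warm = resume_cost(200_000, 3.00, cache_alive=True)   # resumed before the cache expired
cold = resume_cost(200_000, 3.00, cache_alive=False)  # resumed after the cache expired
print(f"warm: ${warm:.3f}  cold: ${cold:.3f}  ratio: {cold / warm:.1f}x")
```

Under these assumptions a resume within the TTL costs $0.06 and a resume after expiry costs $0.75. The ratio is 12.5 regardless of context size or base price, since both cancel out.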
Anthropic Disputes Regression Framing
Anthropic's Jarred Sumner disputed the 'regression' framing, stating "The March 6 change makes Claude Code cheaper, not more expensive." The company said the change was intentional optimization work, noting that a 1-hour cache write costs approximately 2× base input versus 1.25× for a 5-minute write. Anthropic argued that one-shot requests would be more expensive under 1-hour caching, and said that by design there is no single global default: the client picks a TTL per request based on expected cache-reuse patterns.
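Both sides of this argument can be seen in a toy cost model. The sketch below is a simplification under stated assumptions, not Anthropic's billing logic: the context size is fixed, each request refreshes the TTL, a 0.1× read multiplier is assumed alongside the 2× and 1.25× write multipliers quoted above, and output tokens are ignored. The function name and scenarios are hypothetical.

```python
WRITE_5M, WRITE_1H, READ = 1.25, 2.0, 0.10  # multiples of the base input price


def session_cost(ttl_minutes: int, gaps_minutes: list[float]) -> float:
    """Cost, in multiples of one base-price pass over the context, of a first
    request followed by one follow-up request after each gap."""
    if ttl_minutes not in (5, 60):
        raise ValueError("this sketch only models the 5-minute and 1-hour TTLs")
    write = WRITE_1H if ttl_minutes == 60 else WRITE_5M
    cost = write  # the first request always writes the cache
    for gap in gaps_minutes:
        # A follow-up reads the cache if it is still alive, otherwise rewrites it.
        cost += READ if gap < ttl_minutes else write
    return cost


scenarios = {
    "one-shot request": [],
    "rapid follow-ups": [1, 2, 1],
    "pauses to think": [10, 20],
}
for name, gaps in scenarios.items():
    print(f"{name:>17}: 5m={session_cost(5, gaps):.2f}  1h={session_cost(60, gaps):.2f}")
```

In this model the 5-minute TTL is cheaper for one-shot requests (1.25 versus 2.00) and for rapid follow-ups (1.55 versus 2.30), while the 1-hour TTL is cheaper once the session includes pauses longer than 5 minutes (2.20 versus 3.75). Which setting lowers total cost therefore depends on the request mix, which is the point in dispute.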
The company also noted that a client-side bug in v2.1.90 could trap sessions on the 5-minute TTL even after quota exhaustion, and said the bug has been fixed. Anthropic maintained that "1-hour everywhere would increase total cost given the request mix." The core dispute remains unresolved: whether the stability of the February 1-hour period across two machines for 33 days is evidence that the 1-hour TTL was intended rather than accidental.
Key Takeaways
- Anthropic changed Claude Code's cache TTL from 1-hour to 5-minute on March 6, 2026, causing developer costs to spike
- One developer documented $949.08 in overpayment across 119,866 API calls, with March seeing $719.09 in waste (25.9%)
- The 5-minute TTL creates a 12.5× price penalty when the cache expires after a pause of more than 5 minutes
- Token distribution data shows a clear shift from 100% 1-hour tokens on Feb 28 to 93% 5-minute tokens by March 21
- Anthropic claims the change was intentional optimization that lowers total cost, while developers report it as a silent cost increase