kwj.ai · acquisition inquiries from >$999view prospectus →
The Domesday Book ofKWJ · AI

Field identification

Prompt caching, the 90%-discount most operators don't use

Anthropic, OpenAI, and Gemini all offer cached-prefix discounts. The architectures that take advantage of them look different from the ones that don't.

By C.W. Jameson · Published 19 May 2026 · Last reviewed 19 May 2026

Prompt caching converts a recurring prefix into a 90%-discounted read for a short window. Operators who structure their prompts around it pay 5-10× less than operators who don't. The architectural pattern is: put the static parts of the prompt at the top, the dynamic parts at the bottom, and never put a fresh timestamp anywhere near the cache breakpoint.

Anthropic cache

Mark up to four cache breakpoints in a prompt. The longest prefix that hits a breakpoint is cached for 5 minutes. Subsequent reads of that prefix bill at 10% of input rate. Cache writes are 25% above input rate, so the breakeven is approximately 2-3 reads before the cache pays for itself.

OpenAI cache

Automatic, no explicit breakpoints. Cache TTL is shorter (typically 5-10 minutes). Discount is similar (~50%, varies by model).

Architecture

System prompts, tool definitions, and document context belong above the cache breakpoint. User turn and dynamic state belong below. A common mistake: putting the current timestamp at the top of the prompt for 'context', which invalidates every cache.

Tells

MarkerMeaning
Cache hit-rate visible in API response is below 30% on a long-running agentPrefix is fluctuating; restructure to stabilise it.
Bill stays high across agent turns despite identical-looking promptsCache misses; usually a timestamp or randomised field at the top.

Frequently asked

Does caching survive a model switch?

No; cache is per-model.

What about cross-provider caching?

Doesn't exist. Each provider's cache is local.

From the Almanac shop

Model Tells — Flashcard Deck

Identify any frontier model from a paragraph of output. 60 cards.

$14Coming soon

All identification topics