Volume IV · The Guides

Operator guides

Written for the engineer or analyst who has to choose, deploy, debug, or budget a model. Each piece is written from the perspective of someone who has actually done the work.

12 min

How to pick an LLM for your workload

A decision tree that takes thirty minutes to walk and saves six months of switching costs.

Selection

9 min

Architecting prompts for 10× cost reduction with caching

Anthropic's prompt cache cuts the price of cached input tokens by 90% on a cache hit. The architecture that earns those hits has to be deliberate.

Cost

14 min

Building an agent loop you can read in a single sitting

Every operator eventually writes their own harness. Here is the shape of a maintainable one.

Engineering

8 min

Model routing: running cheap when you can, expensive when you must

Routing can turn a $20/day workload into a $4/day workload without losing capability.

Cost

10 min

Evaluating an agent the way operators actually do

Capability benchmarks measure capability. Operators want to measure deployability. The two are different.

Operations

11 min

Self-hosting open-weights models: when it pays and when it doesn't

Self-hosting Llama or Qwen makes sense at the scale where you stop counting dollars and start counting hours.

Operations

9 min

Prompt engineering vs fine-tuning: the breakeven

Most prompt-engineering problems are not fine-tuning problems. The reverse is also true.

Engineering

7 min

Structured outputs and the JSON-adherence problem

Getting reliable JSON out of an LLM is now table stakes. The pitfalls are real.

Engineering

8 min

Computer-use agents: current reliability and where to use them

Anthropic's Computer Use and OpenAI's Operator both work. Both are slower and less reliable than purpose-built tools. Pick deliberately.

Agents

10 min

Claude Code vs Cursor: an operator's comparison

Two coding agents, two philosophies. Pick by the kind of session, not the kind of operator.

Tools
