Volume I · The Directory
Agents and models, in current use
Every entry below is something an operator might reach for in 2026. Each profile carries pricing, context, license, and the linguistic fingerprints by which the model can be identified from its output alone.
Frontier models
Claude Opus 4.7
Anthropic · XAnthropic's flagship at the time of this entry: extended thinking on, 1M-token context, the most patient of the frontier models.
Claude Sonnet 4.6
Anthropic · XThe workhorse. Faster than Opus, smarter than Haiku, the model most production deployments end up on.
Claude Haiku 4.5
Anthropic · XThe fast one. Used for sub-agents, high-volume classification, and the parts of an agent loop where capability matters less than throughput.
GPT-5
OpenAI · IXOpenAI's reasoning-first flagship. Native chain-of-thought, three reasoning-effort tiers, the highest published benchmark scores at release.
GPT-4.5
OpenAI · VIIThe last non-reasoning OpenAI flagship. Better at vibes than reasoning. Quietly deprecated in favor of GPT-5.
Gemini 2.5 Pro
Google DeepMind · IXGoogle's reasoning flagship. Two-million-token context, native multimodal, the only frontier model that reads PDFs without an extraction pre-pass.
Mistral Large 3
Mistral · XThe European frontier model. Codes well, refuses politely, and keeps your data in the EU.
Grok 4
xAI · IXElon's reasoning flagship. Native Twitter/X integration, willing to discuss what other models won't.
OpenAI o3
OpenAI · IXOpenAI's peak reasoning model before GPT-5. AIME, ARC-AGI, and SWE-bench records at release.
Claude 3.5 Sonnet
Anthropic · VIIThe model that made Claude the coding default. Coding benchmarks beat GPT-4o at launch.
Gemini Flash 2.0
Google DeepMind · XGoogle's workhorse: fast, cheap, multimodal. The Sonnet of the Gemini family.
Command R+
Cohere · VIIEnterprise RAG specialist. Built for retrieval pipelines, not chat.
Claude 2
Anthropic · IVAnthropic's first publicly competitive release. 100K context when GPT-4 had 8K.
Gemini 1.5 Pro
Google DeepMind · VIIThe model that proved 1M-token context was practical. Video and audio native.
Claude 3 Opus
Anthropic · VIIThe model that finally made the 'Claude is better than GPT-4' argument defensible.
Gemini Ultra 1.0
Google DeepMind · VIIGoogle's first frontier-class model, released to counter GPT-4 and Claude 3.
GPT-4o
OpenAI · VThe omni model. Text, image, audio natively in one system. Speed doubled vs. GPT-4.
GPT-4 Turbo
OpenAI · VII128K context, knowledge cutoff updated, cost dropped 3×. The model that made GPT-4 accessible.
Open-weights models
DeepSeek R1
DeepSeek · IXThe open-weights reasoning model that printed an industry shockwave. Trained at a fraction of frontier-lab costs.
Llama 4
Meta · XMeta's fourth Llama generation. Three sizes, all-MoE, the open-weights default for serious self-hosters.
Qwen 3
Alibaba · XAlibaba's flagship open-weights family. Best non-English capability among open models, strong on coding.
Mistral 7B
Mistral · IVThe model that proved 7B parameters could beat 13B models. Reshaped expectations for small open models.
Phi-4
Microsoft · IXMicrosoft's small-but-capable reasoning model. Punches above its 14B parameter count.
Llama 3.1 405B
Meta · VIIThe model that first made the open-weights vs. closed-source comparison uncomfortable for the labs.
Coding agents
Claude Code
Anthropic · VIIIAnthropic's official CLI coding agent. Long-horizon refactors, tool-use harness, MCP server support.
Cursor
Anysphere · VIIIThe IDE-shaped coding agent. Tab completion, agent mode, the first $100M ARR coding tool.
Aider
Paul Gauthier / open source · VIIITerminal-native git-aware coding agent. Bring your own API key. Read the source.
Cline
Saoud Rizwan / open source · VIIIThe VS Code extension that does the work in your editor — read, write, run, browse — with permission gates.
GitHub Copilot
GitHub / Microsoft · VThe original. Tab-completion that started the coding-agent market.
Codestral
Mistral · VIIIMistral's coding specialist. 80+ programming languages, fill-in-the-middle, low latency.
General-purpose agent frameworks
Devin
Cognition Labs · VIIIThe first 'fully autonomous software engineer' marketed at scale. Browser + shell + planning loop, fully cloud-hosted.
Anthropic SDK / Managed Agents
Anthropic · XBuilding blocks for custom agents — tool use, prompt caching, batch API, extended thinking, memory.
OpenAI Assistants API
OpenAI · VOpenAI's hosted agent runtime. Threads, tools, file search, code interpreter — managed for you.
LangChain
LangChain, Inc. · VThe framework that defined agent chaining before 'agent' was a product category.
OpenAI Swarm
OpenAI · XOpenAI's educational multi-agent framework. Minimal, opinionated, not production-hardened.
Model Context Protocol (MCP)
Anthropic · XThe open protocol that standardised tool connections for LLM agents. Now an industry standard.
Vertex AI Agent Builder
Google Cloud · XGoogle Cloud's enterprise agent platform. RAG + Gemini + enterprise controls in one product.
Amazon Bedrock
AWS · VAWS's model marketplace. Run Claude, Llama, Titan — whatever model you need — inside your VPC.
Computer-use agents
Specialty agents
Perplexity
Perplexity AI · VSearch-first answer engine. Citations on every claim, model-agnostic, the first credible Google alternative for fact retrieval.
Stable Diffusion 3
Stability AI · VIOpen-weights image generation flagship. Text rendering finally works.
Midjourney v7
Midjourney · VIThe image generation model that made AI art a creative category. Photorealism and style control best-in-class.
Whisper
OpenAI · VOpen-weights speech recognition. The transcription backbone for most AI pipelines.
Google NotebookLM
Google · XAI notebook with source grounding. Generates audio summaries. No hallucination outside uploaded sources.
Together AI
Together AI · IXInference-as-a-service for open-weights models. Fastest Llama, DeepSeek, and Mixtral access.