Architecture
Retrieval-Augmented Generation Pipeline
Connecting LLMs to enterprise knowledge bases for grounded, citeable answers.
Recommended agents
AnthropicCohereOpenAI
Claude Sonnet 4.6
The workhorse. Faster than Opus, smarter than Haiku, the model most production deployments end up on.
Command R+
Enterprise RAG specialist. Built for retrieval pipelines, not chat.
OpenAI Assistants API
OpenAI's hosted agent runtime. Threads, tools, file search, code interpreter — managed for you.
Budget pick
Command R+
$2.50-$10 per million tokens
Key considerations
- ›Embedding model selection
- ›Chunk size
- ›Reranker quality