Together AI
infrastructureUS · Founded 2022
Fastest inference for open-weights models. OpenAI-compatible API.
Together AI provides optimised inference infrastructure for open-weights models. The company maintains GPU clusters tuned for the most-demanded open models and consistently achieves lower latency than self-hosting on equivalent hardware. The OpenAI-compatible API makes migration from direct API access straightforward.
Main models
- Llama (via Together)
- Qwen (via Together)
- DeepSeek (via Together)
Strengths
- Speed
- Model variety
- OpenAI-compatible API
- No proprietary lock-in
Pricing
Per-token; varies by model ($0.20–$5 per million tokens typical)