OpenAI · The Tool-Use Inflection
Whisper
Open-weights speech recognition. The transcription backbone for most AI pipelines.
By C.W. Jameson · Published 19 May 2026 · Last reviewed 19 May 2026
Whisper was released in September 2022 and immediately became the go-to speech recognition model for developers. OpenAI released weights across five sizes; the large-v2 model achieved near-human transcription accuracy on most English audio. The model has remained the transcription backbone for most AI pipelines that require speech input, even as newer alternatives have emerged.
Field signature
Timestamps at word level in verbose mode.
Specifications
| Released | 2022-09-21 |
|---|---|
| Context window | 30 seconds of audio per chunk |
| Pricing | $0.006 per minute (API); free self-hosted |
| Modalities | audio-to-text |
| License | MIT |
| Era | The Tool-Use Inflection |
Strengths
- Accuracy
- Open weights
- Multilingual
Weaknesses
- 30-second chunking required
- Real-time latency
Authentication markers
The fingerprints by which Whisper can be identified from its output alone.
| Tell | Meaning |
|---|---|
| Word-level timestamps in verbose output. | Whisper or fine-tune. |
Notable works
- Transcription backbone for most production AI voice pipelines
Market position
$0.006/minute via API; free self-hosted
Partner offer
OpenAI's API surface remains the broadest commercial offering.
Try OpenAI →Affiliate link — see disclosure.
Primary sources
- [1] OpenAI: Whisper
From the Almanac shop
The Operator's Compendium
Every agent harness, every routing pattern, every cost trick. 90-page PDF.
$29 — Coming soon