The Domesday Book of KWJ · AI

OpenAI · The Tool-Use Inflection

Whisper

Open-weights speech recognition. The transcription backbone for most AI pipelines.

By C.W. Jameson · Published 19 May 2026 · Last reviewed 19 May 2026

Whisper was released on 21 September 2022 and was quickly adopted by developers as a default speech recognition model. OpenAI released weights in five sizes (tiny, base, small, medium, and large); the large-v2 checkpoint, which followed in December 2022, achieved near-human transcription accuracy on most English audio. Whisper has remained the transcription backbone for most AI pipelines that require speech input, even as newer alternatives have emerged.

Field signature

Word-level timestamps in verbose output.
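As an illustration of what that signature looks like in practice, the sketch below flattens word-level timestamps out of a verbose transcription result. It assumes the result shape produced by the open-source `whisper` Python package when called with `word_timestamps=True` (a `segments` list whose entries carry a `words` list with `word`, `start`, and `end` keys). The `flatten_words` helper and the sample dictionary are hypothetical and hand-written for illustration; the sample is not real model output.

```python
from typing import Any


def flatten_words(result: dict[str, Any]) -> list[tuple[str, float, float]]:
    """Collect (word, start, end) triples from a verbose transcription result."""
    words = []
    for segment in result.get("segments", []):
        # Segments without a "words" key carry only segment-level timestamps,
        # so they contribute nothing here.
        for w in segment.get("words", []):
            words.append((w["word"].strip(), float(w["start"]), float(w["end"])))
    return words


# Hand-written sample in the assumed shape; not real model output.
sample = {
    "text": " Hello world.",
    "segments": [
        {
            "start": 0.0,
            "end": 1.2,
            "text": " Hello world.",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.5},
                {"word": " world.", "start": 0.6, "end": 1.2},
            ],
        },
    ],
}

print(flatten_words(sample))
```

A result with an empty list here only means that word timestamps were not requested or not preserved; on its own it says little about which model produced the transcript.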

Specifications

Released: 2022-09-21
Context window: 30 seconds of audio per chunk
Pricing: $0.006 per minute (API); free self-hosted
Modalities: audio-to-text
License: MIT
Era: The Tool-Use Inflection
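To make the API pricing concrete, here is a minimal cost estimate based on the $0.006 per minute figure listed above. It assumes billing is strictly proportional to audio duration; any rounding or minimum charge the provider applies is not modelled, and `estimate_api_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# USD per minute of audio, taken from the pricing figure listed above.
API_PRICE_PER_MINUTE = Decimal("0.006")


def estimate_api_cost(duration_seconds: float) -> Decimal:
    """Estimate the API cost in USD for a clip of the given length.

    Assumes simple proportional billing, which is an approximation.
    """
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    # Round to a hundredth of a cent for display purposes.
    return (minutes * API_PRICE_PER_MINUTE).quantize(Decimal("0.0001"))


print(estimate_api_cost(3600))  # one hour of audio
print(estimate_api_cost(90))    # a 90-second clip
```

Under that assumption an hour of audio comes to about $0.36, which is the number to weigh against the hardware and operations cost of self-hosting.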

Strengths

  • Accuracy
  • Open weights
  • Multilingual

Weaknesses

  • 30-second chunking required
  • High latency for real-time use (not designed for streaming)
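The 30-second chunking constraint can be sketched as a naive fixed-window split. This assumes 16 kHz mono samples (the rate the open-source Whisper code resamples to) and zero-pads the final window to full length. It is an illustration only: `chunk_audio` is a hypothetical helper, and the reference implementation advances its window using predicted timestamps rather than cutting at fixed boundaries.

```python
SAMPLE_RATE = 16_000          # samples per second (assumed mono input)
CHUNK_SECONDS = 30            # window length Whisper operates on
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def chunk_audio(samples, chunk_samples=CHUNK_SAMPLES):
    """Split a sequence of samples into fixed-size windows.

    The last window is padded with silence (zeros) so every chunk has
    exactly chunk_samples entries.
    """
    chunks = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        chunk.extend([0.0] * (chunk_samples - len(chunk)))
        chunks.append(chunk)
    return chunks


# 70 seconds of silence splits into three windows, the last one padded.
seventy_seconds = [0.0] * (70 * SAMPLE_RATE)
print(len(chunk_audio(seventy_seconds)))
```

Cutting at fixed boundaries can split a word across two windows, which is one reason long-form transcription usually relies on overlap or timestamp-guided seeking instead.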

Authentication markers

The fingerprints by which Whisper can be identified from its output alone.

Tell | Meaning
Word-level timestamps in verbose output. | Whisper or fine-tune.

Notable works

  • Transcription backbone for most production AI voice pipelines

Market position

$0.006/minute via API; free self-hosted


Primary sources

  1. OpenAI: Whisper
