Era III · 2020–2022
The Instruction-Tuning Watershed
By C.W. Jameson · Published 19 May 2026 · Last reviewed 19 May 2026
Raw GPT-3 was impressive and unusable. It would happily complete a prompt by writing a similar prompt. To make it answer questions, follow instructions, refuse the worst requests, and stop mid-sentence, OpenAI applied a second training stage — supervised fine-tuning on demonstrations, then reinforcement learning from human feedback. The result, InstructGPT, became the immediate ancestor of ChatGPT.
InstructGPT
Ouyang et al., 'Training language models to follow instructions with human feedback', published January 2022. The model was a fine-tune of GPT-3 with SFT + RLHF stages. It was preferred by labelers over the base GPT-3 even at 100× fewer parameters.
ChatGPT
Released 30 November 2022, free, no waitlist. Built on GPT-3.5, an InstructGPT-style model. Reached 100 million weekly users within two months, the fastest consumer-software adoption in history at the time.
Anthropic's Claude
Released in limited beta March 2023 as Claude 1. Used Constitutional AI, a refinement of RLHF where the reward signal came from another model rather than direct human labels.
Signature models of the era
- InstructGPT
- ChatGPT (GPT-3.5)
- Claude 1
- Vicuna (the first credible open-weights instruction-tune)
Technical shifts
- RLHF becomes the standard post-training stage
- Constitutional AI demonstrates RLHF-from-AI-feedback
- Fine-tuning on instruction-response pairs (SFT) becomes table-stakes
Market shifts
- ChatGPT's free release reframes AI as a consumer product
- Microsoft invests $10B in OpenAI
- Google declares 'code red'; Bard launched in haste
Authentication — is the document from this era?
| Tell | Meaning |
|---|---|
| Model refuses with vague 'I cannot help with that' language | Early RLHF lineage; both OpenAI and DeepMind families produce this pattern. |
Primary sources
- [1] Ouyang et al.: InstructGPT — 2022-03-04
- [2] Bai et al.: Constitutional AI — 2022-12-15
From the Almanac shop
The AI Eras — Pocket Field Guide
Ten eras of AI on a single foldable. The Almanac in your pocket.
$19 — Coming soon