The Domesday Book of KWJ · AI

Era III · 2020–2022

The Instruction-Tuning Watershed

By C.W. Jameson · Published 19 May 2026 · Last reviewed 19 May 2026

Raw GPT-3 was impressive and unusable. It would happily complete a prompt by writing a similar prompt. To make it answer questions, follow instructions, refuse the worst requests, and stop when the answer was done, OpenAI applied a second training stage: supervised fine-tuning on human-written demonstrations, then reinforcement learning from human feedback (RLHF). The result, InstructGPT, became the immediate ancestor of ChatGPT.

InstructGPT

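Ouyang et al., 'Training language models to follow instructions with human feedback', announced by OpenAI in January 2022 and posted to arXiv in March 2022. The model was GPT-3 fine-tuned in two stages: supervised fine-tuning (SFT), then RLHF. Labelers preferred outputs from the 1.3B-parameter InstructGPT to those of the 175B-parameter base GPT-3, even though it had over 100× fewer parameters.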
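The RLHF stage depends on a reward model trained from labeler comparisons between pairs of outputs. A minimal sketch of the pairwise objective described in the paper, written in plain Python with scalar rewards standing in for a real model's outputs (the function name and the example reward values are illustrative assumptions, not taken from the source):

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Loss for one labeler comparison: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the labeler-preferred
    output well above the rejected one, and large when it ranks them the
    wrong way round.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch to avoid overflow in exp().
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# Illustrative values only: a correct ranking costs less than a wrong one.
correct_ranking = pairwise_preference_loss(2.0, 0.0)
wrong_ranking = pairwise_preference_loss(0.0, 2.0)
undecided = pairwise_preference_loss(1.0, 1.0)  # equals log(2)
```

In a real pipeline the two rewards would come from a neural network scoring a prompt with each candidate completion, and the loss would be averaged over many comparisons before a gradient step.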

ChatGPT

Released 30 November 2022, free, no waitlist. Built on GPT-3.5, an InstructGPT-style model. Reached an estimated 100 million monthly active users within two months, the fastest consumer-software adoption on record at the time.

Anthropic's Claude

Released in limited beta March 2023 as Claude 1. Used Constitutional AI, a refinement of RLHF in which the preference labels for harmlessness came from another model, guided by a written set of principles, rather than from direct human labels.
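The labeling step can be pictured as a feedback model choosing between two candidate responses in light of a written principle. The sketch below is a toy illustration under stated assumptions: `PRINCIPLE`, `ai_preference_label`, and `toy_judge` are hypothetical names, and the judge is a keyword heuristic standing in for a real language model, not Anthropic's implementation.

```python
from typing import Callable, Tuple

# Hypothetical principle text, for illustration only.
PRINCIPLE = "Choose the response that is less insulting."

# A judge returns the probability that response A satisfies the principle
# better than response B. In practice this would be a language model.
Judge = Callable[[str, str, str, str], float]


def ai_preference_label(
    prompt: str, response_a: str, response_b: str, judge: Judge
) -> Tuple[str, str]:
    """Return (chosen, rejected) using a model-provided preference."""
    p_a = judge(PRINCIPLE, prompt, response_a, response_b)
    if p_a >= 0.5:
        return response_a, response_b
    return response_b, response_a


def toy_judge(principle: str, prompt: str, a: str, b: str) -> float:
    """Toy stand-in, NOT a real model: penalizes a single flagged word."""
    a_flagged = "idiot" in a.lower()
    b_flagged = "idiot" in b.lower()
    if a_flagged == b_flagged:
        return 0.5  # no basis to prefer either response
    return 0.1 if a_flagged else 0.9


chosen, rejected = ai_preference_label(
    "Review my essay.",
    "Only an idiot would write this.",
    "The argument in paragraph two needs evidence.",
    toy_judge,
)
```

Pairs labeled this way can then train a preference model in place of pairs labeled by people, which is the substitution the paragraph above describes.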

Signature models of the era

  • InstructGPT
  • ChatGPT (GPT-3.5)
  • Claude 1
  • Vicuna (one of the first credible open-weights instruction-tunes)

Technical shifts

  • RLHF becomes the standard post-training stage
  • Constitutional AI demonstrates reinforcement learning from AI feedback (RLAIF)
  • Fine-tuning on instruction-response pairs (SFT) becomes table stakes

Market shifts

  • ChatGPT's free release reframes AI as a consumer product
  • Microsoft invests a reported $10B in OpenAI
  • Google declares 'code red'; Bard launches in haste

Authentication — is the document from this era?

Tell: Model refuses with vague 'I cannot help with that' language
Meaning: Early RLHF lineage; both OpenAI and Anthropic families produce this pattern.

Primary sources

  1. Ouyang et al.: InstructGPT (2022-03-04)
  2. Bai et al.: Constitutional AI (2022-12-15)
