Research · 8 min
RLHF: The Training Technique That Made ChatGPT Possible
By C.W. Jameson · Published 28 June 2025 · Last reviewed 28 July 2025
RLHF is not the alignment technique. It is the usability technique. The distinction matters more than most people realise.
How Reinforcement Learning from Human Feedback works, where it falls short, and why it was the key to making LLMs useful.