Research · 9 min

Constitutional AI: How Anthropic Trains Safe Models

By C.W. Jameson · Published 5 January 2026 · Last reviewed 5 February 2026

Constitutional AI replaces most human labelling of harmful outputs with a model that critiques and revises its own outputs against a written constitution.

What Constitutional AI is, how RLHF and CAI differ, and why it matters for the reliability of Claude models.
