Research · 8 min
SWE-bench: The Benchmark That Redefined AI Coding
By C.W. Jameson · Published 15 October 2025 · Last reviewed 15 November 2025
SWE-bench tasks AI agents with fixing real GitHub issues. Scores rose from 5% to over 50% in 18 months, an extraordinary rate of progress.
How SWE-bench was created, why it matters, and what the score progression from 5% to 50%+ reveals about AI coding ability.