The Domesday Book of KWJ · AI

Field identification

Identifying a model from its prose alone

Every frontier family leaves linguistic fingerprints. Here is the field guide.

By C.W. Jameson · Published 19 May 2026 · Last reviewed 19 May 2026

Most operators eventually find themselves staring at a chunk of text and asking which model wrote it. Sometimes a competitor has anonymised an output; sometimes a client refuses to say which provider they used; sometimes you inherit a corpus and need to know what produced it. The good news is that the four frontier families (Claude, GPT, Gemini, Grok), along with the main open-weights models, leak their lineage on the surface. Below are the tells.

Claude (Opus, Sonnet, Haiku)

Hedges before strong claims. Will write 'I notice…' or 'Let me check…' in agent contexts. Refuses with a specific alternative rather than a vague decline. Sentence length varies more than the GPT family. Em-dashes are present but not heavy. Will use British or American spelling consistently within a document.
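The sentence-length tell can be put on a number. What follows is a minimal sketch, not a calibrated detector: it assumes a naive splitter (sentence-ending punctuation followed by whitespace) and uses the coefficient of variation as the measure of spread. Both choices, and the function name, are assumptions of this sketch rather than anything the guide prescribes.

```python
import re
import statistics


def sentence_length_cv(text: str) -> float:
    """Coefficient of variation (stdev / mean) of sentence lengths, in words.

    Hypothetical helper: higher values mean more varied sentence lengths.
    """
    # Naive split on ., ! or ? followed by whitespace.
    # Abbreviations such as "e.g. this" will be miscounted as sentence breaks.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure spread
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0
```

The number only means something in comparison: run it over samples of known provenance first, then see where the unknown text falls.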

GPT (4.x, 4o, 5)

Em-dash density 2-4× the typical human rate. Loves to begin paragraphs with 'It's important to note…' and 'In summary…'. Refuses with vague language. GPT-5, like the o-series before it, can return a 'reasoning' field separate from the content; if the harness exposes it, that field is the single cleanest tell.
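Em-dash density is the easiest of these to measure. A minimal sketch, assuming whitespace-delimited words and counting only the U+2014 character (not hyphens or en-dashes). The guide gives no figure for the typical human rate, so the baseline is a parameter you would have to measure on your own human-written corpus; both function names are hypothetical.

```python
EM_DASH = "\u2014"  # the em-dash character itself


def em_dashes_per_1000_words(text: str) -> float:
    """Em-dash count normalised to a rate per 1,000 whitespace-delimited words."""
    words = text.split()
    if not words:
        return 0.0
    return text.count(EM_DASH) * 1000 / len(words)


def ratio_to_baseline(text: str, baseline_per_1000: float) -> float:
    """How many times the supplied human baseline rate this text reaches.

    baseline_per_1000 is an assumption supplied by the caller, not a known constant.
    """
    if baseline_per_1000 <= 0:
        raise ValueError("baseline_per_1000 must be positive")
    return em_dashes_per_1000_words(text) / baseline_per_1000
```

Short samples are noisy: a single dash in a 100-word paragraph already reads as 10 per 1,000, so the rate is best trusted on longer texts.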

Gemini

Lists. Aggressively. Will turn even a chatty paragraph into a bulleted breakdown. Long-context coherence is the differentiator: a Gemini-written summary of a 200-page document will reference late material without losing structure.
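The list habit can be roughed out as the share of non-blank lines that open with a bullet or a number. This is a sketch under stated assumptions: the set of bullet markers (hyphen, asterisk, the bullet character, and "1." or "1)" numbering) is a guess at common output formats, and the function name is hypothetical.

```python
import re

# A line counts as a list item if it starts with -, *, a bullet character,
# or a number followed by . or ), and then whitespace.
LIST_ITEM = re.compile(r"^\s*(?:[-*\u2022]|\d+[.)])\s+")


def list_line_fraction(text: str) -> float:
    """Fraction of non-blank lines that look like list items."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(1 for ln in lines if LIST_ITEM.match(ln)) / len(lines)
```

The prompt matters here: a request that asked for a list will score high whatever model answered it, so compare outputs produced from chatty prompts.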

Grok

Willing to render opinions on living people that other frontier models refuse. Higher rate of unhedged claims. Lower refusal rate generally.

Open-weights (Llama, Qwen, DeepSeek)

Tokenizer artefacts show up on non-English text — odd spacing around CJK or Arabic, occasional code-mixing. Qwen will switch to Chinese on sufficiently long outputs. DeepSeek R1 emits visible <think> blocks. Llama derivatives leak the system prompt under jailbreak pressure more reliably than closed models do.

Tells

Marker | Meaning
Em-dash density ≥ 4 per 1,000 words | Almost certainly GPT-4.5 or later.
Visible <think> block before answer | DeepSeek R1 family or a derivative.
Reasoning field separate from content in JSON output | OpenAI o-series or GPT-5.
Inline numbered citations on every factual claim | Perplexity or a clone.
Hedging before tool calls in agent transcripts | Claude Opus 4.x lineage.
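The three structural markers in the table (the <think> block, the separate reasoning field, and the numbered citations) can be checked mechanically. The sketch below is illustrative only. The JSON key names "reasoning" and "content" are assumptions, since the real names depend on the harness; citations are assumed to look like [1] and to sit before the final punctuation; all function names are hypothetical.

```python
import json
import re

# A <think>...</think> block at the very start of the output.
THINK_BLOCK = re.compile(r"^\s*<think>.*?</think>", re.DOTALL)
# Bracketed numeric citation such as [3].
NUMBERED_CITATION = re.compile(r"\[\d+\]")


def has_leading_think_block(text: str) -> bool:
    """True if the output opens with a visible <think> block."""
    return THINK_BLOCK.match(text) is not None


def has_separate_reasoning_field(raw_json: str, field: str = "reasoning") -> bool:
    """True if the JSON object carries `field` alongside a "content" key.

    Both key names are assumptions; adjust them to what your harness emits.
    """
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and field in payload and "content" in payload


def citation_sentence_fraction(text: str) -> float:
    """Fraction of sentences that contain at least one [n] citation."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    cited = sum(1 for s in sentences if NUMBERED_CITATION.search(s))
    return cited / len(sentences)
```

A positive result is a hint towards the family named in the table, not proof: any wrapper can add or strip these structures before the text reaches you.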

Frequently asked

Can a fine-tuned open model fake a frontier model?

Within a paragraph, yes. Over a thousand words, no — tokenizer artefacts and refusal-policy mismatches surface.

Are these tells stable across versions?

Across minor releases within a family, yes. Across major version bumps, no: Claude 3 to 4 changed pacing materially, and GPT-4 to 5 changed everything.
