CRAFT framework (review)

The seven letters, as stated
Annotated against the evidence
CRAFT vs. Role + Task + Constraints
Is CRAFT better than other methods?
When CRAFT helps anyway
Related

CRAFT is a seven-letter mnemonic for writing one-shot prompts, popularized by Alexander F. Young: Context, Request, Actions, Frame, Template, plus optional Example and Develop (source). It belongs to a wider genre of consumer-facing prompt acronyms — COSTAR, RACE, RISEN, CLEAR — that all wrap roughly the same primitives.

This page reviews CRAFT against the evidence the rest of this site is built on. The short version: the behaviours CRAFT asks you to perform (be specific, give a format, show examples) are individually well-supported. The wrapper is a memory aid. CRAFT’s weakest letter is its first — Context-as-Persona — which leans on a pattern the published evidence doesn’t support.

The seven letters, as stated

Faithful summary of Young’s post, no editorialising:

C — Context. Adopt a persona, character, or role. Specify tone and audience. Young uses a sub-mnemonic PAT (Persona / Audience / Tone) and gives examples like “You are a children’s book writer writing for children aged 10–18 in the style of JK Rowling.”
R — Request. A single clearly defined task or goal.
A — Actions. A numbered list of instructions or steps to complete the request.
F — Frame. Constraints: what to include, what to exclude. E.g. “Only respond with the chapters. Do not include any other text.”
T — Template. Output format — bullet list, table, code block, headings.
E — Example. (Optional) Few-shot demonstrations of the desired output shape.
D — Develop. (Optional) Iterate on the prompt and the response.

Annotated against the evidence

CRAFT letter	Maps to (this site)	Evidence verdict
C — Context (Role / Persona / PAT)	Role + task + constraints §Role	Partial. Role as scope and tone is supported. Persona-as-expertise gives no consistent accuracy gain (Zheng 2024) — see directives §6 for the full study and citation.
R — Request	Task · Directives §1	Supported. Specificity is the single most reliable lever — prompt-format variation alone swings accuracy by up to 76 pp on LLaMA-2-13B classification (Sclar et al. 2023, arXiv:2310.11324).
A — Actions	Directives §1 · §2	Supported. A numbered step list is a specificity device. Prefer bare imperatives (“indent with tabs”) over softened ones (“you should indent with tabs”).
F — Frame (constraints)	Constraints · Directives §3	Partial. Constraints are supported. Young’s examples lean heavily on negation — “do not include any other text” — and models systematically underperform on negation by 20–40 pp vs. affirmative phrasing (Truong et al. 2023, arXiv:2306.08189). Pair every prohibition with the positive alternative.
T — Template (format)	Directives §4 · §7	Supported. Format-shaping is one of the most reliable interventions. For machine-consumed output, prefer native structured-output / JSON-schema over a freeform template.
E — Example (optional)	Directives §4	Supported, with nuance. Few-shot gained 10–30 pp on many benchmarks at GPT-3 scale (Brown et al. 2020, arXiv:2005.14165). Min et al. 2022 (arXiv:2202.12837) then showed label correctness barely matters on GPT-3 — what carries the gain is format, label space, and input distribution. Examples teach shape, not facts.
D — Develop (optional)	(no direct site page)	Trivially true. “Iterate” is good advice but not a framework.

CRAFT vs. Role + Task + Constraints

CRAFT’s five core letters compress into the site’s three-part frame:

C            → Role          (with the persona caveat)
R + A        → Task          (request + the steps to perform it)
F            → Constraints
T (+ E)      → Output / examples — orthogonal; lives in directives, not the frame
D            → Iteration loop — not part of any single prompt

The three-part frame drops the C-as-PAT idea on purpose. Persona (in the “you are JK Rowling” sense) gets demoted to scope and tone; audience is folded into either scope or constraints; the expertise claim is dropped entirely. Five fewer words, one fewer trap.

CRAFT’s Template and Example are not in the three-part frame because they aren’t part of the durable frame — they’re per-task output shaping. They live in directives that work instead.

Is CRAFT better than other methods?

There is no published head-to-head benchmark of acronym frameworks. The components inside CRAFT are independently studied; the wrapper is decorative.

What is measured, across the components CRAFT touches:

Specificity swings accuracy by up to 76 pp on classification (Sclar 2023).
Persona expertise claims show no consistent gain across 162 personas (Zheng 2024).
Negation phrasing drops performance 20–40 pp vs. affirmative (Truong 2023).
Few-shot for format gains 10–30 pp at GPT-3 scale; label correctness barely matters (Brown 2020, Min 2022).
Format / template variation alone produces substantial accuracy swings (same Sclar 2023 finding).

None of these were measured as CRAFT vs. X. They were measured as individual interventions. Adopting CRAFT, COSTAR, RACE, or RISEN gives you the same evidence-supported levers in different orders with different cover art. The wrapper choice is essentially free.

Where CRAFT slightly misleads:

First-letter primacy. Putting Context-as-Persona first signals it’s the most important step. The Zheng 2024 result says it isn’t.
PAT sub-mnemonic. Bundling Persona + Audience + Tone hides that two of the three (audience, tone) are stylistic cues that the evidence supports, while the third (persona-as-expertise) is the one that doesn’t.
Negation-heavy Frame examples. The Frame examples in the original post are largely negative (“do not include any other text”). Strong constraint phrasing pairs each prohibition with a positive alternative.

Verdict: CRAFT is a serviceable beginner checklist for one-shot prompts. Its gains, where real, come from the specific behaviours it asks you to perform — and those behaviours work under any acronym. For durable context (CLAUDE.md, AGENTS.md, system prompts), the three-part frame maps more cleanly onto how prompt caching and the directive evidence actually work, and avoids the persona trap.

When CRAFT helps anyway

One-shot prompts in a chat UI. A 30-second mnemonic is genuinely useful when there’s no system prompt and no project context to lean on.
Teaching beginners. “Did you specify the output format?” is a higher-value question than “Did you tune your KV-cache prefix?” — meet people where they are.
Pre-flight checklist. Run a prompt through C/R/A/F/T as a quick audit before sending; if any letter is empty, the prompt is probably under-specified.

What CRAFT doesn’t replace: layered, cache-stable, durable context. See layered context and context is a budget for that side of the problem.

Role + task + constraints — the three-part frame CRAFT compresses into
Directives that work — empirical evidence on specificity, negation, few-shot, persona
Failure modes §5 — aspirational rules — the wider family the “world-class expert” pattern belongs to
Negative examples — the pairing rule CRAFT’s Frame examples miss