Prompt Engineering
Advanced Patterns & Best Practices for Better LLM Outputs
Core Patterns
| Pattern | When to Use | What to Do |
| Zero-shot | Simple, well-defined tasks | Just ask directly. |
| Few-shot | Strict output format needed | Show 2–5 examples before the real query. |
| Chain-of-Thought | Reasoning, math, multi-step | Ask to "think step by step" first. |
| ReAct | Agent/tool-using flows | Interleave Thought → Action → Observation. |
| Reflection | Editing/improving output | Ask model to critique, then revise. |
Reusable Template
# Role
You are a [ROLE] helping [AUDIENCE] with [TASK].
# Context
[background, docs, constraints]
# Task
[exactly what you want]
# Output format
[structure: sections, JSON keys, etc.]
# Rules
- Do not [avoid this]
- If unsure, say "I don't know"
Anti-Patterns (Stop Doing These)
- Stuffing system prompt with every rule. Keep it < 400 tokens.
- Using "Please" as a substitute for clarity. Be direct and specific.
- Asking for JSON without a schema example. You'll get creative keys.
- Re-reading full conversation every turn. Summarize & prune instead.
- Vague constraints like "be helpful." State explicit dos and don'ts.
Quick Wins
- Use few-shot examples — Most reliable way to fix output format.
- Add "think step by step" — Improves reasoning on complex tasks.
- Break into subtasks — Easier than one big ask.
- Specify output structure — JSON schema, bullet lists, tables.
- Iterate, don't perfect — Start simple, refine based on results.
Token Cost Formula
For self-consistency sampling (n completions, average m tokens output):
cost = n × (input_tokens × price_in + m × price_out)
Example: 100 input tokens, 10 samples × 50 output tokens with Claude Sonnet = 100×$3 + 500×$15 = $7.80 total.