LLM Primer (1 page)

What is an LLM? (one sentence)

A Large Language Model is software that predicts the next token (a word or piece of a word) using patterns learned from billions of text examples.


Key Terms

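| Term | Definition |
| --- | --- |
| Token | A piece of text (word, punctuation, subword). Models process tokens, not letters. |
| Parameter | A learned numeric weight, like a “dial” the model uses to predict. More parameters generally means more capacity for nuance. |
| Context | The text the model can see at once, including the conversation history (from about 4K to 1M tokens, depending on the model). |
| Hallucination | When a model generates plausible-sounding but false information. |
| Temperature | Controls sampling randomness (0 = near-deterministic, 1.0 = more varied and creative). |
| Top-p sampling | Samples only from the most likely tokens whose combined probability reaches p, cutting off unlikely options. |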
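
To make temperature and top-p concrete, here is a minimal sketch of how a sampler might apply them to a model's raw scores (logits). The vocabulary, scores, and function names are made up for illustration; real decoders differ in detail.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; temperature rescales the scores first."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}  # renormalized over kept tokens

def sample_next_token(vocab, logits, temperature=1.0, top_p=1.0, rng=random):
    probs = softmax_with_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p)
    indices = list(candidates)
    weights = [candidates[i] for i in indices]
    return vocab[rng.choices(indices, weights=weights, k=1)[0]]

# Hypothetical four-token vocabulary with made-up scores.
vocab = ["cat", "dog", "car", "banana"]
logits = [2.0, 1.5, 0.3, -1.0]
print(sample_next_token(vocab, logits, temperature=0))  # always the top-scoring token
print(sample_next_token(vocab, logits, temperature=1.0, top_p=0.9))  # never "banana"
```

With these scores, top-p = 0.9 keeps the three most likely tokens and drops the least likely one, which is the sense in which top-p acts as a quality filter.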

May 2026 Model Comparison

| Model | Best For | Context | Cost | Training Data Cutoff |
| --- | --- | --- | --- | --- |
| Claude Opus 4.8 | Writing, reasoning, analysis | 1M tokens | $5/$25 per 1M input/output | Jan 2026 |
| GPT-5.5 | Speed, all-purpose | 1M tokens | $5/$30 per 1M | Dec 2025 |
| Gemini 3.1 Pro | Long documents, research | 1M tokens | $2/$12 per 1M (free tier available) | Jan 2025 |
| Claude Sonnet 4.6 | Balanced, coding | 1M tokens | $3/$15 per 1M | Jan 2026 |
| DeepSeek V4 | Cost-conscious teams | 256K tokens | 10–50x cheaper | Late 2024 |
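
Because prices are quoted per 1M tokens, with input and output billed at different rates, the cost of a single request is easy to estimate. The sketch below uses placeholder prices and token counts chosen for illustration; they are not quotes for any model in the table.

```python
def estimate_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request when prices are quoted in USD per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Placeholder values: a 20K-token prompt, a 1K-token reply,
# and assumed prices of $3 (input) and $15 (output) per 1M tokens.
cost = estimate_cost_usd(input_tokens=20_000, output_tokens=1_000,
                         input_price_per_m=3.0, output_price_per_m=15.0)
print(f"${cost:.4f} per request")  # prints "$0.0750 per request"
```

Multiplying by expected requests per month gives a rough budget; long prompts usually dominate the bill even though output tokens cost more each.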

Real Capabilities vs Hype

✅ What LLMs Actually Do Well

  • Writing and editing (often better than most people)
  • Explaining concepts (clear, accessible)
  • Coding assistance (saves significant time)
  • Analysis of existing text (pattern finding, extraction)
  • Brainstorming and ideation (fast, creative)

⚠️ Improving (But Still Imperfect)

  • Math (reasoning models like o3 are better, but still limited)
  • Factual accuracy (hallucinations remain; retrieval reduces them but does not eliminate them)
  • Real-time information (not built-in; Perplexity/Gemini add it)
  • Domain expertise (fine-tuning helps)

❌ Can’t Do (Yet)

  • Access the internet on their own
  • Know anything after their training date
  • Reliably do complex math
  • Use context beyond their window (256K–1M tokens for the models above)
  • Guarantee truthfulness (they generate plausible text, not facts)
  • Replace domain experts (they’re tools for experts)

3 Ways to Adapt Models

1. Prompting

Give better instructions. Works for most tasks.

  • Time: Instant
  • Cost: No extra cost (you pay only for normal use of the existing model)
  • When to use: Writing, analysis, coding, learning

2. Retrieval-Augmented Generation (RAG)

Feed the model your own data so it can answer questions about your documents.

  • Time: Hours to days (build once, then re-index as documents change)
  • Cost: Low (embedding your documents plus vector DB storage)
  • When to use: Customer support, internal docs, proprietary knowledge
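
The RAG flow above (find relevant documents, then put them in the prompt) can be sketched in a few lines. This is a toy illustration only: `embed` is a bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector DB, and the documents and function names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=2):
    """Assemble a prompt that grounds the model in the retrieved text."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical internal documents.
docs = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday to Friday, 9am to 5pm.",
    "Passwords must be at least 12 characters long.",
]
prompt = build_prompt("When are refunds processed?", docs, k=1)
print(prompt)
```

The resulting prompt is what gets sent to the model. The model itself is unchanged, which is why RAG is cheaper than fine-tuning for keeping answers tied to your own documents.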

3. Fine-Tuning

Train the model on your examples so it learns your style/domain.

  • Time: Days (one-time)
  • Cost: Moderate ($100–$10K depending on scale)
  • When to use: Specialized domains, specific format/tone, repeated use at scale
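
Most of the work in fine-tuning is preparing the examples. As a rough sketch, training data is often stored as JSON Lines (one example per line) with a held-out validation slice. The `prompt`/`completion` field names and the sample tickets below are assumptions for illustration; each provider documents its own required schema.

```python
import json
import random

def to_jsonl(examples):
    """Serialize examples as one JSON object per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)

def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle reproducibly and hold out a validation slice."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical examples; field names are illustrative, not a provider schema.
examples = [
    {"prompt": f"Summarize ticket {i}", "completion": f"Summary of ticket {i}"}
    for i in range(10)
]
train, val = split_examples(examples)
print(len(train), len(val))  # prints "8 2"
print(to_jsonl(train[:1]))
```

Holding out validation examples lets you check that the tuned model has learned the style or domain and has not simply memorized the training set.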

Common Misconceptions

  • “LLMs understand language” → They pattern-match. Understanding is human projection.
  • “LLMs are general intelligences” → They’re narrow: good at text, bad at reasoning under uncertainty.
  • “ChatGPT is always right” → No. Verify important facts; LLMs hallucinate.
  • “LLMs will replace humans” → No. They’re tools. Humans + LLMs > either alone.
  • “Training new models is cheap” → No. Training from scratch takes billions in compute. Fine-tuning an existing model is the accessible option.

Next Steps