LLM Primer

What Large Language Models Are & How They Work

What is an LLM?

A neural network trained to predict the next token (a word or word fragment) in a sequence of text.

Input: Text

Process: Predict next token (word/subword)

Output: Text

It's not magic—it's pattern matching on billions of examples.
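The input, predict, output loop above can be sketched with a toy lookup table standing in for the neural network. The table `NEXT_TOKEN_PROBS` and the function `generate` are hypothetical names invented for this illustration; a real model computes these probabilities from billions of parameters rather than reading them from a dictionary.

```python
# Toy stand-in for a trained model: for each token, the probability of
# each possible next token. These numbers are made up for illustration.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "mat": 0.4},
    "cat": {"sat": 0.7, "ate": 0.3},
    "sat": {"on": 1.0},
    "on": {"the": 1.0},
}


def generate(prompt, max_new_tokens=4):
    """Repeatedly append the most likely next token (greedy decoding)."""
    tokens = prompt.split()
    if not tokens:
        return ""
    for _ in range(max_new_tokens):
        choices = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not choices:
            break  # the toy table has no continuation for this token
        tokens.append(max(choices, key=choices.get))
    return " ".join(tokens)


print(generate("the"))  # the cat sat on the
```

Each new token is chosen using only the text so far, then appended and fed back in. That feedback loop is why output arrives one token at a time.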

Key Terms

Token: A chunk of text the model reads (~4 characters of English on average)

Parameter: A "weight" in the network (billions of them)

Context window: Max tokens the model can read at once

Temperature: Randomness control (0 = near-deterministic, higher = more varied; the upper limit depends on the API)

Hallucination: Confident false output
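Two of these terms are easy to make concrete. The sketch below estimates a token count from the ~4 characters rule of thumb and shows how temperature reshapes next-token probabilities before sampling. The function names are hypothetical, the logits are made-up numbers, and treating temperature 0 as "always pick the top token" is an assumption about how APIs commonly behave, not a specification.

```python
import math
import random


def estimate_tokens(text):
    """Rough token count from the ~4 characters per token rule of thumb."""
    return math.ceil(len(text) / 4)


def apply_temperature(logits, temperature):
    """Turn raw scores (logits) into probabilities at a given temperature."""
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    if temperature == 0:
        # Assumed convention: temperature 0 means greedy, always the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample(tokens, logits, temperature, rng=random):
    """Draw one token according to the temperature-adjusted probabilities."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


tokens = ["cat", "dog", "fish"]
logits = [2.0, 1.0, 0.0]  # made-up scores for illustration
for t in (0, 0.5, 1.0, 2.0):
    print(t, [round(p, 2) for p in apply_temperature(logits, t)])
```

Low temperatures concentrate probability on the top-scoring token. High temperatures flatten the distribution, so less likely tokens are sampled more often.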

May 2026 Models

Model           | Best For
Claude Opus 4.7 | Complex reasoning
GPT-5.5         | Speed & balance
Gemini 3.1      | Long documents
DeepSeek V4     | Cost-conscious

Real Capabilities

  • ✓ Write & edit
  • ✓ Explain concepts
  • ✓ Analyze documents
  • ✓ Generate code
  • ✗ Access the internet (without added tools)
  • ✗ Verify that its output is true
  • ✗ Do math reliably (without a calculator or code tool)

How Training Works

Step 1: Read trillions of text tokens

Step 2: Predict next token billions of times

Step 3: Adjust weights when predictions are wrong

Step 4: Repeat until accurate

Cost: $50M+ | Time: 6+ months
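Steps 1 to 4 can be sketched at toy scale. The example below is an assumption-heavy miniature, not how production models are built: the "network" is a table of weights that sees only the previous token, the corpus is ten words, and the names `train_bigram` and `predict_next` are invented for this illustration. The loop is the same, though: predict, measure the error, adjust the weights, repeat.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bigram(tokens, epochs=200, lr=0.5):
    """Learn next-token weights from a token list with gradient descent."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to form a training pair")
    vocab = sorted(set(tokens))
    index = {tok: i for i, tok in enumerate(vocab)}
    n = len(vocab)
    # weights[prev][nxt] is the score for `nxt` following `prev`.
    weights = [[0.0] * n for _ in range(n)]
    # Step 1: read the text as (previous token, next token) pairs.
    pairs = [(index[a], index[b]) for a, b in zip(tokens, tokens[1:])]
    losses = []
    for _ in range(epochs):  # Step 4: repeat
        total_loss = 0.0
        for prev, target in pairs:
            probs = softmax(weights[prev])  # Step 2: predict the next token
            total_loss += -math.log(probs[target])
            for j in range(n):  # Step 3: adjust weights toward the target
                grad = probs[j] - (1.0 if j == target else 0.0)
                weights[prev][j] -= lr * grad
        losses.append(total_loss / len(pairs))
    return vocab, weights, losses


def predict_next(token, vocab, weights):
    """Return the highest-scoring next token for a known token."""
    row = weights[vocab.index(token)]
    return vocab[max(range(len(row)), key=row.__getitem__)]


corpus = "the cat sat on the mat and the cat ate".split()
vocab, weights, losses = train_bigram(corpus)
print(f"average loss: first epoch {losses[0]:.2f}, last epoch {losses[-1]:.2f}")
print(predict_next("on", vocab, weights))  # the
```

The loss never reaches zero here because "the" is followed by both "cat" and "mat" in the corpus. The model can only learn the proportions, which is what "pattern matching" means in practice.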

3 Ways to Adapt

Prompting: Craft input (free, instant)

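RAG (retrieval-augmented generation): Retrieve relevant external docs and add them to the prompt ($, instant)

Fine-tune: Continue training the model on your own data ($$$, days)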

Start with prompting. Use RAG for facts. Fine-tune for behavior.
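A minimal sketch of the RAG idea, assuming a handful of in-memory documents. Word overlap is used here as a simple stand-in for the embedding-based search that real systems typically use, and the function names and sample documents are hypothetical. The resulting prompt would then be sent to whichever model you use; no model call is shown.

```python
import re


def words(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, docs, k=1):
    """Return the k documents sharing the most words with the question."""
    q_words = words(question)
    ranked = sorted(docs, key=lambda d: len(q_words & words(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question, docs, k=1):
    """Place the retrieved documents in the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


docs = [
    "Refunds are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
]
print(build_rag_prompt("How many days do I have to request refunds?", docs))
```

The model's weights never change. The facts travel inside the prompt, which is why RAG takes effect instantly and can be updated by editing the documents.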

Quick Decision Tree

Need facts? → Use RAG

Want different tone? → Fine-tune

Trying a new task? → Prompt engineering

Stuck? → Add examples (few-shot learning)
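Adding examples usually means placing a few worked input/output pairs in the prompt ahead of the real question. The sketch below shows one way to assemble such a prompt. The `Input:`/`Output:` layout and the function name are illustrative choices, not a required format.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # End with an open "Output:" so the model continues the pattern.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [
        ("Loved it, would buy again.", "positive"),
        ("Broke after two days.", "negative"),
    ],
    "Shipping was slow but the product is great.",
)
print(prompt)
```

Because the model predicts what comes next, ending on an unfinished `Output:` line steers it to answer in the same format as the examples.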