A large language model (LLM) is a neural network trained to predict the next token in a sequence of text.
Input: Text
Process: Predict next token (word/subword)
Output: Text
It's not magic—it's pattern matching on billions of examples.
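To make "predict the next token" concrete, here is a minimal sketch that uses word-pair counts in place of a neural network. The corpus and the function names `train_bigram` and `generate` are invented for illustration; a real LLM learns far richer patterns, but the generation loop has the same shape: look at the text so far, pick a likely continuation, append it, repeat.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_words=5):
    """Repeatedly append the most frequent follower (greedy decoding)."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # word never seen with a follower: nothing to predict
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

# Toy corpus, made up for this example.
corpus = "the cat sat on the mat . the cat sat on the rug ."
model = train_bigram(corpus)
print(generate(model, "the", max_words=4))  # prints "the cat sat on the"
```

The toy model can only repeat word pairs it has seen, which is why scale (more data, more parameters) matters so much in practice.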
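Token: A chunk of text, roughly 4 characters of English on average
Parameter: A learned "weight" in the network (modern models have billions)
Context window: Maximum number of tokens the model can read at once
Temperature: Randomness control (0 = near-deterministic, 2 = highly random)
Hallucination: Confident but false output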
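Temperature is easiest to see in code. The sketch below assumes the model has already produced raw scores (logits) for three candidate tokens; the scores are made up, and `softmax_with_temperature` and `sample_token` are illustrative names, not any library's API. Dividing the scores by the temperature before the softmax sharpens the distribution when the temperature is low and flattens it when the temperature is high.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; handle 0 as argmax")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Pick a token index; temperature 0 means always take the top score."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.5)  # sharper: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: more variety
print([round(p, 2) for p in cold])
print([round(p, 2) for p in hot])
```

At temperature 0.5 the top token takes most of the probability mass; at 2.0 the three options are much closer together, so sampling produces more varied (and more error-prone) output.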
| Model | Best For |
|---|---|
| Claude Opus 4.7 | Complex reasoning |
| GPT-5.5 | Speed & balance |
| Gemini 3.1 | Long documents |
| DeepSeek V4 | Cost-conscious |
Step 1: Read trillions of text tokens
Step 2: Predict the next token, billions of times
Step 3: Adjust the weights whenever a prediction is wrong
Step 4: Repeat until the predictions stop improving
Cost: $50M+ | Time: 6+ months (rough figures for a frontier-scale model)
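The four steps above can be sketched at toy scale. This is a deliberately tiny stand-in, not how production models are built: the "network" is a single table of weights, the data is a nine-token sequence invented for the example, and `train` is a hypothetical helper. The loop is the same in spirit, though: predict, measure the error, nudge the weights, repeat.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def train(tokens, vocab_size, epochs=200, lr=0.5):
    """Gradient descent on cross-entropy loss for next-token prediction."""
    # weights[prev][nxt] is the score for token nxt following token prev
    weights = [[0.0] * vocab_size for _ in range(vocab_size)]
    pairs = list(zip(tokens, tokens[1:]))       # Step 1: read the data
    losses = []
    for _ in range(epochs):                     # Step 4: repeat
        total = 0.0
        for prev, nxt in pairs:
            probs = softmax(weights[prev])      # Step 2: predict next token
            total += -math.log(probs[nxt])      # penalty for a wrong guess
            for j in range(vocab_size):         # Step 3: adjust weights
                grad = probs[j] - (1.0 if j == nxt else 0.0)
                weights[prev][j] -= lr * grad
        losses.append(total / len(pairs))
    return weights, losses

tokens = [0, 1, 2, 0, 1, 2, 0, 1, 2]  # toy data: 0 -> 1 -> 2 -> 0 ...
weights, losses = train(tokens, vocab_size=3)
print(f"first epoch loss: {losses[0]:.3f}, last epoch loss: {losses[-1]:.3f}")
```

The average loss falls as the weights absorb the pattern in the data. Real training does the same thing with billions of weights and trillions of tokens, which is where the cost and time come from.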
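Prompting: Craft the input text (free, instant)
RAG (retrieval-augmented generation): Retrieve relevant external docs and add them to the prompt ($, instant)
Fine-tuning: Continue training the model on your own data ($$$, days)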
Start with prompting. Use RAG for facts. Fine-tune for behavior.
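A rough sketch of the RAG idea, assuming a tiny in-memory document list and simple word-overlap scoring. Production systems typically use embedding-based search instead, and the documents, the question, and the helper names `retrieve` and `build_prompt` are all invented here. The point is the shape: find the most relevant text, then paste it into the prompt so the model answers from it.

```python
import re

def words(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, docs, k=1):
    """Return the k documents sharing the most words with the query."""
    q = words(query)
    return sorted(docs, key=lambda d: len(q & words(d)), reverse=True)[:k]

def build_prompt(query, docs, k=1):
    """Put the retrieved text in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base for illustration.
docs = [
    "The refund window is 30 days.",
    "Support is open 9am to 5pm on weekdays.",
    "Shipping takes 3 to 5 business days.",
]
question = "How long is the refund window?"
print(build_prompt(question, docs))
```

The resulting prompt would then be sent to whichever model you use. No retraining is involved, which is why RAG suits facts that change often.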
Need facts? → Use RAG
Want different tone? → Fine-tune
Trying a new task? → Prompt engineering
Stuck? → Add examples to the prompt (few-shot prompting)
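Few-shot prompting just means showing the model a few worked examples before the real input. A minimal sketch follows; the "Input:/Output:" layout, the sentiment task, and the `few_shot_prompt` helper are one illustrative convention among many, not a required format.

```python
def few_shot_prompt(instruction, examples, new_input):
    """Build a prompt with worked examples followed by the new input."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    lines.append(f"Input: {new_input}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)

# Made-up examples for a sentiment-labeling task.
examples = [
    ("I love this phone", "positive"),
    ("The battery died in an hour", "negative"),
]
prompt = few_shot_prompt(
    "Classify the sentiment of each input.",
    examples,
    "Great screen, terrible speakers",
)
print(prompt)
```

Two or three well-chosen examples are often enough to pin down the output format and tone without any fine-tuning.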