Track 0 · Foundations · Lesson 7

Pretraining vs fine-tuning vs prompting vs RAG

After this lesson you can name the four levers for making a model do what you want, say what each one changes and costs, and choose the right one (or combination) for a given problem.

Level: beginner · Read time: ~10 min · Prerequisites: How language models work

Suppose a base model isn't doing what you need — wrong format, missing knowledge, wrong tone. You have four levers, and they differ on one crucial axis: do you change the model's parameters, or only what you feed it? Knowing which lever fits which problem saves enormous time and money. This course is about one of them (fine-tuning), but you should know all four to know when fine-tuning is the right call.

Lever 1: Prompting (change the input)

The cheapest lever: leave the model untouched and write a better input. Give instructions, show a few examples right in the prompt (this is in-context learning), specify the format you want. Zero training, instant iteration, no infrastructure.

Limits: you're bounded by the base model's existing ability and by the context window — every example you paste in is tokens you pay for on every call, and the model "forgets" it the moment the call ends. Great for prototyping and for tasks the base model can already mostly do.
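As a minimal sketch of in-context learning, the snippet below assembles an instruction, two worked examples, and a new input into a single prompt, then estimates the token overhead the examples add to every call. The function names, the "Input:/Output:" layout, and the one-token-per-word estimate are illustrative assumptions, not any particular provider's format or tokenizer.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Join an instruction, worked examples, and the new input into one prompt."""
    parts = [instruction.strip(), ""]
    for text, label in examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {label}")
        parts.append("")
    # The final "Output:" is left open for the model to complete.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


def rough_token_count(text):
    # Crude assumption: roughly one token per whitespace-separated word.
    # Real tokenizers differ; use the model's own tokenizer for real budgets.
    return len(text.split())


examples = [
    ("The pump is leaking again.", "complaint"),
    ("Thanks, the fix worked!", "praise"),
]
query = "Delivery was two weeks late."
prompt = build_few_shot_prompt(
    "Classify each message as complaint or praise.", examples, query
)

# Everything except the query itself is overhead paid again on every call.
overhead = rough_token_count(prompt) - rough_token_count(query)
print(prompt)
print("approximate overhead tokens per call:", overhead)
```

The overhead grows with every example added, which is the per-call cost the limits above describe.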

Lever 2: RAG (supply knowledge at inference)

Retrieval-augmented generation keeps the model fixed but, at question time, retrieves relevant documents from your own data and pastes them into the prompt. The model then answers using that supplied context. RAG is the right tool when the problem is knowledge: facts the model never saw, private documents, or information that changes often (you update the document store, not the model).

It changes nothing about the model's behavior or format — it only changes what the model knows for that one call. Get the retrieval wrong and the answer is wrong; get it right and even a small model can answer questions about your latest docs.
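The retrieve-then-paste flow can be sketched with a toy retriever. This version ranks documents by shared words with the question, which is a stand-in assumption: production systems typically use embeddings and a vector index instead. The document list, function names, and prompt wording are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(
        documents, key=lambda d: len(q_words & tokenize(d)), reverse=True
    )
    return ranked[:k]


def build_rag_prompt(question, documents, k=2):
    """Paste the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical document store; updating it changes what the model can
# answer without touching the model's parameters.
DOCS = [
    "Refunds are processed within 5 business days.",
    "The office is closed on public holidays.",
    "Premium plans include priority support.",
]

rag_prompt = build_rag_prompt("How long do refunds take?", DOCS, k=1)
print(rag_prompt)
```

If `retrieve` returns the wrong document, the model never sees the fact it needs, which is why retrieval quality dominates RAG quality.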

Lever 3: Fine-tuning / SFT (change the parameters on examples)

Supervised fine-tuning continues training the base model on examples of your inputs paired with the outputs you want. Unlike prompting and RAG, this actually changes the parameters, so the new behavior is baked in — no per-call example tokens, consistent format, learned style and task shape. It's the right lever when the problem is behavior or form: "always output strict JSON," "classify into these labels," "extract spans in this schema," "respond in this voice."

It costs a training run (a GPU and some examples), and it does not, by itself, teach the model fresh facts reliably — that's RAG's job. The headline of this whole Academy: with a modest dataset, SFT can make a small model excellent at one specific task.
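To make "examples of your inputs paired with the outputs you want" concrete, here is a sketch of a tiny SFT dataset for a strict-JSON task, serialized as JSON Lines. The field names `prompt` and `completion` and the label schema are assumptions for illustration; training tools differ in the exact record format they expect.

```python
import json


def make_sft_record(user_input, target_output):
    """Pair one input with the exact output string the model should learn."""
    return {
        "prompt": user_input,
        # Serialize the target so every example shows the same strict format.
        "completion": json.dumps(target_output, sort_keys=True),
    }


# Hypothetical input -> output pairs for a classification/extraction task.
pairs = [
    ("Order 1142 arrived broken.", {"label": "complaint", "order_id": "1142"}),
    ("Love the new dashboard!", {"label": "praise", "order_id": None}),
]

records = [make_sft_record(text, target) for text, target in pairs]
jsonl = "\n".join(json.dumps(r) for r in records)

# Sanity check before training: every target parses and follows the schema.
for line in jsonl.splitlines():
    completion = json.loads(json.loads(line)["completion"])
    assert set(completion) == {"label", "order_id"}

print(jsonl)
```

Checking that every target follows the schema matters because the model learns whatever format the examples actually show, including their inconsistencies.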

Lever 4: (Continued) pretraining (build base ability)

The heaviest lever: train on a very large corpus of raw text to build broad capability or deep domain fluency (e.g., adapting a general model to legal or biomedical language). This is what creates base models in the first place. Continued pretraining extends a base model on a big domain corpus before any task-specific fine-tuning. It's expensive — lots of data, lots of compute — and rarely the first thing you reach for. Most practitioners never need it; they start from someone else's pretrained base and fine-tune.

[Figure: a grid with two axes, "leaves model unchanged" vs. "changes parameters" and "acts at inference" vs. "acts in advance". Entries: Prompting (better input; in-context examples); RAG (retrieve docs into the prompt); Fine-tuning and Pretraining (train on examples / large corpora); one cell is marked n/a.]
Two axes: does it change the parameters, and does it act in advance or at inference time? RAG adds knowledge at call time; fine-tuning bakes behavior in beforehand.

A decision guide

The four levers combine. A common production stack is RAG plus a fine-tuned model: fine-tune so the model reliably uses retrieved context and outputs your format, and use RAG to supply the facts. BrewSLM treats both as first-class, but the skill this Academy builds, the one that most changes what a small model can do, is fine-tuning.
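One way to condense this lesson's rules of thumb is a small lookup function. It is a sketch of the reasoning, not an official decision procedure: the flag names are hypothetical, and real projects weigh cost, data availability, and latency as well.

```python
def choose_levers(
    base_mostly_works=False,
    missing_knowledge=False,
    wrong_behavior_or_format=False,
    needs_deep_domain_fluency=False,
):
    """Map the kind of problem to the lever(s) the lesson suggests."""
    # Cheapest first: if the base model can already mostly do the task,
    # a better prompt is the first thing to try.
    if base_mostly_works:
        return ["prompting"]

    levers = []
    if missing_knowledge:
        levers.append("RAG")  # facts, private docs, fast-changing info
    if wrong_behavior_or_format:
        levers.append("fine-tuning")  # format, labels, schema, voice
    if needs_deep_domain_fluency:
        levers.append("continued pretraining")  # heaviest, rarely first

    # With no specific problem identified, start with the cheapest lever.
    return levers or ["prompting"]


# The common production stack: RAG for facts plus fine-tuning for behavior.
print(choose_levers(missing_knowledge=True, wrong_behavior_or_format=True))
```

The function returns a list because the levers are not mutually exclusive: a single project can need retrieval for its facts and fine-tuning for its output format.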

Key idea

Prompting and RAG steer a fixed model by changing its input; fine-tuning and pretraining change the model itself. Use RAG for knowledge, fine-tuning for behavior — and don't reach for the expensive levers when a cheaper one solves it.

Where we go next

You now know why you'd fine-tune. The last Foundations lesson asks a money question: given that fine-tuning a small model is so effective for a narrow task, when should you pick a small model over a giant one at all?

Key terms

Prompting
Steering a fixed model by writing a better input; no training.
In-context learning
Showing examples inside the prompt so the model imitates them for that call.
RAG (retrieval-augmented generation)
Retrieving relevant documents at inference and adding them to the prompt; supplies knowledge, not behavior.
Fine-tuning / SFT
Continuing training on your input→output examples; changes parameters to bake in behavior/format.
Continued pretraining
Extending a base model by training on a large domain corpus to build deep domain fluency; expensive, rarely the first choice.
