Track 1 · SFT fundamentals · Lesson 1

What is Supervised Fine-Tuning?

After this lesson you can define SFT precisely, say what changes versus the base model, and decide when SFT is the right tool versus prompting, RAG, or a bigger model.

Level: beginner · Read time: ~9 min · Prerequisites: Architecture taxonomy

Track 0 gave you the machine and the four levers for steering it. Track 1 zooms in on one lever — Supervised Fine-Tuning — and makes every part of it precise. We start with the definition and, crucially, with when not to reach for it.

SFT in one sentence

Supervised Fine-Tuning continues training a pretrained model on a curated set of (prompt, completion) examples — inputs paired with the exact outputs you want — so the model learns to produce those outputs. "Supervised" means every example carries a known target; the model is corrected toward it. Mechanically it is the same next-token training from Track 0, run on your examples instead of raw web text.
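To make the definition concrete, here is a minimal sketch of what one SFT example and its training signal might look like. The `SFTExample` class, the `sft_loss` helper, and the probability values are all illustrative assumptions, not part of any particular library. In real training the model computes these probabilities over token IDs.

```python
import math
from dataclasses import dataclass


@dataclass
class SFTExample:
    """One supervised example: an input and the exact output we want."""
    prompt: str
    completion: str


def sft_loss(target_token_probs):
    """Average negative log-likelihood of the target tokens.

    target_token_probs[i] is the probability the model assigned to the
    i-th token of the desired completion (hypothetical numbers here).
    """
    return -sum(math.log(p) for p in target_token_probs) / len(target_token_probs)


example = SFTExample(
    prompt="Summarize: The meeting moved to Friday.",
    completion="Meeting rescheduled to Friday.",
)

# Hypothetical probabilities for the completion's tokens, before and after training.
before = [0.10, 0.05, 0.20, 0.15]
after = [0.80, 0.70, 0.90, 0.85]

loss_before = sft_loss(before)
loss_after = sft_loss(after)
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

The loss falls as the model assigns higher probability to the tokens of the known target, which is what "corrected toward it" means in practice.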

What actually changes

SFT shifts the model's parameters so that, for inputs like yours, the next-token distribution favors the responses you demonstrated. You are not adding a module or a rulebook; you are nudging the same weights you met in Lesson 1 of Track 0. The new behavior is baked into the weights: you no longer need examples in every prompt, and the format, tone, and task shape stay consistent from call to call.
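The "nudge" can be illustrated with a toy sketch. It assumes a three-token vocabulary and treats the logits themselves as the trainable parameters. In a real model the logits are computed from the weights and the gradient flows back into them, so this shows only the direction of the update, not an actual training loop. The names `softmax` and `sgd_step` are illustrative.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(logits, target_index, lr=1.0):
    """One gradient-descent step on cross-entropy loss.

    For softmax + cross-entropy, the gradient with respect to each logit
    is (probability - 1) for the target token and (probability) otherwise.
    """
    probs = softmax(logits)
    grads = [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]
    return [x - lr * g for x, g in zip(logits, grads)]


vocab = ["Sure", "No", "Maybe"]   # toy three-token vocabulary (assumption)
logits = [0.0, 1.0, 0.5]          # stand-in for the model's parameters
target = vocab.index("Sure")      # the token our demonstration wants next

p_before = softmax(logits)[target]
logits_after = sgd_step(logits, target)
p_after = softmax(logits_after)[target]
print(f"P('Sure') before: {p_before:.3f}, after: {p_after:.3f}")
```

After the step, the demonstrated token is more likely and the alternatives are less likely. Nothing was added to the model; the same numbers moved.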

A note on starting points. A base model is raw next-token prediction; an instruct model has already been fine-tuned to follow instructions and chat. You can SFT either, but starting from an instruct model usually gets you to a polished, on-format result with less data.

Key idea

SFT changes behavior, not knowledge. It is excellent at teaching a model how to respond — format, style, task — and unreliable at teaching it new facts. For facts that change or are private, reach for RAG (Track 0, Lesson 7).

What SFT is good at

When NOT to use SFT

How much data?

There is no universal number, but for a narrow task a few hundred to a few thousand clean examples is a realistic starting range, and quality matters more than raw count. The rest of this track is about getting that data right and running the training that turns it into a better model.
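As a rough illustration of why "clean" matters more than count, the sketch below applies two basic filters to a tiny dataset. The `clean_examples` function, the dict layout with `prompt` and `completion` keys, and the sample rows are assumptions made for this example. Real pipelines usually go further.

```python
def clean_examples(examples):
    """Drop empty and exact-duplicate (prompt, completion) pairs."""
    seen = set()
    cleaned = []
    for ex in examples:
        prompt = ex.get("prompt", "").strip()
        completion = ex.get("completion", "").strip()
        if not prompt or not completion:
            continue  # an example without an input or a target teaches nothing
        key = (prompt, completion)
        if key in seen:
            continue  # exact duplicates add count, not information
        seen.add(key)
        cleaned.append({"prompt": prompt, "completion": completion})
    return cleaned


# Hypothetical raw data with typical defects.
raw = [
    {"prompt": "Translate to French: cat", "completion": "chat"},
    {"prompt": "Translate to French: cat", "completion": "chat "},  # duplicate after stripping
    {"prompt": "Translate to French: dog", "completion": ""},       # missing target
    {"prompt": "Translate to French: bird", "completion": "oiseau"},
]

cleaned = clean_examples(raw)
print(f"{len(raw)} raw examples -> {len(cleaned)} clean examples")
```

Here half of the raw rows carry no usable signal, so the honest dataset size is the cleaned count rather than the raw one.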

Key terms

Supervised Fine-Tuning (SFT): Continuing training on labeled (prompt, completion) examples to change a model's behavior for a task.

(prompt, completion) pair: One SFT example, consisting of an input and the target output you want the model to produce.

Base vs instruct model: A base model is raw next-token prediction; an instruct model is already tuned to follow instructions.

Behavior vs knowledge: SFT reshapes how a model responds (behavior); it does not reliably add new facts (knowledge). For facts, use RAG.
