
Small language model roadmap for ML engineers

This page is for new and experienced ML or LLM engineers who want to go from “I know the basics” to “I can fine-tune a base model on custom data, evaluate it honestly, and ship a useful domain-specific small language model.”

Start with foundations

Jump to SFT

Academy overview

Who this roadmap is for

Different backgrounds need different emphasis. Pick the path that sounds most like you, then follow the linked lessons.

New ML engineers

You know Python, but not yet the mechanics of tokens, attention, loss, or why a base model behaves the way it does.

Experienced LLM engineers

You already prompt and ship models, but want a firmer grasp of LoRA, dataset design, gold sets, and the trade-offs between SFT, RAG, CPT, and DPO.

Teams building domain models

You already know the theory and want the shortest path from custom data to a local, auditable workflow with repeatable evaluation.

What you should know before touching a base model

These are the concepts that prevent expensive guesswork later.

How models work

How to choose the adaptation path

How to choose the starting checkpoint

The roadmap, in order

Each track targets a specific stage of an engineer's experience. The cleanest path is usually straight through, in the order listed.

1. Foundations

Build the vocabulary and mental models behind small language models: tokens, attention, base vs. instruct models, and the trade-offs of working with SLMs.

Open foundations

2. SFT fundamentals

Learn supervised fine-tuning, loss masks, chat templates, dataset quality, LoRA, GPU memory, evaluation, and when not to fine-tune.

Open SFT fundamentals

3. Hands-on

Load a base model, build a custom dataset, train with LoRA or QLoRA, evaluate honestly, and ship a runnable artifact in Python.

Open hands-on track

4. With BrewSLM

Move from script-level understanding into a platform workflow for ingestion, preflight, training jobs, eval gates, export, and deployment.

Open BrewSLM track

5. Advanced

Go beyond one SFT run with distillation, DPO/ORPO, quantization, multi-task training, serving, drift detection, and production feedback loops.

Open advanced track
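Two ideas named in the tracks above, the SFT loss mask and LoRA's parameter savings, fit in a few lines of Python. The sketch below is illustrative only and is not taken from the lessons: the function names `build_sft_labels` and `lora_param_counts` are hypothetical, the token ids are made up, and the layer size and rank are arbitrary example values. The only outside convention it relies on is that `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, which many training stacks reuse to mark positions that should not contribute to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def build_sft_labels(prompt_ids, response_ids):
    """Return (input_ids, labels) for one prompt/response training example.

    Prompt positions are labeled IGNORE_INDEX so the loss is computed only
    on the response tokens the model is supposed to learn to produce.
    (Hypothetical helper for illustration.)
    """
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def lora_param_counts(d_in, d_out, rank):
    """Compare trainable parameters: full weight matrix vs. a LoRA adapter.

    LoRA freezes the d_out x d_in weight W and trains two small matrices,
    B (d_out x rank) and A (rank x d_in), whose product is added to W.
    (Hypothetical helper for illustration.)
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


if __name__ == "__main__":
    # Made-up token ids; a real run would get these from the model's tokenizer.
    prompt = [11, 12, 13]
    response = [21, 22]
    ids, labels = build_sft_labels(prompt, response)
    print(ids)     # full sequence fed to the model
    print(labels)  # prompt positions masked, response positions kept

    # Example sizes only: one 4096 x 4096 projection with a rank-8 adapter.
    full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
    print(full, lora, lora / full)
```

The labels stay the same length as the inputs because the masking only decides which positions count toward the loss. The parameter comparison covers a single weight matrix, so totals for a whole model depend on how many layers receive adapters.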

Decision guide: SFT vs RAG vs CPT vs preference tuning

Most teams do not need every technique. Use the simplest tool that matches the actual problem.

Use SFT when

You need the model to respond in a better format, tone, task shape, or narrow behavior on data you can label clearly.

What is SFT?

Use RAG when

You need fresh or private facts at inference time and the problem is retrieval, not permanently changing the model's behavior.

Pretraining vs fine-tuning vs RAG

Use continued pretraining when

The base model does not speak the language of your domain well enough at the token level and SFT alone cannot close the gap.

Continued pretraining

Use DPO or ORPO when

You already have reasonable outputs and need the model to prefer better responses over worse ones using chosen/rejected data.

Preference tuning: DPO and ORPO
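To make the chosen/rejected idea concrete, here is a minimal sketch of the DPO loss for a single preference pair, written in plain Python. It assumes you already have the summed log-probability of each response under the model being trained and under a frozen reference copy. The function name `dpo_loss` and the value `beta=0.1` are illustrative assumptions, and ORPO, which uses a different objective, is not covered.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    either the model being trained (policy) or the frozen reference model.
    beta controls how strongly the policy is pushed away from the reference;
    0.1 here is only an example value.
    """
    # How much more (or less) likely each response became vs. the reference.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(x)) = log(1 + exp(-x)), computed in a stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


if __name__ == "__main__":
    # Policy identical to the reference: no preference has been learned yet.
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy raised the chosen response and lowered the rejected one.
    print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
    # Policy moved the wrong way.
    print(dpo_loss(-15.0, -5.0, -10.0, -10.0))
```

The loss equals log 2 when the policy matches the reference, falls as the policy favors the chosen response more than the reference does, and rises when it favors the rejected one. A training loop would average this value over a batch of pairs.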

What to do next

Start learning

Open the Academy overview if you want the whole curriculum with progress tracking and per-track lesson lists.

Academy overview

Start building

If you already know enough theory, jump back to the product and run the workflow on your own custom data.

BrewSLM quickstart

Keep a reference nearby

The glossary is useful when you want a quick definition without leaving the learning path.

Glossary of terms