
Small language model roadmap for ML engineers

This page is for new and experienced ML or LLM engineers who want to go from “I know the basics” to “I can fine-tune a base model on custom data, evaluate it honestly, and ship a useful domain-specific small language model.”

Who this roadmap is for

Different backgrounds need different emphasis. Pick the path that sounds most like you, then follow the linked lessons.

What you should know before touching a base model

These are the concepts that prevent expensive guesswork later.

The roadmap, in order

Each track targets a specific stage of an engineer's experience. The cleanest path is usually to take them in order.

1. Foundations

Build the vocabulary and mental models behind small language models: tokens, attention, base vs. instruct models, and the trade-offs of going small.

Open foundations

2. SFT fundamentals

Learn supervised fine-tuning, loss masks, chat templates, dataset quality, LoRA, GPU memory, evaluation, and when not to fine-tune.

Open SFT fundamentals
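As a taste of what "loss masks" means in practice, here is a minimal sketch in plain Python. It assumes the common convention of marking prompt tokens with the label -100, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so that only response tokens contribute to the loss. The token ids and the `build_labels` helper are made up for illustration, and any shifting for next-token prediction is left to the training code.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response, masking the prompt out of the loss.

    Hypothetical helper: returns (input_ids, labels) as plain lists.
    """
    input_ids = list(prompt_ids) + list(response_ids)
    # Prompt positions get IGNORE_INDEX; response positions keep their token ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


# Illustrative token ids, not taken from any real tokenizer.
prompt_ids = [101, 7, 8, 9]
response_ids = [42, 43, 2]

input_ids, labels = build_labels(prompt_ids, response_ids)
trained_positions = sum(1 for label in labels if label != IGNORE_INDEX)

print(input_ids)
print(labels)
print(trained_positions)
```

Without the mask, the model is also trained to reproduce the prompt, which usually dilutes the signal from the responses you actually care about.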

3. Hands-on

Load a base model, build a custom dataset, train with LoRA or QLoRA, evaluate honestly, and ship a runnable artifact in Python.

Open hands-on track
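To see why LoRA is attractive on limited hardware, it helps to count trainable parameters. The sketch below assumes the standard LoRA formulation, in which a frozen weight matrix of shape `d_out x d_in` gets a trainable low-rank update `B @ A` with `A` of shape `rank x d_in` and `B` of shape `d_out x rank`. The layer size and rank are illustrative assumptions, not recommendations.

```python
def full_param_count(d_in, d_out):
    """Parameters in one dense weight matrix (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds for that matrix: A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


# Illustrative numbers: one square 4096 x 4096 projection, rank 8.
d_in = d_out = 4096
rank = 8

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
ratio = lora / full

print(full)   # 16777216
print(lora)   # 65536
print(ratio)  # 0.00390625
```

For this one matrix, the adapter trains under half a percent of the weights a full fine-tune would update. Gradients and optimizer state shrink in proportion, but the frozen base weights still have to fit in memory, which is the part QLoRA targets by quantizing them.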

4. With BrewSLM

Move from script-level understanding into a platform workflow for ingestion, preflight, training jobs, eval gates, export, and deployment.

Open BrewSLM track

5. Advanced

Go beyond one SFT run with distillation, DPO/ORPO, quantization, multi-task training, serving, drift detection, and production feedback loops.

Open advanced track

Decision guide: SFT vs RAG vs CPT vs preference tuning

Most teams do not need every technique. Use the simplest tool that matches the actual problem.

Use SFT when

You need the model to respond in a better format, tone, task shape, or narrow behavior on data you can label clearly.

Use RAG when

You need fresh or private facts at inference time and the problem is retrieval, not permanently changing the model's behavior.

Use continued pretraining when

The base model does not speak the language of your domain well enough at the token level and SFT alone cannot close the gap.

Use DPO or ORPO when

You already have reasonable outputs and need the model to prefer better responses over worse ones, and you can supply chosen/rejected pairs to train on.
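To make "chosen/rejected data" concrete, here is a minimal sketch of the DPO loss for a single preference pair in plain Python. It assumes you already have the summed log-probabilities of the chosen and rejected responses under both the model being trained (the policy) and a frozen reference model. The numbers and the `beta` value are illustrative. ORPO uses a different, reference-free objective and is not shown.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(beta * margin))."""
    # How far the policy has moved from the reference on each response.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_shift - rejected_shift)
    # Numerically stable form of -log(sigmoid(x)) = log(1 + exp(-x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: no preference learned yet, loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy raises the chosen response and lowers the rejected one: loss drops.
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)

print(baseline, improved)
```

The loss only falls when the policy widens the gap between chosen and rejected responses relative to the reference, which is why the pairs must differ in quality along the dimension you care about.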

What to do next

Start learning

Open the Academy overview if you want the whole curriculum with progress tracking and per-track lesson lists.

Academy overview

Start building

If you already know enough theory, jump back to the product and run the workflow on your own data.

BrewSLM quickstart

Keep a reference nearby

The glossary is useful when you want a quick definition without leaving the learning path.

Glossary of terms