This page is for new and experienced ML or LLM engineers who want to go from “I know the basics” to “I can fine-tune a base model on custom data, evaluate it honestly, and ship a useful domain-specific small language model.”
Different backgrounds need different emphasis. Pick the path that sounds most like you, then follow the linked lessons.
You know Python, but not yet the mechanics of tokens, attention, loss, or why a base model behaves the way it does.
You already prompt and ship models, but want a tighter grasp on LoRA, dataset design, gold sets, and the trade-offs between SFT, RAG, CPT, and DPO.
You already know the theory and want the shortest path from custom data to a local, auditable workflow with repeatable evaluation.
These are the concepts that prevent expensive guesswork later.
Each track exists for a specific stage of engineer maturity. The cleanest path is usually straight through.
Build the vocabulary and mental models behind small language models, tokens, attention, base vs instruct, and the SLM trade-offs.
Open foundations
Learn supervised fine-tuning, loss masks, chat templates, dataset quality, LoRA, GPU memory, evaluation, and when not to fine-tune.
Open SFT fundamentals
Load a base model, build a custom dataset, train with LoRA or QLoRA, evaluate honestly, and ship a runnable artifact in Python.
Open hands-on track
Move from script-level understanding into a platform workflow for ingestion, preflight, training jobs, eval gates, export, and deployment.
Open BrewSLM track
Go beyond one SFT run with distillation, DPO/ORPO, quantization, multi-task training, serving, drift detection, and production feedback loops.
Open advanced track
Most teams do not need every technique. Use the simplest tool that matches the actual problem.
Use SFT when you need the model to respond in a better format, tone, task shape, or narrow behavior on data you can label clearly.
Use RAG when you need fresh or private facts at inference time and the problem is retrieval, not permanently changing the model's behavior.
Use CPT when the base model does not speak the language of your domain well enough at the token level and SFT alone cannot close the gap.
Use DPO when you already have reasonable outputs and need the model to prefer better responses over worse ones using chosen/rejected data.
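The preference-tuning case above depends on chosen/rejected data, which has a different shape from the labeled examples used for supervised fine-tuning. The sketch below shows one way the two record types might look side by side. The field names (`prompt`, `response`, `chosen`, `rejected`), the sample text, and the `is_valid_preference` helper are illustrative assumptions, not a BrewSLM schema.

```python
# Hypothetical record shapes for illustration only; field names are assumptions.

# Supervised fine-tuning: one prompt, one target answer you can label clearly.
sft_example = {
    "prompt": "Summarize the ticket in one sentence.",
    "response": "Customer cannot reset their password from the mobile app.",
}

# Preference tuning: the same prompt with a better and a worse answer.
preference_example = {
    "prompt": "Summarize the ticket in one sentence.",
    "chosen": "Customer cannot reset their password from the mobile app.",
    "rejected": "The customer has a problem.",
}


def is_valid_preference(record: dict) -> bool:
    """Return True when a record has a prompt and two distinct, non-empty answers."""
    required = ("prompt", "chosen", "rejected")
    # Any missing or blank field makes the record unusable as a preference pair.
    if any(not str(record.get(key, "")).strip() for key in required):
        return False
    # Identical answers carry no preference signal.
    return str(record["chosen"]).strip() != str(record["rejected"]).strip()
```

A check like this can run before training to drop pairs that carry no preference signal, for example records where the chosen and rejected answers are identical.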
Open the Academy overview if you want the whole curriculum with progress tracking and per-track lesson lists.
Academy overview
If you already know enough theory, jump back to the product and run the workflow on your own custom data.
BrewSLM quickstart
The glossary is useful when you want a quick definition without leaving the learning path.
Glossary of terms