Training config reference: the brewslm.yaml manifest
After this reference you can read a brewslm.yaml manifest section by section, identify which knob controls which behaviour, and know which fields are safe to edit by hand versus generated.
Track 3's narrative lessons described what BrewSLM does. This reference describes how a project is configured at the schema level. The canonical artifact is brewslm.yaml — a project-as-code manifest the platform reads, validates, diffs, and applies. Every training run is the result of resolving this file against the project's data. Knowing its sections gives you the keys to the platform.
Manifest envelope
Every brewslm.yaml opens with a fixed envelope:
api_version: brewslm/v1
kind: Project
metadata:
  name: my-project
  description: "..."
  labels: { team: "ml", env: "dev" }
spec:
  # ten sections, all optional, all defaulted
The schema is strict-extra-forbid: unknown keys fail validation. Typos and stale fields are caught at apply time, not as silent no-ops. Schema drift is therefore visible by design.
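The strict behaviour can be illustrated with a minimal, stdlib-only sketch. The key names are the ones listed in this reference; the function `find_unknown_keys` is hypothetical and only approximates what a real extra-forbid schema does (it checks the envelope and the section names, not the fields inside each section).

```python
# Hypothetical sketch: the key sets come from this reference, the function does not.
ENVELOPE_KEYS = {"api_version", "kind", "metadata", "spec"}
SPEC_SECTIONS = {
    "workflow", "blueprint", "domain", "model", "data_sources",
    "adapters", "training_plan", "eval_pack", "export", "deployment",
}


def find_unknown_keys(manifest: dict) -> list:
    """Return dotted paths of keys a strict (extra-forbid) schema would reject."""
    unknown = [key for key in manifest if key not in ENVELOPE_KEYS]
    spec = manifest.get("spec") or {}
    unknown += [f"spec.{key}" for key in spec if key not in SPEC_SECTIONS]
    return sorted(unknown)


manifest = {
    "api_version": "brewslm/v1",
    "kind": "Project",
    "metadata": {"name": "my-project"},
    "spec": {"workflow": {}, "trainig_plan": {}},  # typo on purpose
}
print(find_unknown_keys(manifest))  # ['spec.trainig_plan']
```

A misspelled section name is reported instead of being ignored, which is the property the strict schema is meant to guarantee.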
spec.workflow — project-level toggles
Cross-cutting settings that don't belong to any single stage.
workflow:
  beginner_mode: false
  pipeline_stage: ingestion
  target_profile_id: vllm_server
  training_preferred_plan_profile: balanced  # safe | balanced | max_quality
  gate_policy: {}
  budget_settings: {}
training_preferred_plan_profile is the hyperparameter preset for the next launch. Three values: safe (conservative), balanced (default; right for most projects), max_quality (longer training, more aggressive knobs).
target_profile_id defines the export/deploy target shape: vllm_server, cpu_local, etc.
beginner_mode makes Coach Mode more proactive and disables a few destructive UI actions.
spec.blueprint — domain blueprint
The "what is this project actually for" section. Optional, but if you fill it in, Coach suggestions and eval-pack auto-gates use the structured intent.
blueprint:
  domain_name: "Customer support"
  problem_statement: "Classify intent of inbound tickets"
  target_user_persona: "Tier-1 agents"
  task_family: instruction_sft
  input_modality: text
  expected_output_schema: { ... }
  expected_output_examples: [ ... ]
  safety_compliance_notes: []
  deployment_target_constraints: { ... }
  success_metrics:
    - { metric_id: macro_f1, label: "Intent macro F1", target: "min 0.90" }
  glossary:
    - { term: "P1", plain_language: "Highest-priority ticket", category: "ops" }
  confidence_score: 0.0
  unresolved_assumptions: []
The blueprint is round-tripped, not executed. It anchors reviews ("did the model actually do what we said it should?") and feeds the eval-pack auto-suggestions.
spec.domain — domain pack + profile
domain:
  pack_id: customer-support
  profile_id: ticket-intent
A domain pack ships pre-built defaults (synth playbooks, eval prompts, blueprint hints) for a vertical. Profiles narrow further. Both are pointers; the platform resolves them on apply.
spec.model — base model selection
model:
  base_model: HuggingFaceTB/SmolLM2-135M-Instruct
  cache_fingerprint: "sha256:..."
  source_ref: "hf:..."
  registry_id: 42
Only base_model is authoritative. Everything else is informational and re-derived on apply. Changing base_model in the manifest is a real change — apply will note it and your next training run will start from a different base.
spec.data_sources — registered datasets
data_sources:
  - name: tickets-train
    type: jsonl
    description: "Q3 labeled tickets"
    record_count: 4812
    file_path: data/projects/3/sources/tickets-train.jsonl
    metadata: { schema_hash: "..." }
    versions:
      - { version: 1, record_count: 4812, file_path: "..." }
Per-source versions entries let manifest apply detect "the underlying file changed" vs "a new source was added" — distinguishing resumable training from "you need to re-import."
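One way to picture that distinction is the sketch below. It is an illustration under assumptions, not the platform's logic: the function `classify_source`, the three return labels, and the choice to compare `version`, `record_count`, and `file_path` are all hypothetical, and the file paths in the example are made up.

```python
def classify_source(manifest_source: dict, registered_latest: dict) -> str:
    """Classify one data_sources entry against already-registered sources.

    registered_latest maps a source name to its latest known version entry.
    Returns "new_source", "file_changed", or "unchanged" (hypothetical labels).
    """
    name = manifest_source["name"]
    if name not in registered_latest:
        return "new_source"
    # Compare the newest version listed in the manifest with what is registered.
    latest = max(manifest_source["versions"], key=lambda entry: entry["version"])
    known = registered_latest[name]
    compared = ("version", "record_count", "file_path")
    if all(latest.get(key) == known.get(key) for key in compared):
        return "unchanged"
    return "file_changed"


# Hypothetical project state and manifest entries.
registered = {
    "tickets-train": {"version": 1, "record_count": 4812, "file_path": "v1.jsonl"},
}
same = {"name": "tickets-train", "versions": [dict(registered["tickets-train"])]}
grown = {
    "name": "tickets-train",
    "versions": [
        {"version": 1, "record_count": 4812, "file_path": "v1.jsonl"},
        {"version": 2, "record_count": 5100, "file_path": "v2.jsonl"},
    ],
}
added = {"name": "tickets-eval", "versions": [{"version": 1, "record_count": 300}]}

for source in (same, grown, added):
    print(source["name"], classify_source(source, registered))
```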
spec.adapters — Adapter Studio entries
adapters:
  - name: ticket-extractor
    version: 1
    status: active
    base_adapter_id: default-canonical
    task_profile: instruction_sft
    source_type: raw
    source_ref: "..."
    field_mapping: { input: question, output: answer }
    adapter_config: { ... }
    output_contract: { ... }
Adapters are the data-transformation layer between raw rows and the training-ready shape. The schema is intentionally open (adapter_config + output_contract are free-form dicts) because the adapter class is what defines the contract.
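As a rough illustration of that transformation layer, the sketch below applies the `field_mapping` from the example to one raw row. It assumes the mapping reads as training-ready key to raw column name, which this reference does not state; `apply_field_mapping` and the sample row are hypothetical.

```python
def apply_field_mapping(row: dict, field_mapping: dict) -> dict:
    """Rename raw columns to training-ready keys.

    Assumes field_mapping is {training_ready_key: raw_column_name}.
    Columns that are not mapped are dropped.
    """
    missing = [source for source in field_mapping.values() if source not in row]
    if missing:
        raise KeyError(f"raw row is missing mapped columns: {missing}")
    return {target: row[source] for target, source in field_mapping.items()}


raw_row = {"question": "Where is my refund?", "answer": "billing_refund", "ticket_id": 17}
example = apply_field_mapping(raw_row, {"input": "question", "output": "answer"})
print(example)  # {'input': 'Where is my refund?', 'output': 'billing_refund'}
```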
spec.training_plan — what the trainer actually executes
The section most directly tied to "training" — the recipe and the resolved hyperparameters.
training_plan:
  training_mode: sft              # sft | kd
  plan_profile: balanced          # safe | balanced | max_quality
  preferred_runtime_id: simulate  # simulate | external | cloud_burst | ...
  config: { ... }                 # flat dict: the resolved TrainingConfig
training_mode selects the trainer. sft is the default (Track 2 by-hand pipeline). kd selects the Knowledge Distillation trainer (Track 4, Lesson 4.3: the alpha*CE + (1-alpha)*T^2*KL loss against captured teacher logprobs).
plan_profile is the hyperparameter preset. It maps to a concrete TrainingConfig (learning rate, batch size, epochs, LoRA knobs), the same knobs you set by hand in Track 2 Lesson 2.5.
preferred_runtime_id picks where the run lands: simulate for tests, external for the subprocess runtime.
config is a flat dict holding the actual resolved hyperparameters. Manifest validation checks it against the TrainingConfig schema at apply time.
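The loss quoted for kd can be made concrete with a small stdlib-only sketch for a single example. It assumes the teacher logprobs cover the same full set of classes as the student logits, that the temperature T is applied to both sides, and that KL is taken from teacher to student; this reference confirms none of those details, and the name `kd_loss` is hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def kd_loss(student_logits, teacher_logprobs, label, alpha=0.5, temperature=2.0):
    """alpha*CE + (1-alpha)*T^2*KL for one example (assumed conventions)."""
    # Hard-label cross-entropy, computed at temperature 1.
    ce = -log_softmax(student_logits)[label]
    # Softened distributions. Log-probabilities are valid logits because
    # softmax is invariant to adding a constant to every entry.
    student_soft = log_softmax(student_logits, temperature)
    teacher_soft = log_softmax(teacher_logprobs, temperature)
    # KL(teacher || student) over the softened distributions.
    kl = sum(math.exp(t) * (t - s) for t, s in zip(teacher_soft, student_soft))
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl


teacher = log_softmax([0.2, 2.5, 0.1])  # stand-in for captured teacher logprobs
print(kd_loss([1.0, 2.0, 0.5], teacher, label=1, alpha=0.5, temperature=2.0))
```

With alpha set to 1 the expression reduces to plain cross-entropy, and with alpha set to 0 it vanishes when the student reproduces the teacher distribution exactly.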
Where the actual hyperparameter values live
The manifest carries plan_profile as a name; the platform resolves that to a real TrainingConfig dict (learning rate, batch size, epochs, LoRA r/alpha/dropout, target modules, scheduler, warmup ratio, precision). The dict ends up in training_plan.config at apply time. Edit the dict to override individual fields; change plan_profile to flip the whole preset. The profile values themselves evolve with the recipe library — treat the named profile as the stable contract, the resolved config as the snapshot.
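The relationship between the named profile and the resolved dict can be sketched as follows. Everything concrete here is an assumption made for illustration: the key names (`learning_rate`, `num_epochs`, `lora_r`), the numbers, and the function `resolve_training_config` are hypothetical, since the real presets live in the recipe library and change over time.

```python
from typing import Optional

# Placeholder values for illustration only, not the platform's presets.
HYPOTHETICAL_PROFILES = {
    "safe": {"learning_rate": 1e-5, "num_epochs": 1, "lora_r": 8},
    "balanced": {"learning_rate": 2e-5, "num_epochs": 3, "lora_r": 16},
    "max_quality": {"learning_rate": 2e-5, "num_epochs": 5, "lora_r": 32},
}


def resolve_training_config(plan_profile: str, overrides: Optional[dict] = None) -> dict:
    """Expand a profile name into a flat config, then apply per-field overrides."""
    if plan_profile not in HYPOTHETICAL_PROFILES:
        raise ValueError(f"unknown plan_profile: {plan_profile!r}")
    resolved = dict(HYPOTHETICAL_PROFILES[plan_profile])  # copy, keep the preset intact
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(resolved))
    if unknown:
        # Mirrors the strict schema: an unrecognised field is an error, not a no-op.
        raise ValueError(f"unknown config fields: {unknown}")
    resolved.update(overrides)
    return resolved


snapshot = resolve_training_config("balanced", {"num_epochs": 4})
print(snapshot)  # {'learning_rate': 2e-05, 'num_epochs': 4, 'lora_r': 16}
```

Overriding one field leaves the rest of the preset untouched, while switching the profile name replaces the whole snapshot.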
spec.eval_pack — what counts as "good"
eval_pack:
  pack_id: evalpack.general.default
  datasets: [gold_dev, gold_test]
  eval_types: [exact_match, f1, hallucination, safety]
  extra: {}
The pack referenced here declares the gates (Lesson 3.13's reference). datasets names the held-out sets that get evaluated; eval_types selects which scorers run. Default datasets are gold_dev + gold_test — the platform's standard names for the dev/test splits.
spec.export — artifact production
export:
  formats: [gguf, safetensors]
  quantization: Q4_K_M
  extra: {}
This section maps straight onto Track 4 Lesson 4.6: pick the formats you'll deploy. Multiple formats can be produced from one trained model in a single export stage.
spec.deployment — endpoint target hints
deployment:
  target_profile_id: vllm_server
  extra: {}
The target_profile_id here overrides the project-level default in workflow.target_profile_id for the deploy stage. extra carries serving-stack-specific knobs (vLLM tensor parallelism, llama.cpp threads, etc.).
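The precedence rule is small enough to sketch directly. The helper name `effective_target_profile` is hypothetical, and the sketch assumes an unset deployment value simply falls back to the workflow value.

```python
from typing import Optional


def effective_target_profile(spec: dict) -> Optional[str]:
    """Pick the deploy-stage target: deployment overrides workflow when set."""
    deployment_target = (spec.get("deployment") or {}).get("target_profile_id")
    workflow_target = (spec.get("workflow") or {}).get("target_profile_id")
    return deployment_target or workflow_target


spec = {
    "workflow": {"target_profile_id": "vllm_server"},
    "deployment": {"target_profile_id": "cpu_local"},
}
print(effective_target_profile(spec))  # cpu_local
```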
What apply does with all of this
The manifest apply service takes the parsed brewslm.yaml plus the project's current state and produces a ManifestApplyPlan: a list of ManifestApplyAction rows. Each row declares a target (project / blueprint / data_source / adapter / training_plan), an operation (create / update / noop / delete), the before/after, and the fields that changed. Apply never silently drops a field: every actionable diff is reported, every unknown key is rejected, and every conflict is surfaced as a warning.
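The shape of one plan row can be sketched with a dataclass and a small diff function. The class name matches the one above, but its field names and the helper `plan_action` are assumptions; the real service also handles warnings and conflicts, which this sketch omits.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ManifestApplyAction:
    """One row of the plan (field names are assumed, not the platform's)."""
    target: str                # project | blueprint | data_source | adapter | training_plan
    operation: str             # create | update | noop | delete
    before: Optional[dict]
    after: Optional[dict]
    changed_fields: list = field(default_factory=list)


def plan_action(target: str, before: Optional[dict], after: Optional[dict]) -> ManifestApplyAction:
    """Diff current state (before) against the manifest (after) for one target."""
    if before is None and after is None:
        raise ValueError("nothing to compare")
    if before is None:
        return ManifestApplyAction(target, "create", None, after, sorted(after))
    if after is None:
        return ManifestApplyAction(target, "delete", before, None, sorted(before))
    changed = sorted(
        key for key in set(before) | set(after) if before.get(key) != after.get(key)
    )
    operation = "update" if changed else "noop"
    return ManifestApplyAction(target, operation, before, after, changed)


current = {"training_mode": "sft", "plan_profile": "balanced"}
desired = {"training_mode": "sft", "plan_profile": "max_quality"}
action = plan_action("training_plan", current, desired)
print(action.operation, action.changed_fields)  # update ['plan_profile']
```

Emitting an explicit noop row, rather than skipping unchanged targets, keeps the plan a complete record of what apply looked at.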
Key idea
brewslm.yaml is the catalog of every knob a project has. Read it section by section and you have read the platform's training contract. The schema is versioned (api_version: brewslm/v1) and strict (unknown keys are errors) — when the contract evolves, the next apply tells you. Edit by hand for surgical overrides; let the platform regenerate for ordinary changes.
The next reference catalogues the audit-spine surfaces that watch what apply triggers: the RunEvent taxonomy and the Coach Mode action catalogue.
Key terms
brewslm.yaml
- The project-as-code manifest. Versioned (brewslm/v1), strict-extra-forbid, validates / diffs / applies against project state.
Manifest spec sections
- The ten sections of spec: workflow, blueprint, domain, model, data_sources, adapters, training_plan, eval_pack, export, deployment.
Training plan profile
- Named hyperparameter preset (safe / balanced / max_quality) that resolves to a concrete TrainingConfig at apply time.
Training mode
- training_plan.training_mode: sft for supervised fine-tuning, kd for knowledge distillation (Track 4).
Eval pack id
- Registered pack reference like evalpack.general.default; the pack declares the gates (Lesson 3.13).
ManifestApplyPlan
- The before/after action list produced by manifest apply: every create / update / noop / delete is declared, every unknown key rejected.