First-Class Training Pipeline

BrewSLM pipeline: raw data to shipped SLM

  1. Ingest: HF / Kaggle / URL / local
  2. Adapter: map schema to canonical records
  3. Contract: check task/data compatibility
  4. Model Select v2: introspection + memory fit
  5. Preflight: runtime/dependency guardrails
  6. Train + Benchmark: local or cloud burst
  7. Export + Serve: artifact sync + deployment

Built for Engineers

Make SLM development look like a stable software pipeline, not a notebook maze.

BrewSLM is a full stack for data contracts, capability checks, benchmark-driven model selection, multimodal runtime paths, and managed cloud burst jobs. You can stay in guided mode or drop into advanced controls at any stage.

  • Reproducible steps, explicit contracts, fewer silent failures
  • Works with unstructured, tabular, and instruction-style data
  • One-click path for speed, deep controls for specialists

Core Product

Everything needed to ship practical SLM systems

Universal Data Ingestion

Bring local files, Hugging Face, Kaggle, or URL sources. Normalize with adapters, map fields safely, and validate shape contracts before training.
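A field-mapping adapter of this kind could be sketched as follows. This is a minimal illustration, not BrewSLM's actual schema: the `FIELD_MAP` and the canonical `input`/`output` field names are assumptions.

```python
# Hypothetical sketch: map a source record's fields onto a canonical
# {"input", "output"} training record, failing fast on missing fields.
FIELD_MAP = {"question": "input", "answer": "output"}  # assumed mapping

def to_canonical(record: dict, field_map: dict = FIELD_MAP) -> dict:
    canonical = {}
    for src, dst in field_map.items():
        if src not in record:
            raise KeyError(f"missing required field: {src}")
        canonical[dst] = str(record[src]).strip()
    return canonical

row = {"question": "What is an SLM?", "answer": "A small language model."}
print(to_canonical(row))
```

Validating every record through one function like this is what makes the shape contract enforceable before training starts.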

Preflight + Capability Contracts

Catch incompatibilities early across model architecture, task type, runtime modality, dependencies, and dataset fit.
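A capability contract check reduces to comparing a requested task/modality against what the model declares. A minimal sketch, with an illustrative model name and capability table rather than BrewSLM's real registry:

```python
# Hypothetical capability-contract check; the registry below is made up.
MODEL_CAPS = {
    "tiny-lm": {"tasks": {"causal-lm"}, "modalities": {"text"}},
}

def preflight(model: str, task: str, modality: str) -> list:
    """Return a list of contract violations; an empty list means safe to launch."""
    caps = MODEL_CAPS.get(model)
    if caps is None:
        return [f"unknown model: {model}"]
    errors = []
    if task not in caps["tasks"]:
        errors.append(f"{model} does not support task '{task}'")
    if modality not in caps["modalities"]:
        errors.append(f"{model} does not support modality '{modality}'")
    return errors

print(preflight("tiny-lm", "causal-lm", "vision"))
```

Returning all violations at once, rather than failing on the first, is what lets a preflight report every problem before a long job is queued.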

Model Selection v2

Combine curated defaults with model introspection metadata and benchmark context to make safer base-model picks.
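A memory-fit check can be as simple as the heuristic below. This is a rough back-of-envelope sketch, not BrewSLM's actual formula: the overhead factor for optimizer state and activations is an assumption.

```python
# Rough memory-fit heuristic: parameter count x bytes per parameter,
# scaled by an assumed overhead factor, compared against available VRAM.
def fits_in_memory(params_billion: float, bytes_per_param: int,
                   vram_gb: float, overhead: float = 1.3) -> bool:
    # 1e9 params at 1 byte each is roughly 1 GB, so units cancel neatly.
    needed_gb = params_billion * bytes_per_param * overhead
    return needed_gb <= vram_gb

# A 3B model in fp16 (2 bytes/param) on a 24 GB GPU:
print(fits_in_memory(3.0, 2, 24.0))
```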

Real Benchmark Mode

Run short benchmark jobs on sampled project data, persist results, and feed ranking decisions back into recommendations.
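Feeding persisted results back into recommendations is, at its core, a ranking step. An illustrative sketch with made-up scores:

```python
# Illustrative sketch of turning benchmark results into a ranking;
# the score dictionary is fabricated for the example.
def rank_candidates(scores: dict) -> list:
    """Order candidate models by benchmark score, best first."""
    return sorted(scores, key=scores.get, reverse=True)

scores = {"model-a": 0.71, "model-b": 0.64, "model-c": 0.78}
print(rank_candidates(scores))
```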

Multimodal Runtime Path

Text-first by default, with beta pathways for vision-language and audio-text training plus strict media preflight checks.

Cloud Burst Jobs

Managed lifecycle for remote training: submit, status, cancel, logs, and artifact sync to keep workflows consistent across local and cloud.
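The submit/status/cancel lifecycle above can be pictured as a small state machine. This sketch involves no real cloud API; the states and class names are illustrative.

```python
# Minimal state-machine sketch of a managed remote-training job.
from enum import Enum

class JobState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    DONE = "done"
    CANCELLED = "cancelled"

class BurstJob:
    def __init__(self, job_id: str):
        self.job_id = job_id
        self.state = JobState.SUBMITTED

    def status(self) -> str:
        return self.state.value

    def cancel(self) -> bool:
        # Only in-flight jobs can be cancelled; finished ones are immutable.
        if self.state in (JobState.SUBMITTED, JobState.RUNNING):
            self.state = JobState.CANCELLED
            return True
        return False

job = BurstJob("job-123")
job.cancel()
print(job.status())
```

Modeling the lifecycle explicitly is what keeps local and cloud runs consistent: both paths move through the same states.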

Product Flow

From plain intent to exportable model in five steps

01. Describe goal in plain language

BrewSLM interprets intent, suggests rewrites, and builds a safe starter plan.

02. Connect and normalize data

Adapters map diverse schemas into training-ready records with quality checks.

03. Run preflight and benchmark

Validate capability contracts and compare candidate models on your sample.

04. Launch training locally or burst to cloud

Track logs, checkpoints, and metrics in one consistent control surface.

05. Export and serve

Package artifacts for deployment paths and iterate quickly with grounded feedback.

FAQ

Questions teams ask before adopting BrewSLM

Is BrewSLM only for text datasets?

No. Text is the default path, but multimodal adapters and runtime checks support vision-language and audio-text workflows.

Can beginners use this without deep ML expertise?

Yes. The Wizard maps plain intent to safe presets, guardrails block risky launches, and one-click runs handle most setup complexity.

Do I lose control if I use autopilot?

Not at all. You can inspect the generated config JSON, override any setting, and move to advanced panels at any step.
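Inspecting and overriding the generated config might look like the snippet below. The key names (`base_model`, `learning_rate`, `epochs`) are made up for illustration; BrewSLM's actual config schema may differ.

```python
# Hypothetical flow: load the generated config JSON, override one
# field, and serialize it back before launch.
import json

generated = '{"base_model": "tiny-lm", "learning_rate": 0.0002, "epochs": 3}'
config = json.loads(generated)
config["learning_rate"] = 1e-4  # manual override before launch
print(json.dumps(config, sort_keys=True))
```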

Quickstart

Get a full pipeline run locally in minutes

Use this baseline sequence to spin up backend, frontend, and worker, then launch your first training run through the Wizard or Advanced panel.

$ git clone <your-repo> __SLM__
$ cd __SLM__ && docker-compose up -d
$ cd backend && .venv/bin/celery -A app.worker worker -l info
$ cd frontend && npm run dev

  • The default flow emphasizes safety over raw speed.
  • Use `config_name` for multi-config HF datasets.
  • Run preflight before long jobs or cloud burst runs.
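The `config_name` tip corresponds to the second positional argument of Hugging Face's `datasets.load_dataset(path, name)`. A sketch of the guard such a tip implies, with an illustrative local registry and no network call:

```python
# Sketch of a multi-config guard; the KNOWN_CONFIGS registry is
# illustrative. With the Hugging Face `datasets` library, the
# resolved tuple would become load_dataset(path, config_name).
KNOWN_CONFIGS = {"glue": {"mrpc", "sst2", "cola"}}

def resolve_dataset(path: str, config_name: str = None) -> tuple:
    configs = KNOWN_CONFIGS.get(path, set())
    if configs and config_name is None:
        raise ValueError(f"{path} has multiple configs; pass config_name")
    if configs and config_name not in configs:
        raise ValueError(f"unknown config '{config_name}' for {path}")
    return (path,) if config_name is None else (path, config_name)

print(resolve_dataset("glue", "mrpc"))
```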

Run BrewSLM Pipeline