Universal Data Ingestion
Bring local files, Hugging Face, Kaggle, or URL sources. Normalize with adapters, map fields safely, and validate shape contracts before training.
First-Class Training Pipeline
Built for Engineers
BrewSLM is a full stack for data contracts, capability checks, benchmark-driven model selection, multimodal runtime paths, and managed cloud burst jobs. You can stay in guided mode or drop into advanced controls at any stage.
Core Product
Catch incompatibilities early across model architecture, task type, runtime modality, dependencies, and dataset fit.
Combine curated defaults with model introspection metadata and benchmark context to make safer base-model picks.
Run short benchmark jobs on sampled project data, persist results, and feed ranking decisions back into recommendations.
Text-first by default, with beta pathways for vision-language and audio-text training plus strict media preflight checks.
Managed lifecycle for remote training: submit, status, cancel, logs, and artifact sync to keep workflows consistent across local and cloud.
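The ingestion and contract ideas above can be sketched in a few lines. Everything here (`FIELD_MAP`, `normalize`, `validate_shape`, the canonical field names) is illustrative, not BrewSLM's actual API: the point is that adapters map source schemas onto one canonical record shape, and a shape contract rejects bad records before training.

```python
# Hypothetical sketch: normalize a raw record through a field map, then
# validate a simple shape contract before it reaches training.
FIELD_MAP = {"question": "prompt", "answer": "completion"}  # source -> canonical
REQUIRED_FIELDS = {"prompt": str, "completion": str}

def normalize(raw: dict) -> dict:
    """Map source field names onto the canonical training schema."""
    return {FIELD_MAP.get(k, k): v for k, v in raw.items()}

def validate_shape(record: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the record passes."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    return errors

record = normalize({"question": "What is 2+2?", "answer": "4"})
assert validate_shape(record) == []
```

Running the contract over every record up front is what lets incompatibilities surface before a long job starts rather than partway through it.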
Product Flow
1. BrewSLM interprets intent, suggests rewrites, and builds a safe starter plan.
2. Adapters map diverse schemas into training-ready records with quality checks.
3. Validate capability contracts and compare candidate models on your sample.
4. Track logs, checkpoints, and metrics in one consistent control surface.
5. Package artifacts for deployment paths and iterate quickly with grounded feedback.
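The "compare candidate models on your sample" step boils down to scoring each candidate on sampled data, persisting the scores, and ranking. A minimal sketch, assuming hypothetical model names and a plain JSON file as the persistence layer (not BrewSLM's real storage):

```python
import json

def persist_results(results: dict, path: str) -> None:
    """Persist benchmark scores so later recommendation runs can reuse them."""
    with open(path, "w") as f:
        json.dump(results, f)

def rank(results: dict) -> list[str]:
    """Highest score first; ties broken alphabetically for stable output."""
    return sorted(results, key=lambda m: (-results[m], m))

scores = {"model-a": 0.71, "model-b": 0.83, "model-c": 0.83}
assert rank(scores) == ["model-b", "model-c", "model-a"]
```

Persisting scores keyed by model and dataset sample is what allows the ranking to feed back into future recommendations instead of being recomputed every run.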
FAQ
Is BrewSLM limited to text models?
No. Text is the default path, but multimodal adapters and runtime checks support vision-language and audio-text workflows.
Can I launch a run without deep ML expertise?
Yes. The Wizard maps plain intent to safe presets, guardrails block risky launches, and one-click runs handle most setup complexity.
Does guided mode lock me out of advanced control?
Not at all. You can inspect the generated config JSON, override any setting, and move to advanced panels at any step.
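Overriding a generated config amounts to merging your own keys over the preset. The keys below are hypothetical, not BrewSLM's actual config schema; the sketch only shows that a guided preset is plain JSON you can inspect and selectively override before launch.

```python
import json

# Hypothetical generated preset and user overrides -- later keys win on merge.
generated = {"base_model": "small-llm", "lr": 2e-4, "epochs": 3}
overrides = {"lr": 1e-4, "gradient_checkpointing": True}

config = {**generated, **overrides}
print(json.dumps(config, indent=2))
assert config["lr"] == 1e-4        # override applied
assert config["epochs"] == 3       # untouched preset value survives
```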
Quickstart
Use this baseline sequence to spin up the backend, frontend, and worker, then launch your first training run through the Wizard or the Advanced panel.
$ git clone <your-repo> __SLM__
$ cd __SLM__ && docker-compose up -d
$ cd backend && .venv/bin/celery -A app.worker worker -l info
$ cd frontend && npm run dev
The default flow emphasizes safety over raw speed.
Use `config_name` for multi-config HF datasets.
Use preflight before long jobs or cloud burst runs.
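A preflight check is just a set of cheap local validations run before submitting a long or cloud-burst job. The specific checks and thresholds below are hypothetical, shown only to illustrate the shape of the idea:

```python
def preflight(dataset_rows: int, free_gb: float,
              min_rows: int = 100, min_free_gb: float = 5.0) -> list[str]:
    """Return blocking issues found before job submission (empty = safe to launch)."""
    issues = []
    if dataset_rows < min_rows:
        issues.append(f"dataset too small: {dataset_rows} < {min_rows}")
    if free_gb < min_free_gb:
        issues.append(f"insufficient disk: {free_gb:.1f} GB < {min_free_gb} GB")
    return issues

assert preflight(10_000, 50.0) == []  # healthy inputs pass cleanly
```

Failing fast on checks like these costs seconds; discovering the same problems mid-run on a cloud job costs the whole job.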