Track 2 · Hands-on · Lesson 5

A minimal LoRA fine-tune with the Trainer

After this lesson you can wrap a model in LoRA, configure TrainingArguments, run the Trainer to fine-tune, and save the adapter — the core of a by-hand SFT run.

Level: intermediate · Read time: ~10 min · Prerequisites: Tokenize and collate: model-ready batches with a loss mask

Everything so far has been setup. This is the training itself — and it's shorter than you might expect, because the Trainer implements the loop from Track 1 (forward, loss, backward, optimizer step) for us. We add LoRA so we update a tiny adapter instead of all 135M parameters.

Attach the LoRA adapter

Recall Track 1's LoRA knobs: rank r, alpha, dropout, and which modules to adapt. We freeze the base and wrap it:

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # the attention query/value projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# e.g. trainable params: ~0.4M || all params: ~135M || trainable%: ~0.3

That print_trainable_parameters() line is worth pausing on: you're training well under 1% of the model. Gradients and optimizer state are kept only for those adapter weights, which is why LoRA fits on a small GPU (Track 1's memory lesson).
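To see where a number like that comes from, here is a rough sketch of the arithmetic. LoRA adds two small matrices per adapted linear layer, A (r × d_in) and B (d_out × r), so each layer contributes r · (d_in + d_out) trainable parameters. The hidden size and layer count below are made-up illustrative values, not the dimensions of the model used in this lesson, and lora_param_count is a hypothetical helper, not part of peft.

```python
def lora_param_count(shapes, r):
    """Trainable LoRA parameters for a list of (d_in, d_out) linear layers."""
    # Each adapted layer gets A (r x d_in) and B (d_out x r).
    return sum(r * (d_in + d_out) for d_in, d_out in shapes)

# Hypothetical model: 12 transformer blocks, hidden size 512,
# adapting q_proj and v_proj (both assumed 512 -> 512 here).
hidden = 512
n_layers = 12
per_block = [(hidden, hidden), (hidden, hidden)]

trainable = lora_param_count(per_block * n_layers, r=16)
print(trainable)  # 393216
```

Doubling r doubles the count, and every extra entry in target_modules adds another term per block.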

Configure the run

The hyperparameters are the ones you studied in Track 1 — learning rate, epochs, effective batch size (per-device batch × gradient accumulation), bf16, and how often to log/save.

from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    output_dir="sft-out",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,        # effective batch = 16
    learning_rate=2e-4,                   # LoRA tolerates a higher LR
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,                            # needs bf16-capable hardware; otherwise try fp16=True
    logging_steps=5,
    eval_strategy="epoch",                # older transformers: evaluation_strategy
    save_strategy="epoch",
    report_to=[],                         # no external loggers
)
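As a sanity check on these settings, the sketch below works out the effective batch size, the approximate number of optimizer steps, and the warmup length. The dataset size of 1000 examples is an assumption for illustration, and run_plan is a hypothetical helper. The Trainer's exact step count can differ slightly depending on the library version and how it treats a partial final batch.

```python
import math

def run_plan(n_examples, per_device_batch, grad_accum, epochs,
             warmup_ratio, n_devices=1):
    """Approximate step counts implied by a set of training arguments."""
    # One optimizer step consumes this many examples.
    effective_batch = per_device_batch * grad_accum * n_devices
    # Count a partial final batch as one more step (an approximation).
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, steps_per_epoch, total_steps, warmup_steps

# Values from the TrainingArguments above, with an assumed 1000 examples.
print(run_plan(1000, per_device_batch=8, grad_accum=2, epochs=3,
               warmup_ratio=0.03))
# (16, 63, 189, 6)
```

With logging_steps=5, a run of this size would log a few dozen loss values, which is enough to see a trend.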

Train

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_tok,
    eval_dataset=val_tok,
    data_collator=collator,
    processing_class=tok,                 # older transformers: tokenizer=tok
)

trainer.train()
model.save_pretrained("sft-out/adapter")  # saves just the small LoRA adapter
tok.save_pretrained("sft-out/adapter")

Key idea

The entire fine-tune is: wrap in LoRA → set TrainingArguments → Trainer.train(). The Trainer runs the gradient-descent loop you learned in Track 1; you supply the data, the collator, and the hyperparameters. save_pretrained writes only the adapter — a few megabytes, not a model copy.
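The "few megabytes" claim is simple arithmetic. This sketch assumes the weights are stored as 32-bit floats (4 bytes each) and ignores file-format overhead. It uses the lesson's approximate figures of ~0.4M adapter parameters and ~135M base parameters; adapter_size_mb is a hypothetical helper.

```python
def adapter_size_mb(n_params, bytes_per_param=4):
    """Rough on-disk size in megabytes, assuming a fixed width per parameter."""
    return n_params * bytes_per_param / 1e6

print(adapter_size_mb(400_000))       # adapter only: 1.6
print(adapter_size_mb(135_000_000))   # full model copy: 540.0
```

Saving in a 16-bit format (bytes_per_param=2) would halve both numbers; the ratio between them stays the same.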

What just happened

The Trainer iterated your training set for 3 epochs, computing cross-entropy on the unmasked completion tokens, backpropagating into the LoRA adapter, and stepping the optimizer with a cosine-decayed learning rate after a short warmup. At the end of each epoch it ran a validation pass and saved a checkpoint. The result is a small adapter that, applied to the frozen base, produces your task's behavior. Next we'll run it and read the logs to judge whether it went well.
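The learning-rate shape described above (a short linear warmup, then cosine decay toward zero) can be sketched in a few lines. This is a plain-Python illustration of the usual formula, not the Trainer's own code; the 189 total steps and 6 warmup steps are hypothetical values for a small run.

```python
import math

def cosine_lr(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Print the rate at a few points in a hypothetical 189-step run.
for s in (0, 3, 6, 100, 189):
    print(s, f"{cosine_lr(s, 189, 6, 2e-4):.2e}")
```

The rate starts at zero, reaches the configured learning_rate when warmup ends, and falls back to roughly zero by the final step.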

Key terms

LoraConfig
peft config for LoRA: rank r, alpha, dropout, target_modules, task_type.
get_peft_model
Wraps a base model with LoRA adapters, freezing the base.
print_trainable_parameters
Reports how few parameters LoRA actually trains.
TrainingArguments
Holds the hyperparameters: LR, epochs, batch, precision, logging/saving.
Trainer
Runs the training loop (forward/loss/backward/step) for you.
save_pretrained
Writes the model — for a LoRA model, just the small adapter.
