What does get_peft_model(model, lora_config) do?

Wraps the base with LoRA adapters and freezes the base

What does Trainer.train() actually run?

The gradient-descent loop: forward, loss, backward, optimizer step

Track 2 · Hands-on · Lesson 5

A minimal LoRA fine-tune with the Trainer

After this lesson you can wrap a model in LoRA, configure TrainingArguments, run the Trainer to fine-tune, and save the adapter — the core of a by-hand SFT run.

Level: intermediate
Read time: ~10 min
Prerequisites: Tokenize and collate: model-ready batches with a loss mask

Everything so far has been setup. This is the training itself — and it's shorter than you might expect, because the Trainer implements the loop from Track 1 (forward, loss, backward, optimizer step) for us. We add LoRA so we update a tiny adapter instead of all 135M parameters.

Attach the LoRA adapter

Recall Track 1's LoRA knobs: rank r, alpha, dropout, and which modules to adapt. We set them in a LoraConfig, then get_peft_model freezes the base weights and wraps the targeted modules with adapters:

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # the attention query/value projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# e.g. trainable params: ~0.4M || all params: ~135M || trainable%: ~0.3

That print_trainable_parameters() line is worth pausing on: you're training well under 1% of the model. This is exactly why LoRA fits on a small GPU (Track 1's memory lesson).
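To see where such a small number comes from, here is a rough sketch of the arithmetic. For each adapted linear layer, LoRA adds two low-rank matrices, one of shape r × d_in and one of shape d_out × r, so it trains r × (d_in + d_out) parameters per layer. The hidden size and layer count below are made-up illustrative values, not read from the lesson's 135M model, and `lora_param_count` is a hypothetical helper, not part of peft.

```python
def lora_param_count(r, shapes, num_layers):
    """Count LoRA parameters.

    shapes: list of (d_in, d_out) pairs, one per adapted linear layer
            inside a single transformer block.
    """
    per_block = sum(r * (d_in + d_out) for d_in, d_out in shapes)
    return per_block * num_layers


# Illustrative dimensions only (assumptions, not the lesson's model).
hidden = 512
num_layers = 12
shapes = [(hidden, hidden), (hidden, hidden)]  # q_proj and v_proj

trainable = lora_param_count(r=16, shapes=shapes, num_layers=num_layers)
print(trainable)  # 393216
```

The count scales linearly with r, so doubling the rank doubles the adapter. In models that use grouped-query attention, v_proj is usually narrower than q_proj, which makes the real count somewhat smaller than this square-matrix estimate.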

Configure the run

The hyperparameters are the ones you studied in Track 1 — learning rate, epochs, effective batch size (per-device batch × gradient accumulation), bf16, and how often to log/save.

from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    output_dir="sft-out",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,        # effective batch = 16
    learning_rate=2e-4,                   # LoRA tolerates a higher LR
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,                            # needs bf16-capable hardware; otherwise try fp16=True
    logging_steps=5,
    eval_strategy="epoch",                # older transformers: evaluation_strategy
    save_strategy="epoch",
    report_to=[],                         # no external loggers
)
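It can help to work out what these settings imply before you launch. The sketch below estimates the effective batch size, the number of optimizer steps, and the warmup length. The dataset size of 1,000 examples is a hypothetical value, `plan_run` is an illustrative helper (not a transformers function), and the Trainer's exact rounding can differ slightly between versions, so treat the results as estimates.

```python
import math


def plan_run(num_examples, per_device_batch, grad_accum, epochs,
             warmup_ratio, num_devices=1):
    """Estimate step counts implied by a set of training arguments."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# 1,000 training examples is an assumption for illustration.
plan = plan_run(num_examples=1000, per_device_batch=8, grad_accum=2,
                epochs=3, warmup_ratio=0.03)
print(plan)
```

With logging_steps=5, a run of this size would log a few dozen times, which is enough points to see the shape of the loss curve.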

Train

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_tok,
    eval_dataset=val_tok,
    data_collator=collator,
    processing_class=tok,                 # older transformers: tokenizer=tok
)

trainer.train()
model.save_pretrained("sft-out/adapter")  # saves just the small LoRA adapter
tok.save_pretrained("sft-out/adapter")

Key idea

The entire fine-tune is: wrap in LoRA → set TrainingArguments → Trainer.train(). The Trainer runs the gradient-descent loop you learned in Track 1; you supply the data, the collator, and the hyperparameters. save_pretrained writes only the adapter — a few megabytes, not a model copy.
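If the loop feels abstract, a toy version in plain Python may help. This is not how the Trainer is implemented: it fits a single weight w so that w · x ≈ y, with a hand-derived gradient, and it applies one optimizer step per `accum_steps` examples to mimic gradient accumulation. The data, learning rate, and the `train_toy` name are all illustrative assumptions.

```python
def train_toy(xs, ys, lr=0.05, epochs=50, accum_steps=2):
    """Fit y = w * x by gradient descent with gradient accumulation."""
    w = 0.0
    loss = 0.0
    for _ in range(epochs):
        grad_sum = 0.0
        for i, (x, y) in enumerate(zip(xs, ys), start=1):
            pred = w * x                    # forward
            loss = (pred - y) ** 2          # loss (squared error)
            grad = 2 * (pred - y) * x       # backward: d(loss)/dw
            grad_sum += grad / accum_steps  # accumulate the averaged gradient
            if i % accum_steps == 0:
                w -= lr * grad_sum          # optimizer step
                grad_sum = 0.0
    return w, loss


# Toy data generated from y = 3x, so the loop should recover w close to 3.
w, last_loss = train_toy([1.0, 2.0, 3.0, 4.0], [3.0, 6.0, 9.0, 12.0])
print(w, last_loss)
```

The Trainer does the same four things per micro-batch, with the model's real loss, autograd for the backward pass, and an optimizer in place of the hand-written update.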

What just happened

The Trainer iterated over your training set for 3 epochs: it computed cross-entropy on the unmasked completion tokens, backpropagated into the LoRA adapter only (the base stays frozen), and stepped the optimizer with a learning rate that warms up briefly and then decays along a cosine curve. At the end of each epoch it ran a validation pass and saved a checkpoint. The result is a small adapter that, applied to the frozen base, produces your task's behavior. Next we'll run it and read the logs to judge whether it went well.
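The learning-rate shape described above can be sketched in a few lines. This follows the common linear-warmup-then-half-cosine form. The library's scheduler may differ in small details, and the step counts (189 total, 6 warmup) are hypothetical values chosen for illustration. `lr_at` is an illustrative helper, not a transformers function.

```python
import math


def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run: 189 optimizer steps, 6 of them warmup, peak LR 2e-4.
for step in (0, 3, 6, 50, 100, 189):
    print(step, lr_at(step, total_steps=189, warmup_steps=6, peak_lr=2e-4))
```

The rate starts at zero, reaches the configured learning_rate at the end of warmup, and falls smoothly to zero by the last step, which is why the logged learning rate shrinks as training proceeds.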

Key terms

LoraConfig: The peft configuration for LoRA (rank r, alpha, dropout, target_modules, task_type).
get_peft_model: Wraps a base model with LoRA adapters, freezing the base.
print_trainable_parameters: Reports how few parameters LoRA actually trains.
TrainingArguments: Holds the hyperparameters: LR, epochs, batch, precision, logging/saving.
Trainer: Runs the training loop (forward/loss/backward/step) for you.
save_pretrained: Writes the model — for a LoRA model, just the small adapter.
