What does the LoRA rank r control?

The adapter's capacity (how much change it can express)

What does QLoRA add over plain LoRA?

It quantizes the frozen base (e.g. to 4-bit) to save memory

Which is a sensible minimal target-modules choice?

The attention query/value projections

Track 1 · SFT fundamentals · Lesson 14

LoRA knobs: rank, alpha, dropout, target modules, QLoRA

After this lesson you can configure a LoRA fine-tune: choose the rank and alpha, pick target modules, set dropout, and decide when QLoRA is worth it.

Level: beginner Read time: ~9 min Prerequisites: Full fine-tuning vs LoRA

LoRA has a handful of settings. You can get a good result with defaults, but understanding each knob lets you trade capacity, regularization, and memory deliberately.

Rank (r): the adapter's capacity

The rank r is the inner dimension of LoRA's two small matrices — effectively how much "room" the adapter has to express a change. Small r (e.g. 8) is cheap and often plenty for a narrow task; larger r (16, 32, 64) adds capacity for harder or broader tasks at the cost of more adapter parameters and memory. More is not always better — too much rank on a small dataset just gives the model more room to overfit. A common starting point is r = 16.

Alpha: how strongly the adapter speaks

alpha scales the adapter's contribution before it's added to the frozen weights (the effective scale is roughly alpha / r). A frequent convention is to set alpha = 2 × r (e.g. r=16, alpha=32). If you change r, people often scale alpha with it to keep the effective strength similar. Treat the pair together rather than tuning each blindly.

Target modules: where to put adapters

You choose which weight matrices get an adapter — the target modules. The attention projections (the query/key/value/output matrices from Track 0) are the usual targets; adapting the feed-forward layers too increases capacity and cost. A minimal, common choice is the query and value projections; "all linear layers" is the more thorough, heavier option. Start minimal and expand only if quality demands it.

Dropout: light regularization

LoRA dropout randomly zeroes some of the adapter's activations during training, a mild regularizer that helps avoid overfitting on small datasets. A small value like 0.05 is typical; raise it slightly if you see overfitting, lower it toward 0 for larger datasets.

QLoRA: LoRA on a quantized base

QLoRA shrinks memory further by quantizing the frozen base model — storing its weights in 4 bits instead of 16 — while still training a normal LoRA adapter on top. Since the base is frozen anyway, quantizing it costs little quality but roughly quarters the memory the weights occupy, letting you fine-tune a model that wouldn't otherwise fit. The trade is slightly slower steps and a small quality risk. Reach for QLoRA when memory is the binding constraint.

Sensible starting point

r = 16, alpha = 32, dropout 0.05, target the attention projections, full-precision base if it fits (QLoRA if not). Change one thing at a time and measure on the gold set.

These knobs spend memory. To set them — and the batch size — without crashing, you need to estimate the GPU memory a run will use, which is the next lesson.

Key terms

LoRA rank (r): The adapter's inner dimension / capacity; higher = more room (and more overfit risk). Common: 16.
LoRA alpha: Scales the adapter's contribution (~alpha/r); often set to 2×r.
Target modules: Which weight matrices get adapters (commonly the attention query/value projections).
LoRA dropout: Mild regularization zeroing some adapter activations during training (e.g. 0.05).
QLoRA: LoRA on a 4-bit quantized frozen base, to fit fine-tuning into far less memory.
Quantization: Storing weights in fewer bits (e.g. 4) to save memory.

Check yourself

Answers are saved to this browser.

Progress is stored locally in your browser.

Rank (r): the adapter's capacity

Alpha: how strongly the adapter speaks

Target modules: where to put adapters

Dropout: light regularization

QLoRA: LoRA on a quantized base

Key terms

Check yourself

Related lessons