Module 14

Parameter-Efficient Fine-Tuning (PEFT)

Part IV: Training & Adapting LLMs

Chapter Overview

Full fine-tuning of a 7B-parameter model requires roughly 14 GB of memory just for the weights in FP16, and once gradients and Adam optimizer states are added, a standard mixed-precision accounting (about 16 bytes per parameter) pushes the total past 100 GB. For most practitioners, this puts full fine-tuning out of reach without expensive multi-GPU setups. Parameter-efficient fine-tuning (PEFT) methods solve this problem by training only a tiny fraction of the parameters (often less than 1%) while achieving quality that rivals full fine-tuning.
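This memory arithmetic is easy to check with a back-of-envelope calculation. The sketch below assumes a common mixed-precision Adam accounting: 2 bytes each for FP16 weights and gradients, plus 12 bytes of FP32 optimizer state (a master copy of the weights and two Adam moments) per parameter. Actual frameworks vary, so treat this as an estimate, not an exact figure.

```python
# Back-of-envelope memory estimate for full fine-tuning with Adam in
# mixed precision. Assumed accounting per parameter: 2 B FP16 weights,
# 2 B FP16 gradients, 12 B FP32 state (master weights + two moments).
def full_ft_memory_gb(n_params: float) -> dict:
    GB = 1e9  # decimal gigabytes
    return {
        "weights_fp16": n_params * 2 / GB,
        "gradients_fp16": n_params * 2 / GB,
        "optimizer_fp32": n_params * 12 / GB,
        "total": n_params * 16 / GB,
    }

est = full_ft_memory_gb(7e9)
print(f"weights: {est['weights_fp16']:.0f} GB, total: {est['total']:.0f} GB")
# For 7B parameters: 14 GB of weights, ~112 GB total before activations.
```

Activation memory comes on top of this and depends on batch size and sequence length, which is why even the 16-bytes-per-parameter figure understates the real requirement.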

This module covers the most important PEFT techniques in depth, starting with LoRA and QLoRA (the dominant methods in practice) and extending to newer approaches like DoRA, LoRA+, and adapter-based methods. You will learn not just the theory behind each method, but also how to configure hyperparameters, select target modules, and merge adapters for efficient deployment.
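To see where the "less than 1%" figure comes from, the snippet below counts trainable LoRA parameters for an illustrative Llama-7B-style model. The shapes (hidden size 4096, 32 layers, LoRA on the query and value projections, rank 8) are assumptions for the sake of the example, not values prescribed by this module.

```python
# Hedged sketch: trainable-parameter count for LoRA. Each adapted
# weight W (d_out x d_in) is frozen; only the low-rank factors
# B (d_out x r) and A (r x d_in) are trained.
def lora_trainable_params(target_shapes, r):
    return sum(r * (d_out + d_in) for d_out, d_in in target_shapes)

# Assumed Llama-7B-style shapes: q_proj and v_proj are 4096x4096,
# adapted in each of 32 layers, with rank r=8.
hidden, layers, r = 4096, 32, 8
targets = [(hidden, hidden)] * (2 * layers)
trainable = lora_trainable_params(targets, r)
print(f"{trainable:,} trainable params ({100 * trainable / 7e9:.3f}% of 7B)")
# ~4.2M trainable parameters, about 0.06% of the full model.
```

Raising the rank or adding more target modules (e.g. the MLP projections) increases this count, but even generous configurations typically stay well under 1% of the base model.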

The final section surveys the rapidly evolving ecosystem of training tools, from Unsloth (which reports roughly 2x speedups with substantially lower memory use) to configuration-driven training frameworks like Axolotl and LLaMA-Factory. By the end of this module, you will be able to fine-tune most open-weight models on a single consumer GPU.

Learning Objectives

Prerequisites

Sections