Overview

A training recipe captures a complete model training configuration (base model, datasets, training method, and hyperparameters) as a reusable JSON template. Running a recipe produces a consistent, reproducible training job without requiring you to reconfigure inputs each time. Training recipes are saved from the Oumi Builder and accessed from the Recipes page. For details on saving and running a recipe from the UI, see Training Recipes. For the full schema reference, see Model Training Recipe Schema.
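Every recipe shares the same top-level shape: a model block, a data block, a training block, and a peft block. A minimal sketch, using only fields that appear in the full patterns below (the placeholder model name and dataset ID are illustrative):

```json
{
  "model": {
    "modelName": "meta-llama/Llama-3.1-8B-Instruct",
    "torchDtype": "bfloat16"
  },
  "data": {
    "train": {
      "datasets": [{ "datasetId": 101 }]
    }
  },
  "training": {
    "trainerType": "sft",
    "numTrainEpochs": 3,
    "learningRate": 0.0002,
    "saveFinalModel": true,
    "runName": "my-first-recipe"
  },
  "peft": {
    "usePeft": true,
    "peftMethod": "lora"
  }
}
```

The patterns that follow extend this skeleton with validation/test splits and additional training options.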

Common recipe patterns

Supervised fine-tuning with LoRA (PEFT)

The most common starting point. LoRA reduces GPU memory requirements and trains faster than full fine-tuning, making it ideal for rapid iteration.
{
  "model": {
    "modelName": "meta-llama/Llama-3.1-8B-Instruct",
    "torchDtype": "bfloat16",
    "modelMaxLength": 4096
  },
  "data": {
    "train": {
      "datasets": [{ "datasetId": 101 }]
    },
    "validation": {
      "datasets": [{ "datasetId": 102 }]
    }
  },
  "training": {
    "trainerType": "sft",
    "numTrainEpochs": 3,
    "learningRate": 0.0002,
    "lrSchedulerType": "cosine",
    "warmupRatio": 0.05,
    "maxGradNorm": 1.0,
    "mixedPrecisionDtype": "bf16",
    "enableGradientCheckpointing": true,
    "evalStrategy": "epoch",
    "saveFinalModel": true,
    "runName": "llama3-8b-lora-v1"
  },
  "peft": {
    "usePeft": true,
    "peftMethod": "lora"
  }
}
When to use: Most fine-tuning tasks, especially when iterating quickly or working with limited GPU memory.
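The example above relies on the schema's default LoRA hyperparameters. If you need to tune rank or scaling, the peft block may accept explicit fields; the names below (loraR, loraAlpha, loraDropout) are assumptions extrapolated from the naming style of usePeft and peftMethod — verify them against the Model Training Recipe Schema reference before relying on them:

```json
{
  "peft": {
    "usePeft": true,
    "peftMethod": "lora",
    "loraR": 16,
    "loraAlpha": 32,
    "loraDropout": 0.05
  }
}
```

A higher rank increases adapter capacity at the cost of memory; the alpha scaling value is commonly set to 1–2× the rank.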

Full-weight fine-tuning (FFT)

Updates all model parameters. Use it when you need deep behavioral changes and have sufficient compute available.
{
  "model": {
    "modelName": "Qwen/Qwen3-8B",
    "torchDtype": "bfloat16",
    "modelMaxLength": 8192,
    "attnImplementation": "flash_attention_2"
  },
  "data": {
    "train": {
      "datasets": [{ "datasetId": 201 }]
    },
    "validation": {
      "datasets": [{ "datasetId": 202 }]
    },
    "test": {
      "datasets": [{ "datasetId": 203 }]
    }
  },
  "training": {
    "trainerType": "sft",
    "numTrainEpochs": 2,
    "learningRate": 0.00005,
    "lrSchedulerType": "linear",
    "warmupSteps": 100,
    "weightDecay": 0.01,
    "mixedPrecisionDtype": "bf16",
    "enableGradientCheckpointing": true,
    "evalStrategy": "steps",
    "evalSteps": 500,
    "saveSteps": 500,
    "saveFinalModel": true,
    "enableWandb": true,
    "runName": "qwen3-8b-fft-v1"
  },
  "peft": {
    "usePeft": false,
    "peftMethod": "lora"
  }
}
When to use: When LoRA quality is insufficient, or you’re making substantial domain adaptation changes.

On-policy distillation

Trains a smaller student model guided by a stronger teacher. Requires trainerType: "opd". See On-Policy Distillation for configuration details.
{
  "model": {
    "modelName": "HuggingFaceTB/SmolLM2-1.7B-Instruct",
    "torchDtype": "bfloat16",
    "modelMaxLength": 2048
  },
  "data": {
    "train": {
      "datasets": [{ "datasetId": 301 }]
    }
  },
  "training": {
    "trainerType": "opd",
    "numTrainEpochs": 3,
    "learningRate": 0.0001,
    "lrSchedulerType": "cosine",
    "warmupRatio": 0.1,
    "mixedPrecisionDtype": "bf16",
    "saveFinalModel": true,
    "runName": "smollm-distilled-v1"
  },
  "peft": {
    "usePeft": true,
    "peftMethod": "lora"
  }
}
When to use: When you want a smaller, faster model that approximates a larger teacher’s behavior on a specific task.

Tips

  • Start with LoRA: switch to FFT only if LoRA quality is insufficient for your task.
  • Use evalStrategy: "epoch" for small datasets; use "steps" with a reasonable evalSteps for large ones.
  • Set runName: descriptive names make it easier to compare runs in the activity log.
  • Use inferenceSeed for reproducible results across recipe runs.
  • Enable enableGradientCheckpointing when GPU memory is constrained; it slows training slightly but allows larger batch sizes.
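Applied together, the tips above might yield a training block like this for a large dataset. Every field except inferenceSeed appears in the patterns earlier on this page; placing inferenceSeed inside the training block is an assumption, so confirm its location in the Model Training Recipe Schema reference:

```json
{
  "training": {
    "trainerType": "sft",
    "numTrainEpochs": 2,
    "learningRate": 0.0002,
    "evalStrategy": "steps",
    "evalSteps": 500,
    "enableGradientCheckpointing": true,
    "saveFinalModel": true,
    "runName": "llama3-8b-lora-large-ds-v2",
    "inferenceSeed": 42
  }
}
```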