Oumi makes it easy for you to fine-tune a wide range of open-weight LLMs. The platform supports fine-tuning in both full-weight and parameter-efficient configurations.
Reinforcement learning (RL) support is coming soon. To request early access, please contact us at https://www.oumi.ai/contact.

What is fine-tuning?

Fine-tuning builds on a model that has already been trained, adapting it to perform better on specific tasks. Rather than starting from randomly initialized weights, you begin with a model that has learned general language patterns from large-scale data and refine it for your particular use case.

In a typical Oumi workflow, training is driven by evidence. You start with evaluation and failure mode analysis, identify targeted areas for improvement, curate or synthesize focused data, and then fine-tune. Each training run is intentional and designed to produce measurable, incremental performance gains. This approach allows you to adapt the model to:
  • A specific domain
  • A task (e.g., classification, instruction following)
  • A style or response format
  • Known failure modes identified during evaluation
Rather than teaching the model language from scratch, you are guiding and improving its behavior. Oumi provides all the tools and workflows to support this process.

Supervised fine-tuning (SFT)

Oumi supports supervised fine-tuning (SFT), the most common approach for adapting LLMs. In SFT, the model learns to make desired responses more likely given specific prompts. For example:
  • Prompt: “What is the capital of France?”
  • Target response: “Paris.”
During training, the model adjusts its parameters to increase the probability of generating the target response when given that prompt. SFT works best when your training dataset is:
  • High-quality
  • Cleanly formatted
  • Closely aligned with your target use case
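To make the prompt/response pairing concrete, here is a minimal sketch of one SFT training record in the widely used chat "messages" format. The field names (`messages`, `role`, `content`) follow a common community convention and are illustrative; check your dataset documentation for the exact schema your pipeline expects.

```python
import json

# One SFT example: a user prompt paired with the target response.
record = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "Paris."},
    ]
}

# SFT datasets are often stored as JSON Lines: one record per line.
line = json.dumps(record)
print(line)
```

During training, only the assistant turn is typically treated as the target; the user turn provides the conditioning context.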

Full-weight vs. parameter-efficient training

You can use Oumi to fine-tune a model using either full-weight fine-tuning or parameter-efficient training.

Full-Weight Fine-Tuning (FFT)

FFT updates the model’s parameters directly, often all of them. This approach offers maximum flexibility and capacity for change but requires more compute and memory. Choose FFT when:
  • You need significant behavioral shifts
  • You have sufficient compute resources
  • You want maximal adaptation

Parameter-Efficient Fine-Tuning (PEFT)

PEFT freezes the original model weights and instead trains a smaller set of additional parameters that modify the model’s behavior. This approach:
  • Uses less compute
  • Is faster to train
  • Produces smaller artifacts
A common method for PEFT is low-rank adaptation (LoRA), which introduces lightweight parameter updates that represent the difference from the frozen base model. PEFT is ideal for rapid iteration and resource-constrained environments.
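A back-of-the-envelope comparison makes the efficiency gain concrete. For a single weight matrix, LoRA trains two small low-rank factors instead of the full matrix; the dimensions and rank below are illustrative, not tied to any specific model.

```python
# Trainable parameters for one weight matrix:
# full fine-tuning updates the whole (d_out x d_in) matrix,
# while LoRA trains A (r x d_in) and B (d_out x r) with small rank r.

d_in, d_out = 4096, 4096   # e.g., one attention projection in a mid-size LLM
rank = 8                   # LoRA rank r (commonly in the 4-64 range)

full_params = d_in * d_out            # every weight is trainable
lora_params = rank * (d_in + d_out)   # only the low-rank factors are trainable

print(f"full: {full_params:,}  lora: {lora_params:,}  "
      f"ratio: {lora_params / full_params:.4%}")
# For these dimensions, LoRA trains well under 1% of the parameters.
```

The same ratio holds per adapted matrix across the model, which is why LoRA artifacts are small enough to store and swap cheaply.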

Training workflow

Once datasets are prepared, your Oumi training workflow will typically entail:
  1. Configuring a training run (choosing a base model, training method, and datasets)
  2. Launching the run and monitoring progress
  3. Evaluating the resulting model
  4. Diagnosing failure modes and iterating
After iterating through the workflow and reaching satisfactory model performance, you can export the trained model and deploy it into production. Bear in mind that training is part of a continuous improvement loop driven by evaluation and targeted data refinement, rather than a one-time event.
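The loop above can be sketched in a few lines. The functions below (`train`, `evaluate`, `refine_data`) are hypothetical placeholders standing in for the real steps, not Oumi APIs; the point is the control flow: train, evaluate, refine the data, and repeat until the model meets your bar.

```python
# Toy stand-ins for the real workflow steps (hypothetical, for illustration).

def train(dataset):
    # Placeholder: model "quality" grows with the amount of targeted data.
    return {"quality": min(1.0, 0.5 + 0.1 * len(dataset))}

def evaluate(model):
    # Placeholder: returns a single evaluation score in [0, 1].
    return model["quality"]

def refine_data(dataset):
    # Placeholder for failure-mode analysis: add targeted examples.
    return dataset + [f"targeted_example_{len(dataset)}"]

dataset = ["seed_example_1", "seed_example_2"]
target_score = 0.9

for iteration in range(10):
    model = train(dataset)            # 1-2: configure and launch a run
    score = evaluate(model)           # 3: evaluate the resulting model
    if score >= target_score:
        break                         # good enough: export and deploy
    dataset = refine_data(dataset)    # 4: diagnose failures and iterate
```

In practice each step is a real training or evaluation job, but the loop structure is the same: evaluation results decide whether to ship or to curate more data and train again.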