Reinforcement learning (RL) support is coming soon. To request early access, please contact us at https://www.oumi.ai/contact.
What is fine-tuning?
Fine-tuning builds on a model that has already been trained, adapting it to perform better on specific tasks. Rather than starting from randomly initialized weights, you begin with a model that has learned general language patterns from large-scale data and refine it for your particular use case.

In a typical Oumi workflow, training is driven by evidence. You start with evaluation and failure-mode analysis, identify targeted areas for improvement, curate or synthesize focused data, and then fine-tune. Each training run is intentional and designed to produce measurable, incremental performance gains.

This approach allows you to adapt the model to:

- A specific domain
- A task (e.g., classification, instruction following)
- A style or response format
- Known failure modes identified during evaluation
Supervised fine-tuning (SFT)
Oumi supports supervised fine-tuning (SFT), the most common approach for adapting LLMs. In SFT, the model learns to make desired responses more likely given specific prompts. For example:

Prompt: “What is the capital of France?”
Target response: “Paris.”

During training, the model adjusts its parameters to increase the probability of generating the target response when given that prompt. SFT works best when your training dataset is:
- High-quality
- Cleanly formatted
- Closely aligned with your target use case
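At training time, an SFT example is typically tokenized into a single sequence with the prompt positions masked out of the loss, so the model is only penalized on the target response. The sketch below illustrates this masking; the toy `tokenize()` helper and the `-100` ignore-index convention (used by common Hugging Face-style trainers) are illustrative assumptions, not Oumi-specific APIs.

```python
IGNORE_INDEX = -100  # loss-masking convention used by common training frameworks


def tokenize(text):
    # Toy "tokenizer": one integer id per whitespace-separated word.
    # A real tokenizer would produce subword ids.
    return [abs(hash(w)) % 50_000 for w in text.split()]


def build_sft_example(prompt, response):
    prompt_ids = tokenize(prompt)
    response_ids = tokenize(response)
    input_ids = prompt_ids + response_ids
    # Loss is computed only on response tokens: prompt positions are
    # masked so the model is not trained to regenerate the prompt.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}


example = build_sft_example("What is the capital of France?", "Paris.")
# example["labels"] starts with masked prompt positions, then the response ids.
```

The key property is that `input_ids` and `labels` have the same length, with every prompt position in `labels` replaced by the ignore index.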
Full-weight vs. parameter-efficient training
You can use Oumi to fine-tune a model using either full-weight fine-tuning or parameter-efficient training.

Full-Weight Fine-Tuning (FFT)
FFT updates the model’s parameters directly, often all of them. This approach offers maximum flexibility and capacity for change, but requires more compute and memory. Choose FFT when:

- You need significant behavioral shifts
- You have sufficient compute resources
- You want maximal adaptation
Parameter-Efficient Fine-Tuning (PEFT)
PEFT freezes the original model weights and instead trains a smaller set of additional parameters that modify the model’s behavior. This approach:

- Uses less compute
- Is faster to train
- Produces smaller artifacts
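To make the compute difference concrete, here is rough parameter-count arithmetic contrasting full updates of one weight matrix with a LoRA-style low-rank adapter (a common PEFT method). The hidden size and rank are illustrative values, not Oumi defaults.

```python
def full_params(d_in, d_out):
    # Full fine-tuning updates every entry of the d_in x d_out weight matrix.
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    # LoRA freezes W and trains two low-rank factors A (d_in x rank) and
    # B (rank x d_out), so the effective update is W + A @ B.
    return d_in * rank + rank * d_out


d = 4096  # hidden size typical of a 7B-class transformer layer
full = full_params(d, d)          # 16,777,216 trainable values
lora = lora_params(d, d, rank=8)  # 65,536 trainable values
print(f"LoRA trains {lora / full:.2%} of the full matrix")
# prints "LoRA trains 0.39% of the full matrix"
```

Because only the small factors are trained and saved, the resulting adapter artifact is a tiny fraction of a full checkpoint.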
Training workflow
Once datasets are prepared, your Oumi training workflow will typically entail:

- Configuring a training run (choosing a base model, training method, and datasets)
- Launching the run and monitoring progress
- Evaluating the resulting model
- Diagnosing failure modes and iterating
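As a sketch, the configuration step above is typically expressed as a YAML file passed to the Oumi CLI. The field names and values below are illustrative assumptions; consult the Oumi configuration reference for the authoritative schema.

```yaml
# Illustrative training config sketch -- field names are assumptions,
# not Oumi's authoritative schema.
model:
  model_name: "meta-llama/Llama-3.1-8B-Instruct"  # example base model

data:
  train:
    datasets:
      - dataset_name: "yahma/alpaca-cleaned"  # example SFT dataset

training:
  trainer_type: "TRL_SFT"   # supervised fine-tuning
  use_peft: true            # train a LoRA-style adapter instead of full weights
  output_dir: "output/my-sft-run"
```

A run configured this way would then be launched from the command line (for example, with something like `oumi train -c my_config.yaml`), after which you evaluate the resulting model and iterate.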