Reinforcement learning (RL) support is coming soon. To request early access, please contact us at https://www.oumi.ai/contact.
What is fine-tuning?
Fine-tuning builds on a model that has already been trained, adapting it to perform better on specific tasks. Rather than starting from randomly initialized weights, you begin with a model that has learned general language patterns from large-scale data and refine it for your particular use case.

In a typical Oumi workflow, training is driven by evidence. You start with evaluation and failure-mode analysis, identify targeted areas for improvement, curate or synthesize focused data, and then fine-tune. Each training run is intentional and designed to produce measurable, incremental performance gains.

This approach allows you to adapt the model to:

- A specific domain
- A task (e.g., classification, instruction following)
- A style or response format
- Known failure modes identified during evaluation
Supervised fine-tuning (SFT)
Oumi supports supervised fine-tuning (SFT), the most common approach for adapting LLMs. In SFT, the model learns to make desired responses more likely given specific prompts. For example:

Prompt: “What is the capital of France?”
Target response: “Paris.”

During training, the model adjusts its parameters to increase the probability of generating the target response when given that prompt. SFT works best when your training dataset is:
- High-quality
- Cleanly formatted
- Closely aligned with your target use case
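At training time, an SFT example is typically tokenized into a single sequence with the prompt positions masked out of the loss, so the model is only penalized on the target response. The sketch below illustrates this masking; the toy `tokenize()` helper and the `-100` ignore-index convention (used by common Hugging Face-style trainers) are illustrative assumptions, not Oumi-specific APIs.

```python
IGNORE_INDEX = -100  # loss-masking convention used by common training frameworks


def tokenize(text):
    # Toy "tokenizer": one integer id per whitespace-separated word.
    # A real tokenizer would produce subword ids.
    return [abs(hash(w)) % 50_000 for w in text.split()]


def build_sft_example(prompt, response):
    prompt_ids = tokenize(prompt)
    response_ids = tokenize(response)
    input_ids = prompt_ids + response_ids
    # Loss is computed only on response tokens: prompt positions are
    # masked so the model is not trained to regenerate the prompt.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}


example = build_sft_example("What is the capital of France?", "Paris.")
# example["labels"] starts with masked prompt positions, then the response ids.
```

The key property is that `input_ids` and `labels` have the same length, with every prompt position in `labels` replaced by the ignore index.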
Full-weight vs. parameter-efficient training
You can use Oumi to fine-tune a model using either full-weight fine-tuning or parameter-efficient training.

Full-Weight Fine-Tuning (FFT)
FFT updates the model’s parameters directly, often all of them. This approach offers maximum flexibility and capacity for change, but requires more compute and memory. Choose FFT when:

- You need significant behavioral shifts
- You have sufficient compute resources
- You want maximal adaptation
Parameter-Efficient Fine-Tuning (PEFT)
PEFT freezes the original model weights and instead trains a smaller set of additional parameters that modify the model’s behavior. This approach:

- Uses less compute
- Is faster to train
- Produces smaller artifacts
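To make the compute difference concrete, here is rough parameter-count arithmetic contrasting full updates of one weight matrix with a LoRA-style low-rank adapter (a common PEFT method). The hidden size and rank are illustrative values, not Oumi defaults.

```python
def full_params(d_in, d_out):
    # Full fine-tuning updates every entry of the d_in x d_out weight matrix.
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    # LoRA freezes W and trains two low-rank factors A (d_in x rank) and
    # B (rank x d_out), so the effective update is W + A @ B.
    return d_in * rank + rank * d_out


d = 4096  # hidden size typical of a 7B-class transformer layer
full = full_params(d, d)          # 16,777,216 trainable values
lora = lora_params(d, d, rank=8)  # 65,536 trainable values
print(f"LoRA trains {lora / full:.2%} of the full matrix")
# prints "LoRA trains 0.39% of the full matrix"
```

Because only the small factors are trained and saved, the resulting adapter artifact is a tiny fraction of a full checkpoint.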
Training workflow
Once datasets are prepared, your Oumi training workflow will typically entail:

- Configuring a training run (choosing a base model, training method, and datasets)
- Launching the run and monitoring progress
- Evaluating the resulting model
- Diagnosing failure modes and iterating
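As a sketch, the configuration step above is typically expressed as a YAML file passed to the Oumi CLI. The field names and values below are illustrative assumptions; consult the Oumi configuration reference for the authoritative schema.

```yaml
# Illustrative training config sketch -- field names are assumptions,
# not Oumi's authoritative schema.
model:
  model_name: "meta-llama/Llama-3.1-8B-Instruct"  # example base model

data:
  train:
    datasets:
      - dataset_name: "yahma/alpaca-cleaned"  # example SFT dataset

training:
  trainer_type: "TRL_SFT"   # supervised fine-tuning
  use_peft: true            # train a LoRA-style adapter instead of full weights
  output_dir: "output/my-sft-run"
```

A run configured this way would then be launched from the command line (for example, with something like `oumi train -c my_config.yaml`), after which you evaluate the resulting model and iterate.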