EXAMPLES

By interacting with the Oumi Agent, you can quickly create, evaluate, and train models, regardless of where you are in the machine learning lifecycle. The examples below show some of the tasks you can accomplish using simple natural language prompts.

Check out the Prompt Library for more examples of what you can do with the Oumi Agent.

TRAIN MODELS

Train a support ticket classifier

Fine-tune a model to automatically categorize customer support questions by topic and urgency.

Fine-tune a model to classify customer support questions by topic and urgency

Train an intent detection model

Fine-tune Qwen to identify the intent behind incoming customer support messages.

Fine-tune Qwen to detect the intent behind incoming customer support messages

Train a fraud detection model

Train a compact model for fraud detection using on-policy distillation techniques.

Train a compact model by distilling a large fraud detection model’s responses into a smaller model using on-policy distillation

BUILD & ANALYZE DATASETS

Create support tickets by urgency & topic

Create a structured dataset of customer support tickets labeled by both urgency and topic to support classification tasks.

Create a labeled dataset of customer support tickets categorized by urgency and topic
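As a rough illustration of what such a labeled dataset might contain, the sketch below builds a few records in Python and serializes them as JSON Lines. The field names (`text`, `topic`, `urgency`), the label values, and the JSONL layout are assumptions made for illustration only; they are not necessarily the format the Oumi Agent produces.

```python
import json
from collections import Counter

# Hypothetical label sets; a real project would define its own.
TOPICS = {"billing", "shipping", "account"}
URGENCIES = {"low", "medium", "high"}

# Hypothetical example records, each labeled by both topic and urgency.
tickets = [
    {"text": "I was charged twice for my order.", "topic": "billing", "urgency": "high"},
    {"text": "How do I change my email address?", "topic": "account", "urgency": "low"},
    {"text": "My package is a week late.", "topic": "shipping", "urgency": "medium"},
]


def validate(record):
    """Return True if the record has non-empty text and known labels."""
    return (
        bool(record.get("text"))
        and record.get("topic") in TOPICS
        and record.get("urgency") in URGENCIES
    )


def to_jsonl(records):
    """Serialize the valid records as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records if validate(r))


jsonl_text = to_jsonl(tickets)

# Count how many tickets fall into each (topic, urgency) pair,
# which helps spot under-represented label combinations.
label_counts = Counter((t["topic"], t["urgency"]) for t in tickets)
```

Counting label pairs like this is one simple way to check whether a generated dataset covers every combination of topic and urgency you care about.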

Generate Q&A pairs from documentation

Create 500 question-answer pairs from a product documentation PDF for training or evaluation use.

Generate 500 question-answer pairs from a product documentation PDF

Augment an existing dataset

Increase dataset size and diversity by generating new samples that follow the same style and format.

Expand my dataset by generating new samples that match the style and format of my existing examples

Expand dataset for evaluation coverage

Improve evaluation robustness by generating additional samples consistent with your existing dataset.

Expand my dataset by generating additional samples consistent with my existing dataset to improve my evaluation

Identify gaps in coding datasets

Analyze your dataset and generate new tasks with detailed solutions to fill gaps in coverage.

Analyze my coding dataset and generate additional tasks with step-by-step solution breakdowns to fill gaps

EVALUATE MODELS

Create general-purpose evaluators

Define evaluators that score model outputs based on key criteria like helpfulness and accuracy.

Create evaluators to score my model's responses for helpfulness and accuracy

Evaluate customer support response quality

Build a targeted evaluator to assess how well your model performs on customer support interactions.

Create an evaluator to measure my model's response quality on customer support tickets

Analyze model failure patterns

Run evaluations to uncover common weaknesses and failure modes in your model’s responses.

Evaluate Qwen on a customer support dataset and surface the most common failure patterns
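One way to think about surfacing common failure patterns is as a tally over per-example evaluation results. The sketch below assumes each result has already been tagged with a hypothetical `failure` label (or `None` when the response passed); the labels and the record shape are illustrative assumptions, not output from the Oumi Agent.

```python
from collections import Counter

# Hypothetical per-example evaluation results. "failure" is None for a
# passing response, otherwise a short label describing what went wrong.
results = [
    {"id": 1, "failure": None},
    {"id": 2, "failure": "wrong_topic"},
    {"id": 3, "failure": "missed_urgency"},
    {"id": 4, "failure": "wrong_topic"},
    {"id": 5, "failure": None},
]


def top_failures(results, n=3):
    """Return the n most common failure labels with their counts."""
    counts = Counter(r["failure"] for r in results if r["failure"] is not None)
    return counts.most_common(n)


def failure_rate(results):
    """Return the fraction of results that carry a failure label."""
    if not results:
        return 0.0
    failed = sum(1 for r in results if r["failure"] is not None)
    return failed / len(results)


patterns = top_failures(results)
rate = failure_rate(results)
```

Ranking failure labels by frequency shows which weaknesses to address first, for example by adding training samples that target the most common label.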

Test sentiment analysis performance

Evaluate model performance on sentiment classification using real-world product review data.

Test Qwen on a sentiment analysis task using customer food product reviews

WHAT’S NEXT

Build your first model: dive into the Quickstart to create your first custom machine learning model in Oumi.
