Built-in & custom Evaluators
Oumi includes built-in evaluators (such as instruction following, safety, topic adherence, and truthfulness) to help you quickly establish baselines and gather early feedback. You can review, edit, and reuse these evaluators across evaluations, or create custom ones using the Builder to define the exact inputs your judge should consider. Alternatively, you can describe the evaluator you want in natural language to the Oumi Agent: specify the scoring criteria, select the evaluator model, and include additional dataset fields for context as needed.
Custom evaluators are reusable. Each one should focus on a single, clearly defined property so that its results stay consistent and reliable.
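To make the parts of a custom evaluator concrete, here is a minimal Python sketch of what such a definition might hold: one scoring criterion, a judge model, and the dataset fields the judge is allowed to see. This is not Oumi's API. The `EvaluatorSpec` class, its field names, and the `example-judge-model` identifier are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluatorSpec:
    """Hypothetical container for a single-property evaluator (not an Oumi class)."""

    name: str
    criteria: str  # one clearly defined property to score
    judge_model: str  # placeholder identifier for the evaluator model
    input_fields: tuple[str, ...] = ("request", "response")  # fields shown to the judge

    def render_prompt(self, row: dict[str, str]) -> str:
        """Build the judge prompt from only the declared input fields."""
        missing = [f for f in self.input_fields if f not in row]
        if missing:
            raise KeyError(f"row is missing fields: {missing}")
        context = "\n".join(f"{f}: {row[f]}" for f in self.input_fields)
        return f"Criteria: {self.criteria}\n{context}\nAnswer with PASS or FAIL."


# Usage: one evaluator, one property, plus an extra dataset field for context.
topic_adherence = EvaluatorSpec(
    name="topic_adherence",
    criteria="The response stays on the topic named in the 'topic' field.",
    judge_model="example-judge-model",  # assumed name, not a real model
    input_fields=("topic", "request", "response"),
)

prompt = topic_adherence.render_prompt(
    {"topic": "billing", "request": "Why was I charged twice?", "response": "..."}
)
```

Keeping the criterion to a single property and listing the input fields explicitly means the same definition can be reused across datasets without the judge seeing columns it was never meant to consider.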
What’s next
Defining Evaluators
Establish criteria for measuring model performance
Evaluator Recipes
Save and reuse evaluator configurations