Run an Evaluation
In the Builder, click Judge-Based Evaluation. On the INPUTS tab, provide the following information:
- Model - A hosted or custom model for evaluation.
- Evaluators - One or more evaluators to score model outputs.
- Dataset - The dataset to evaluate against.
- Failure Mode Analysis (optional) - Whether to generate failure modes automatically.
- Inference Configurations (optional) - Inference parameters such as Temperature, Max Tokens, Seed, and Requests Per Minute.
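To make the required and optional inputs easier to see at a glance, the fields above can be sketched as a small data structure. This is an illustration only, not the product's API: the class names `EvaluationInputs` and `InferenceConfig`, the method `missing_required`, the field names, and the sample values are all hypothetical, and the assumption that optional fields default to "off" or "not set" is not confirmed by this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InferenceConfig:
    """Hypothetical container for the optional inference parameters."""
    temperature: Optional[float] = None        # None means "not set"
    max_tokens: Optional[int] = None
    seed: Optional[int] = None
    requests_per_minute: Optional[int] = None


@dataclass
class EvaluationInputs:
    """Hypothetical mirror of the fields on the INPUTS tab."""
    model: str                                  # hosted or custom model
    evaluators: List[str]                       # one or more evaluators
    dataset: str                                # dataset to evaluate against
    failure_mode_analysis: bool = False         # optional (default assumed)
    inference: InferenceConfig = field(default_factory=InferenceConfig)

    def missing_required(self) -> List[str]:
        """Return the names of required inputs that are still empty."""
        missing = []
        if not self.model:
            missing.append("Model")
        if not self.evaluators:
            missing.append("Evaluators")
        if not self.dataset:
            missing.append("Dataset")
        return missing


# Example with placeholder values: all three required inputs are present.
inputs = EvaluationInputs(
    model="my-model",
    evaluators=["example-judge"],
    dataset="example-dataset",
    inference=InferenceConfig(temperature=0.0, seed=42),
)
print(inputs.missing_required())
```

Only Model, Evaluators, and Dataset are treated as required here, matching the items that are not marked optional in the list.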