Data synthesis recipes provide a quick starting point for creating datasets tailored to specific configurations or use cases. Each recipe defines how to generate a custom dataset, making it easy to reuse and adapt across projects.

What is a data synthesis recipe?

A data synthesis recipe is a reusable configuration that tells the synthesis engine how to generate data. Rather than writing static examples by hand, you define structured rules that describe:
  • What should vary across samples
  • How examples should be generated
  • Which model should perform generation
  • How many samples to create
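To make the four parts of a recipe concrete, here is a minimal Python sketch of how such a configuration could be represented. It is an illustration only, not the product's actual recipe schema: the class name `SynthesisRecipe`, its field names, the `validate` helper, and the model name `example-model` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SynthesisRecipe:
    """Hypothetical in-memory shape of a recipe (not the real schema)."""

    attributes: dict[str, list[str]]  # what should vary across samples
    template: str                     # how examples should be generated
    model: str                        # which model performs generation
    num_samples: int                  # how many samples to create

    def validate(self) -> None:
        """Reject recipes that could not produce meaningful samples."""
        if self.num_samples <= 0:
            raise ValueError("num_samples must be positive")
        for name, values in self.attributes.items():
            if not values:
                raise ValueError(f"attribute {name!r} has no values")
            if "{" + name + "}" not in self.template:
                raise ValueError(f"template never uses attribute {name!r}")


recipe = SynthesisRecipe(
    attributes={
        "topic": ["billing", "shipping"],
        "tone": ["polite", "frustrated"],
    },
    template="Write a {tone} customer message about {topic}.",
    model="example-model",  # placeholder name, an assumption
    num_samples=100,
)
recipe.validate()  # raises ValueError if the recipe is inconsistent
```

Keeping the four parts as separate fields mirrors the list above: you can change the sample count or swap the model without touching the attributes or the template.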
You can think of a recipe as a data generation blueprint. Attributes define the ingredients, templates define the instructions, and the synthesis engine produces consistent, scalable outputs by systematically combining them. This approach allows you to:
  • Generate large, diverse datasets from compact specifications
  • Systematically target failure modes identified during evaluation
  • Maintain consistency across generated examples
  • Iterate quickly by adjusting attributes instead of rewriting data
By separating structure (attributes and templates) from scale (number of samples), recipes make synthetic data generation controllable, reproducible, and aligned with measurable training goals.
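The separation of structure from scale can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the synthesis engine's real behavior: the function `synthesize_prompts` is hypothetical, it only fills templates and never calls a model, and it assumes attributes are combined as a shuffled Cartesian product with a fixed seed for reproducibility.

```python
import itertools
import random


def synthesize_prompts(attributes, template, num_samples, seed=0):
    """Fill the template with attribute combinations (hypothetical helper).

    Structure comes from `attributes` and `template`; scale comes from
    `num_samples`. A fixed seed makes the output reproducible.
    """
    names = sorted(attributes)
    # Every combination of attribute values, in a deterministic order.
    combos = list(itertools.product(*(attributes[n] for n in names)))
    random.Random(seed).shuffle(combos)

    prompts = []
    for i in range(num_samples):
        # Cycle through combinations so each is covered evenly.
        combo = combos[i % len(combos)]
        prompts.append(template.format(**dict(zip(names, combo))))
    return prompts


attributes = {
    "topic": ["billing", "shipping"],
    "tone": ["polite", "frustrated"],
}
template = "Write a {tone} customer message about {topic}."

# Same structure, different scale: only num_samples changes.
small = synthesize_prompts(attributes, template, num_samples=4)
large = synthesize_prompts(attributes, template, num_samples=40)
```

In this sketch, adding a new value to `tone` changes the set of possible prompts without any change to the template, which is the kind of quick iteration the recipe approach is meant to support.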

Accessing & executing data synthesis recipes

You can access and manage all your data synthesis recipes in one location:
  1. Go to the Recipes page.
  2. Filter for data synthesis recipes by selecting Synthesis in the Type drop-down menu.
  3. Click the recipe name to open it in the Builder.
  4. Click the Execute button to run the recipe and start the synthesis job.
Your newly synthesized dataset will appear under datasets once the job is finished.