Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.oumi.ai/llms.txt

Use this file to discover all available pages before exploring further.

OVERVIEW

Once you’ve generated training data, fine-tuned a model, and evaluated its performance, the final step is deployment. Hosted inference lets you take a model trained on the Oumi Platform and serve it as a live API endpoint, making it available for real-time use.
The Deployments feature is currently in beta.
Each deployed model is assigned its own dedicated endpoint for inference, and you retain full control over its lifecycle, with the ability to create or remove deployments as needed.

ACCESSING DEPLOYMENTS

To deploy a model, you’ll first need a trained model in your project to enable hosted inference.
  • From the top of the Models page, click on the Deploy Model button; alternatively, click on the + Create Deployment button from the Deployments page.
  • On the Deploy Model modal window, select either Custom Oumi Model or External Model:

Custom Oumi Model

  • Provide a unique Deployment Name.
  • Select a Model from the drop-down.
  • Click Start → to deploy your model.

External Model

  • Provide a unique Deployment Name.
  • Select a Provider from the drop-down.
  • Select your External Model from the drop-down.
  • Insert the API key for your provider.
  • Click Start → to deploy your model.