The on-demand price is $0.50 per million tokens.
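As a rough illustration of what that price means per request, the sketch below converts a token count into an estimated cost. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes prompt and completion tokens are billed at the same rate, which the page does not state.

```python
# Price taken from the page: $0.5 per million tokens.
PRICE_PER_MILLION_TOKENS_USD = 0.5


def estimate_cost_usd(total_tokens: int) -> float:
    """Estimate the on-demand cost in USD for a given number of tokens.

    Assumption: every token (prompt or completion) is billed at the same rate.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


if __name__ == "__main__":
    # Example: a workload of 250,000 tokens.
    print(f"${estimate_cost_usd(250_000):.4f}")
```

If the endpoint returns usage data in its response (an assumption; check `response.usage`), the reported total token count could be passed to this function directly.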

Call the endpoint with the following code, which reads your API token from the LEPTON_API_TOKEN environment variable.

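import os

import openai

# Read the API token from the environment rather than hard-coding it.
api_token = os.environ.get("LEPTON_API_TOKEN")

# Point the OpenAI client at the model's OpenAI-compatible endpoint.
client = openai.OpenAI(
    base_url="https://dolphin-mixtral-8x7b.lepton.run/api/v1/",
    api_key=api_token,
)

# The prompt uses ChatML markers: a user turn, then the start of the assistant turn.
response = client.completions.create(
    model="dolphin-mixtral-8x7b",
    prompt="<|im_start|>user\n# Python\ndef fibonacci(n):<|im_end|>\n<|im_start|>assistant",
)

# Print the full completion response object.
print(response)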

The rate limit for Serverless Endpoints is 10 requests per minute across all models under the Basic Plan. If you need a higher rate limit with an SLA, please upgrade to the Standard Plan or use a dedicated deployment.
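One way to stay under a limit like this is to throttle requests on the client side. The sketch below is a minimal sliding-window limiter. The class name `SlidingWindowLimiter` is hypothetical, and the sketch assumes the limit is enforced over a rolling 60-second window, which the page does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window.

    The `clock` and `sleep` parameters are injectable so the limiter can be
    tested without real waiting.
    """

    def __init__(self, max_calls=10, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed, record it, and return seconds waited."""
        waited = 0.0
        while True:
            now = self.clock()
            # Forget calls that have fallen out of the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            delay = self.period - (now - self.calls[0])
            self.sleep(delay)
            waited += delay
```

To use it, create one limiter shared by all requests (the limit applies across all models) and call `limiter.acquire()` immediately before each API call. This only reduces the chance of hitting the limit; the server remains the authority, so rate-limit errors should still be handled.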

Lepton AI

© 2024