Llama 3.1 8B

128K context

Description

Llama 3.1 8B is best suited to environments with limited computational power and resources. The model excels at text summarization, text classification, sentiment analysis, and language translation where low-latency inference is required.

Pricing

Dedicated Endpoints: Priced by instance type and number of GPUs; see the pricing page for details. You can also contact us to reserve GPUs.
Serverless Endpoints: $0.07 per million tokens for Llama 3.1 8B, billed pay-as-you-go.
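
As a rough illustration of the serverless rate, the sketch below estimates cost from a token count. It assumes that all tokens are billed at the same $0.07 per million rate, which this page does not spell out, so check the pricing page before relying on it. The function name and the workload figure are hypothetical.

```python
# Serverless rate quoted on this page, in USD per million tokens.
PRICE_PER_MILLION_TOKENS_USD = 0.07


def estimate_serverless_cost(total_tokens: int) -> float:
    """Estimate the USD cost of processing `total_tokens` tokens.

    Assumption: every token is billed at the same flat rate.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


if __name__ == "__main__":
    # Hypothetical workload: 250 million tokens in a month.
    monthly_tokens = 250_000_000
    print(f"Estimated cost: ${estimate_serverless_cost(monthly_tokens):.2f}")
```

Under that assumption, cost scales linearly with usage, so doubling the token volume doubles the bill.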

Create a Dedicated Endpoint

In addition to serverless endpoints, Lepton provides a simple way to create a dedicated endpoint for Llama 3.1 8B: a fully managed endpoint reserved for your own use cases. If this model is what you are looking for, head over to our dashboard to create your endpoint.
