Creating a job

A job corresponds to a one-off task that runs to completion and then stops. This page goes through the basics of creating a job in Lepton, and then covers the configurable options available when creating a job: environment variables, secrets, file system mounts and more.

The Basics

You can create a job from either the CLI or the Dashboard. This page goes through the steps to create a job from the CLI.

To create a job from a container image, we can use lep job create:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --command "echo 1; sleep 5" \
    --resource-shape "cpu.small"

This creates a job called mypy from the default/lepton:photon-py3.11-runner-0.21.0 container image and runs a simple bash command on a cpu.small instance, which has one core and 4GB of memory. The job is created with the default configuration: a single worker and no environment variables or secrets.

You can also specify the job in a JSON file and pass that file to lep job create with the -f flag:

lep job create -n mypy \
    -f ./lep_job_spec_mypy.json

where the JSON specification looks like this:

{
  "resource_shape": "cpu.small",
  "container": {
    "image": "default/lepton:photon-py3.11-runner-0.21.0",
    "command": [
      "/bin/bash",
      "-c",
      "echo 1; sleep 5"
    ]
  }
}
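It can help to confirm that the spec file is valid JSON before submitting it. The following is a minimal sketch, not part of the Lepton tooling: it writes the specification above to ./lep_job_spec_mypy.json, checks it with python3 -m json.tool if python3 happens to be installed locally (an assumption), and then runs the same lep job create -f command shown earlier only when the lep CLI is found on the machine.

```shell
#!/bin/sh
set -eu

SPEC=./lep_job_spec_mypy.json

# Write the job specification shown above to a file.
cat > "$SPEC" <<'EOF'
{
  "resource_shape": "cpu.small",
  "container": {
    "image": "default/lepton:photon-py3.11-runner-0.21.0",
    "command": ["/bin/bash", "-c", "echo 1; sleep 5"]
  }
}
EOF

# Assumption: python3 may be available locally. json.tool exits with a
# non-zero status on invalid JSON, which stops the script because of set -e.
if command -v python3 > /dev/null 2>&1; then
    python3 -m json.tool "$SPEC" > /dev/null
    echo "valid JSON: $SPEC"
fi

# Submit the job only when the lep CLI is installed on this machine.
if command -v lep > /dev/null 2>&1; then
    lep job create -n mypy -f "$SPEC"
fi
```

Keeping the spec in a file also makes it easy to put the job definition under version control.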

After the job is created, you can view its status with:

lep job get -n mypy

Configurable Options

Number of workers

The --num-workers flag sets the number of workers to use for the job.

To create a job with 4 workers:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --command "echo 1; sleep 5" \
    --resource-shape "cpu.small" \
    --num-workers 4

Resource shapes

Resource shapes are the instance types that the job will be running on. The resource shape is specified with the --resource-shape flag:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small

The common resource shapes are:

cpu.small, cpu.medium, cpu.large, gpu.a10, gpu.h100-sxm, gpu.2xh100-sxm, gpu.4xh100-sxm, gpu.8xh100-sxm

This is not a complete list. Enterprise users may have access to more resource shapes; contact Lepton support for more information.

Environment variables and secrets

Environment variables are key-value pairs that are passed to the job. They will be automatically set as environment variables in the job container, so the runtime can refer to them as needed.

To pass environment variables to a job, you can use the --env flag with the lep job create command. For example, to pass the environment variable "MYKEY1" with value "MYVALUE1", and "MYKEY2" with value "MYVALUE2", you can use:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --env MYKEY1=MYVALUE1 \
    --env MYKEY2=MYVALUE2 \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small

You can repeatedly use the --env flag to pass multiple environment variables.
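When there are many variables, the repeated --env flags can be generated from a file instead of typed by hand. The sketch below assumes a hypothetical file ./job.env with one KEY=VALUE pair per line (the file name and format are illustrative, not a Lepton convention). It only uses the flags shown above and prints the resulting command rather than running it.

```shell
#!/bin/sh
set -eu

# Hypothetical input file with one KEY=VALUE pair per line.
ENV_FILE=./job.env
printf 'MYKEY1=MYVALUE1\nMYKEY2=MYVALUE2\n' > "$ENV_FILE"

# Start from the fixed arguments used in the examples above.
set -- job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small

# Append one --env flag per line, skipping blank lines and # comments.
while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
        ''|'#'*) continue ;;
    esac
    set -- "$@" --env "$line"
done < "$ENV_FILE"

# Print the command for review. The printed form loses shell quoting,
# so to actually run it, call: lep "$@"
echo lep "$@"
```

Building the argument list with set -- keeps each value as a single argument, so values containing spaces are passed through intact when the command is run as lep "$@".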

Secret values are similar to environment variables, but their values are pre-stored in the platform so that they are not exposed in the job environment. For example, you might want to keep your Hugging Face Hub token as a secret on the Lepton platform. You can do this via:

lep secret create --name HF_TOKEN --value <your-huggingface-hub-token>

After this, you can pass in the secret value to the job using the --secret flag:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --secret HF_TOKEN \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small

You can also store multiple secret values, and specify which secret value to use with the --secret flag like the following:

lep secret create --name ALICE_HF_TOKEN --value <alice-s-huggingface-hub-token>
lep secret create --name BOB_HF_TOKEN --value <bob-s-huggingface-hub-token>
lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small \
    --secret HF_TOKEN=ALICE_HF_TOKEN  # use Alice's token

Inside the job, the secret value is available as an environment variable. The variable is named after the secret itself, or after the name on the left of the = sign when the --secret NAME=SECRET_NAME form is used. In both cases above, the job sees the variable HF_TOKEN, set to the corresponding Hugging Face token value you stored.
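Since the secret arrives as an ordinary environment variable, the job's own script can check for it before doing any work. The following is a minimal sketch of such a check, meant to run inside the job container; the helper name require_secret is hypothetical and not part of Lepton.

```shell
#!/bin/sh

# Fail with a clear message if the named environment variable is empty or unset.
# $1 is the variable name, e.g. HF_TOKEN.
require_secret() {
    # Indirect lookup of the variable whose name is in $1 (POSIX sh has no ${!name}).
    eval "_secret_value=\${$1:-}"
    if [ -z "$_secret_value" ]; then
        echo "missing secret: $1" >&2
        return 1
    fi
    # Report only the length so the value itself is never written to the logs.
    echo "$1 is set (${#_secret_value} characters)"
}

# Warn, but keep going, when the secret was not passed to the job.
require_secret HF_TOKEN || echo "pass --secret HF_TOKEN when creating the job" >&2
```

Printing the length rather than the value is a deliberate choice here: it confirms that the secret was passed without copying it into the job output.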

Predefined environment variables: Environment variables that you define should not start with the prefix LEPTON_, as this prefix is reserved for predefined environment variables. The following environment variables are predefined and will be available in the job:

  • LEPTON_JOB_NAME: The name of the job
  • LEPTON_RESOURCE_ACCELERATOR_TYPE: The resource accelerator type of the job
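The reserved prefix can be checked locally before any --env flags are passed. The sketch below is illustrative only: the helper is_allowed_env and the variable LEPTON_DEBUG are hypothetical names invented for the example. The last lines show how a script inside the job might read the two predefined variables, with fallbacks that apply only when the script runs outside Lepton.

```shell
#!/bin/sh

# Return 0 if a KEY=VALUE pair has a name that may be passed with --env,
# and 1 if the name starts with the reserved LEPTON_ prefix.
is_allowed_env() {
    case "${1%%=*}" in      # keep only the part before the first "="
        LEPTON_*) return 1 ;;
        *) return 0 ;;
    esac
}

# LEPTON_DEBUG is a made-up name used to show a rejected pair.
for pair in MYKEY1=MYVALUE1 LEPTON_DEBUG=1; do
    if is_allowed_env "$pair"; then
        echo "ok: $pair"
    else
        echo "rejected (reserved prefix): $pair" >&2
    fi
done

# Inside the job, the predefined variables are read like any other variable.
echo "job name: ${LEPTON_JOB_NAME:-not set}"
echo "accelerator: ${LEPTON_RESOURCE_ACCELERATOR_TYPE:-not set}"
```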

File system mount

When you launch a job, you can mount a file system to the job. Lepton provides a serverless file system that is mounted to the job like a local POSIX file system and behaves much like an NFS volume. The file system is useful for storing data files and models that are not included in the job image, or for persisting files across jobs. To read more about the file system specifics, check out the File System documentation.

To mount a file system to a job, you can use the --mount flag with the lep job create command:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --mount /:/leptonfs \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small

This mounts the root of the Lepton file system (/) to the job, making it accessible at /leptonfs in the job container. You can operate on the file system as if it were a local file system, and the files are persisted across jobs.

Make sure that you are not mounting the file system onto system folders in the job, such as /etc, /usr, or /var. Also make sure that the mounted path does not already exist in the container image. Both cases may cause conflicts with the guest operating system. Lepton makes a best effort to prevent you from mounting the file system to these folders, but we recommend that you double-check the mounted path.
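The first of these checks can be sketched as a small local helper that runs before lep job create. This is an illustration under stated assumptions, not Lepton's own validation: the helper name is_safe_mount_target is hypothetical, and the list of rejected paths covers only the three folders named above plus the container root.

```shell
#!/bin/sh

# Return 1 if the container-side mount path is the root, or is (or sits under)
# one of the system folders named above; return 0 for other absolute paths.
# Assumption: this list is illustrative, not Lepton's actual rule set.
is_safe_mount_target() {
    case "$1" in
        /|/etc|/etc/*|/usr|/usr/*|/var|/var/*) return 1 ;;
        /*) return 0 ;;
        *) return 1 ;;      # require an absolute path
    esac
}

MOUNT="/:/leptonfs"
target="${MOUNT#*:}"        # part after the first colon, here /leptonfs

if is_safe_mount_target "$target"; then
    echo "mount target looks fine: $target"
else
    echo "refusing to mount onto $target" >&2
fi
```

Whether the target path already exists in the container image cannot be determined from the path alone; that has to be checked against the image itself.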

Other supported configurations

Detailed descriptions of these options can be viewed by running lep job create -h or by referring to the Lepton CLI documentation.

  • --container-port
  • --max-failure-retry
  • --max-job-failure-retry
  • --image-pull-secrets
  • --intra-job-communication
  • --ttl-seconds-after-finished
  • --log-collection

Advanced Topics

Node groups

Enterprise users who have reserved resources on Lepton can specify the node group where the job will be launched. This is done with the --node-group flag:

lep job create --name mypy \
    --container-image default/lepton:photon-py3.11-runner-0.21.0 \
    --node-group mynodegroup \
    --command "echo 1; sleep 5" \
    --resource-shape cpu.small

where mynodegroup is a node group your resources are reserved on. The job will be launched on the resources of the node group.

Examples

For job creation, job failure diagnosis and so on, you can refer to the following examples:

  • Distributed training with PyTorch
  • Job Failure Diagnose
  • Running jobs with conda environment