Managing Batch Jobs

This chapter covers how to manage existing jobs in Lepton. For information on creating a new job, refer to Creating a Lepton Job.

Listing Jobs

To list all existing jobs, use the following command:

lep job list

This command returns details for each job, including the job name, creation time, and status (e.g., Starting, Ready, Updating).

Checking Job Status

To view the details of a specific job, run the following command, replacing job-id with the ID of the job:

lep job get -i job-id

This command returns detailed information about the job, including its metadata and current status.
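
To follow a job while it runs, the command above can be re-run on an interval. The following is a minimal sketch, not a feature of the Lepton CLI: the function name watch_job and its interval and count parameters are hypothetical, and nothing is assumed about the output format of lep job get.

```shell
# Sketch: re-run `lep job get` a fixed number of times, pausing between runs.
# watch_job, INTERVAL_SECONDS, and COUNT are illustrative names, not Lepton options.
watch_job() {
    local job_id="$1"
    local interval="${2:-30}"   # seconds between checks (assumed default)
    local count="${3:-10}"      # number of checks (assumed default)
    local i=1

    if [ -z "$job_id" ]; then
        echo "usage: watch_job JOB_ID [INTERVAL_SECONDS] [COUNT]" >&2
        return 2
    fi

    while [ "$i" -le "$count" ]; do
        echo "--- check $i of $count ---"
        # Print whatever the CLI reports; the output is not parsed.
        lep job get -i "$job_id"
        # Sleep between checks, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, watch_job job-id 60 5 would print the job details five times, one minute apart.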

Viewing Replicas

To view the replica IDs of a job, use:

lep job replicas -i job-id

Viewing Logs

To monitor a job's current state or troubleshoot it, view its logs with:

lep job log -i job-id

This command retrieves the logs for all replicas of the job. To view the logs for a specific replica, pass its replica ID with -r:

lep job log -i job-id -r mypy-0-w925p
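
When several replicas need to be compared, it can help to save each replica's logs to its own file. The sketch below assumes the replica IDs are passed in by hand (for example, copied from the output of lep job replicas) and that lep job log prints the logs and then exits. The function name save_replica_logs and the file naming scheme are hypothetical.

```shell
# Sketch: write the logs of each listed replica to JOB_ID-REPLICA_ID.log
# in the current directory. save_replica_logs is an illustrative helper.
save_replica_logs() {
    if [ "$#" -lt 2 ]; then
        echo "usage: save_replica_logs JOB_ID REPLICA_ID [REPLICA_ID ...]" >&2
        return 2
    fi

    local job_id="$1"
    shift

    local replica_id
    for replica_id in "$@"; do
        # Same command as above, with stdout and stderr captured to a file.
        lep job log -i "$job_id" -r "$replica_id" > "${job_id}-${replica_id}.log" 2>&1
        echo "wrote ${job_id}-${replica_id}.log"
    done
}
```

For example, save_replica_logs job-id mypy-0-w925p would create the file job-id-mypy-0-w925p.log.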

Viewing Events

To review the history of events for a job, such as failures and other significant actions, run:

lep job events -i job-id

This command returns event details, including event types, reasons, associated replica IDs, and timestamps.
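
When troubleshooting, the commands in this chapter can be combined to capture a snapshot of a job in one step. The following is a rough sketch under stated assumptions: it uses only the lep job subcommands shown above, assumes each one prints its output and exits, and invents the function name collect_job_diagnostics and the output file layout for illustration.

```shell
# Sketch: save the output of get, replicas, log, and events for one job
# into a directory, one file per subcommand. Names here are illustrative.
collect_job_diagnostics() {
    local job_id="$1"
    local out_dir="${2:-./job-diagnostics-$job_id}"
    local sub

    if [ -z "$job_id" ]; then
        echo "usage: collect_job_diagnostics JOB_ID [OUT_DIR]" >&2
        return 2
    fi

    mkdir -p "$out_dir" || return 1

    for sub in get replicas log events; do
        # A failure in one subcommand is reported but does not stop the rest.
        lep job "$sub" -i "$job_id" > "$out_dir/$sub.txt" 2>&1 \
            || echo "warning: 'lep job $sub' failed for $job_id" >&2
    done

    # Print the directory so callers can locate the files.
    echo "$out_dir"
}
```

For example, collect_job_diagnostics job-id ./snapshot would write get.txt, replicas.txt, log.txt, and events.txt under ./snapshot.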

Lepton AI

© 2024