Tasks
The following tasks show you how to perform operations in Kueue, organized by user persona: batch administrator, batch user, serving user, and platform developer.
Batch administrator
A batch administrator manages the cluster infrastructure and establishes quotas and queues.
As a batch administrator, you can learn how to:
- Set up role-based access control to Kueue objects.
- Administer cluster quotas with ClusterQueues and LocalQueues.
- Set up All-or-nothing with ready Pods.
- Monitor pending workloads.
- Run a Kueue managed Job with a custom WorkloadPriority.
- Set up a MultiKueue environment.
Batch user
A batch user runs workloads. Typical batch users include researchers, AI/ML engineers, and data scientists.
As a batch user, you can learn how to:
- Run a Kueue managed batch/Job.
- Run a Kueue managed Flux MiniCluster.
- Run a Kueue managed Kubeflow Job. Kueue supports MPIJob v2beta1, PyTorchJob, TFJob, XGBoostJob, PaddleJob, and MXJob.
- Run a Kueue managed KubeRay RayJob.
- Run a Kueue managed KubeRay RayCluster.
- Submit Kueue jobs from Python.
- Run a Kueue managed plain Pod.
- Run a Kueue managed JobSet.
Serving user
A serving user runs serving workloads, for example, to expose a trained AI/ML model for inference.
As a serving user, you can learn how to:
Platform developer
A platform developer integrates Kueue with other software, contributes to Kueue, or both.
As a platform developer, you can learn how to:
- Integrate a custom Job with Kueue.
- Enable pprof endpoints.
- Develop a custom AdmissionCheck Controller.
Troubleshooting
Sometimes things go wrong. You can follow the Troubleshooting guides to understand the state of the system.