Support Scheduling Priority and Preemption #763

Open
abstractmj opened this issue Nov 7, 2023 · 4 comments
Labels: kind/feature (New feature or request)

Comments

@abstractmj
Contributor

What would you like to be added:

Support Scheduling Priority and Preemption

Why is this needed:

To further improve the packing rate and resource utilization of a cluster, we usually divide tasks into high-priority and low-priority tasks and deploy them together. When the cluster is idle, low-priority tasks can make full use of its resources; when high-priority tasks need resources, they can preempt them from low-priority tasks. In a single Kubernetes cluster, the community provides the pod priority feature to express the scheduling priority of pods, and kube-scheduler implements priority-based preemption of pods.

When we move the application deployment process to a multi-cluster level, such scenarios still exist. However, in the current Clusternet multi-cluster application management mechanism, no similar capabilities are provided.

The entire requirement can be divided into two parts: "scheduling priority" and "preemptive scheduling".

"Scheduling priority" implementation:

  • Add a scheduling priority field to the subscription.
  • Add a priority queue to the scheduler. When the scheduler observes a subscription change, it puts the subscription into the priority queue and dequeues subscriptions in priority order.
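
The two items above could be sketched roughly as follows. This is only an illustration, not Clusternet's actual scheduler queue: the `queuedSubscription` type, its `priority` field, and the `PriorityQueue` wrapper are all hypothetical names, and the sketch assumes that a higher number means higher priority and that ties are broken by arrival order. It is built on the standard `container/heap` package and is not safe for concurrent use.

```go
package main

import (
	"container/heap"
	"fmt"
)

// queuedSubscription is a hypothetical stand-in for a subscription that is
// waiting to be scheduled. It is not a Clusternet type.
type queuedSubscription struct {
	key      string // e.g. "namespace/name"
	priority int32  // assumed priority field; higher values are scheduled first
	seq      uint64 // arrival order, used only to break ties
}

// subscriptionHeap implements heap.Interface.
type subscriptionHeap []*queuedSubscription

func (h subscriptionHeap) Len() int { return len(h) }

func (h subscriptionHeap) Less(i, j int) bool {
	if h[i].priority != h[j].priority {
		return h[i].priority > h[j].priority // higher priority first
	}
	return h[i].seq < h[j].seq // earlier arrival first
}

func (h subscriptionHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }

func (h *subscriptionHeap) Push(x any) { *h = append(*h, x.(*queuedSubscription)) }

func (h *subscriptionHeap) Pop() any {
	old := *h
	n := len(old)
	item := old[n-1]
	old[n-1] = nil // drop the reference so it can be garbage collected
	*h = old[:n-1]
	return item
}

// PriorityQueue is a minimal, non-thread-safe wrapper around the heap.
type PriorityQueue struct {
	items subscriptionHeap
	next  uint64
}

// Add enqueues a subscription key with the given priority.
func (q *PriorityQueue) Add(key string, priority int32) {
	heap.Push(&q.items, &queuedSubscription{key: key, priority: priority, seq: q.next})
	q.next++
}

// Pop returns the highest-priority key, or false if the queue is empty.
func (q *PriorityQueue) Pop() (string, bool) {
	if q.items.Len() == 0 {
		return "", false
	}
	return heap.Pop(&q.items).(*queuedSubscription).key, true
}

func main() {
	q := &PriorityQueue{}
	q.Add("default/batch-job", 0)
	q.Add("default/online-service", 100)
	q.Add("default/another-batch", 0)

	for {
		key, ok := q.Pop()
		if !ok {
			break
		}
		fmt.Println(key)
	}
}
```

A real scheduler queue would also need locking, de-duplication of keys, and backoff handling for unschedulable subscriptions, none of which is shown here.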

"Preemptive scheduling" implementation:

There are two implementation ideas for "preemptive scheduling":

Idea 1: Implement subscription preemption.
  • Step 1: Reuse the "scheduling priority" implementation.
  • Step 2: Extend the scheduler framework that clusternet-scheduler already provides with preemption logic.
Idea 2: Rely on single-cluster preemption logic.
  • Step 1: Make the clusternet-scheduler's resource prediction process aware of pod priority. This requires adding pod priority to the scheduling information in feedinventory.
  • Step 2: Enhance the cross-cluster rescheduling capability for low-priority pods.
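
To make Idea 1 a bit more concrete, here is a minimal sketch of how a preemption step might pick victims within one candidate cluster. Everything in it is an assumption for illustration: the `placement` type, the `selectVictims` function, and the use of milli-CPU as the only resource are hypothetical and are not part of the clusternet-scheduler framework. The policy shown (evict the lowest priority first, and among equal priorities the largest consumer first) is just one possible choice.

```go
package main

import (
	"fmt"
	"sort"
)

// placement is a hypothetical record of what an already-scheduled
// subscription consumes in one member cluster.
type placement struct {
	key      string
	priority int32
	cpuMilli int64
}

// selectVictims picks lower-priority placements to evict so that a request of
// needMilli fits, given freeMilli currently available. It returns
// (nil, true) if no preemption is needed and (nil, false) if evicting every
// lower-priority placement would still not free enough capacity.
func selectVictims(placed []placement, preemptorPriority int32, needMilli, freeMilli int64) ([]placement, bool) {
	if needMilli <= freeMilli {
		return nil, true
	}

	// Only strictly lower-priority placements may be preempted.
	candidates := make([]placement, 0, len(placed))
	for _, p := range placed {
		if p.priority < preemptorPriority {
			candidates = append(candidates, p)
		}
	}

	// Lowest priority first; among equals, largest first to keep the victim set small.
	sort.SliceStable(candidates, func(i, j int) bool {
		if candidates[i].priority != candidates[j].priority {
			return candidates[i].priority < candidates[j].priority
		}
		return candidates[i].cpuMilli > candidates[j].cpuMilli
	})

	var victims []placement
	for _, c := range candidates {
		if freeMilli >= needMilli {
			break
		}
		victims = append(victims, c)
		freeMilli += c.cpuMilli
	}
	if freeMilli < needMilli {
		return nil, false
	}
	return victims, true
}

func main() {
	placed := []placement{
		{key: "default/batch-a", priority: 0, cpuMilli: 2000},
		{key: "default/batch-b", priority: 10, cpuMilli: 1000},
		{key: "default/online", priority: 100, cpuMilli: 3000},
	}
	victims, ok := selectVictims(placed, 50, 2500, 500)
	fmt.Println(ok, victims)
}
```

In a full design, the evicted subscriptions would presumably be put back into the priority queue for rescheduling, and the choice of cluster would weigh the cost of preemption across all candidate clusters.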

btw: I prefer idea 1

abstractmj added the kind/feature (New feature or request) label Nov 7, 2023
@dixudx
Member

dixudx commented Nov 8, 2023

@abstractmj Nice proposal. This is exactly what we've planned to do.

  • For the scheduling priority implementation, the design looks good to me. I'd suggest splitting it into smaller tasks, such as adding the API, adding the priority queue, etc.

  • For preemptive scheduling, I agree with you on the first design.

    The second one looks more like workload re-balancing. Since v0.16.0, Clusternet has had a feature gate, FailOver, that helps migrate workloads from not-ready clusters to healthy spare clusters. I think we could build an enhancement on top of the FailOver feature to support such workload re-balancing.

@abstractmj
Contributor Author

Thank you for your suggestion.

As you mentioned, the second design indeed seems more like a rebalance process. I haven't learned much about FailOver before, so I will study it to see if it can replace the descheduler logic I implemented previously.

Following the idea of the first design, the priority scheduling is relatively simple. I think it can be divided into the following two tasks: (1) add a priority field in the subscription, and (2) add a priority queue in the scheduler. On this basis, we can further extend the priority queue and scheduler plugin to implement preemptive scheduling logic. If possible, I can provide a more detailed design of the preemptive scheduling logic.

@dixudx
Member

dixudx commented Nov 8, 2023

@abstractmj Looks good to me. Please go ahead.

@dixudx
Member

dixudx commented Nov 17, 2023


2 participants