-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRA: structured parameters #123516
DRA: structured parameters #123516
Conversation
Skipping CI for Draft Pull Request. |
16b2228
to
76c97b5
Compare
This adds support for semantic version comparison to the CEL support in the "named resources" structured parameter model. For example, it can be used to check that an instance supports a certain API level. To minimize the risk, the new "semver" type is only defined in the CEL environment for DRA expressions, not in the base library. See kubernetes#123664 for a PR which adds it to the base library. Validation of semver strings is done with the regular expression from semver.org. The actual evaluation at runtime then uses semver/v4.
While currently those objects only get published by the kubelet for node-local resources, this could change once we also support network-attached resources. Dropping the "Node" prefix enables such a future extension. The NodeName in ResourceSlice and StructuredResourceHandle then becomes optional. The kubelet still needs to provide one and it must match its own node name, otherwise it doesn"t have permission to access ResourceSlice objects.
This should better run with multiple nodes, it"s more realistic that way.
Storing a modified claim with allocation and the original resource version in the assume cache was not reliable: if an update was received, it replaced the modified claim and the resource that was reserved for the claim might have been used for some other claim. To fix this, the in-flight claims are now stored in the map instead of just a boolean and the status stored there overrides whatever is in the assume cache. Logging got extended to diagnose this problem better. It started to occur in E2E tests after splitting the claim update so that first the finalizer is set and then the status, because setting the finalizer triggered an update.
This finishes the shuffling around of test scenarios so that all of them which make sense with structured parameters are also executed with those.
There are two approaches for making new versioned CEL features available in the release where they get introduced: - Always use the environment for "StoredExpressions". - Use an older version (typically 1.0) and only bump it up later. The second approach was used before, so this is now also done here.
9b453d1
to
6a361e1
Compare
I"ll approve for API, who hold the kubelet LGTM? /approve |
@pohly: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@klueska is okay with merging, see #123516 (comment). The wording isn"t exactly LGTM, but I know that he doesn"t want any further changes in this PR as far as kubelet is concerned. |
but that wasn"t sufficient:
Can one of the @kubernetes/milestone-maintainers add the milestone? The exception has been granted. |
/milestone v1.30 |
LGTM label has been added. Git tree hash: df4d9cba0ea326b3ede6b6bc49a86e4fdcc62ca0
|
/skip |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, dims, klueska, pohly, soltysh, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
"resource": "resourceslices", | ||
"responseKind": { | ||
"group": "", | ||
"kind": "ResourceSlice", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we update the KEP to capture the name change and other deviations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of PR is this?
/kind feature
What this PR does / why we need it:
"Structured parameters" are a new approach for handling scheduling and autoscaling with dynamic resource allocation: drivers publish information in a format defined and understood by Kubernetes, the scheduler handles the claim allocation without communicating with some DRA driver controller. The Cluster Autoscaler can use the information to simulate scale up and down.
Which issue(s) this PR fixes:
Related-to: kubernetes/enhancements#4381
Special notes for your reviewer:
To keep the PR manageable, the new functionality gets introduced in stages with one commit building on top of the previous:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: