Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

assign -998 as the oom_score_adj for critical pods (e.g. kube-proxy) #39114

Merged
merged 2 commits into from
Dec 22, 2016

Conversation

dchen1107
Copy link
Member

@dchen1107 dchen1107 commented Dec 22, 2016

I also validated this with a testing cluster: Fresh built cluster, and kill kube-proxy pod, etc.

root      2660  2643  0 Dec21 ?        00:00:00 /bin/sh -c kube-proxy --master=https://104.198.79.64 --kubeconfig=/var/lib/kube-proxy/kubeconfig  --cluster-cidr=10.180.0.0/14 --resource-container="" --v=4   1>>/var/log/kube-proxy.log 2>&1
root      2667  2660  0 Dec21 ?        00:03:14 kube-proxy --master=https://104.198.79.64 --kubeconfig=/var/lib/kube-proxy/kubeconfig --cluster-cidr=10.180.0.0/14 --resource-container= --v=4
# cat /proc/2660/oom_score_adj 
-998
# cat /proc/2667/oom_score_adj 
-998

In this pr, I also include a small fix for import cycle issue. The right fix should remove the dependency on qos package from pkg/apis/componentconfig/v1alpha1. But since we plan to cherrypick this pr to both 1.5 and 1.4 (possible), I want touch the source as little as possible.

Partial fix: #38322

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 22, 2016
@k8s-reviewable
Copy link

This change is Reviewable

@dchen1107 dchen1107 requested a review from bprashanth December 22, 2016 00:17
@k8s-ci-robot
Copy link
Contributor

Jenkins Bazel Build failed for commit 33e8dc1ee7b346ead1970cf8e48224833cad77ff. Full PR test history.

The magic incantation to run this job again is @k8s-bot bazel test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@dchen1107 dchen1107 removed the request for review from bprashanth December 22, 2016 00:17
@dchen1107 dchen1107 added this to the v1.5 milestone Dec 22, 2016
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Dec 22, 2016
import cycle not allowed in test
package k8s.io/kubernetes/pkg/client/restclient (test)
	imports k8s.io/kubernetes/pkg/api/testapi
	imports k8s.io/kubernetes/pkg/apis/componentconfig/install
	imports k8s.io/kubernetes/pkg/apis/componentconfig/v1alpha1
	imports k8s.io/kubernetes/pkg/kubelet/qos
	imports k8s.io/kubernetes/pkg/kubelet/pod
	imports k8s.io/kubernetes/pkg/client/clientset_generated/clientset
	imports k8s.io/kubernetes/pkg/client/clientset_generated/clientset/typed/apps/v1beta1
	imports k8s.io/kubernetes/pkg/client/restclient
@dchen1107 dchen1107 added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Dec 22, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins verification failed for commit 33e8dc1ee7b346ead1970cf8e48224833cad77ff. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@bprashanth bprashanth added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 22, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 39114, 36004)

@k8s-github-robot k8s-github-robot merged commit 66152b9 into kubernetes:master Dec 22, 2016
saad-ali added a commit that referenced this pull request Jan 10, 2017
…836-#39114-#39059-upstream-release-1.5

Automated cherry pick of #38836 #39114 #39059 upstream release 1.5
jessfraz added a commit that referenced this pull request Jan 11, 2017
…836-#39114-#39059-upstream-release-1.4

Automated cherry pick of #38836 #39114 #39059 upstream release 1.4
@derekwaynecarr
Copy link
Member

This seems aggressive as it can elevate any pod with the annotation to almost guaranteed qos behavior. When we enforce memory limits at qos tier, it does give this pod even more precedent within a tier even if other pods have not used up to their scheduling request. This just gives way too much power imho. How are we restricting the abuse of this annotation? Can we at minimum restrict this annotations enforcement to pods in a particular namespace? And pods potentially whose namespace has been annotated as enabling this function by an operator? This has bad downstream security impacts for us in OpenShift and would ask we revisit this to reduce scope.

derekwaynecarr added a commit to derekwaynecarr/kubernetes that referenced this pull request Jan 17, 2017
@saad-ali saad-ali added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jan 20, 2017
@saad-ali saad-ali removed cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cherrypick-candidate labels Feb 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory pressure shouldn't hard evict static pods
7 participants