Implement CRI metrics collection #7796
Hi @Mo-Fatah. Thanks for your PR. I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7796      +/-   ##
==========================================
- Coverage   49.68%   49.63%   -0.06%
==========================================
  Files         153      153
  Lines       16826    16884      +58
==========================================
+ Hits         8360     8380      +20
- Misses       7423     7459      +36
- Partials     1043     1045       +2
c1d5799 to e8ac4da
/ok-to-test
}

func sandboxBaseLabelValues(sb *sandbox.Sandbox) []string {
	// TODO FIXME: image?
(I wrote this TODO) What does cadvisor report as the image for the POD container?
I tried to look it up but couldn't find exactly what they use as the image name for POD containers. They seem to use the image name from the container info (here and here), but I'm not sure how they differentiate between a container and a pod, or exactly which image name they use for pods.
My suggestion would be to use the sandbox name defined in the sandbox metadata (e.g. here).
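To make the suggestion above concrete, here is a minimal sketch of how a sandbox's base label values could be assembled with the sandbox name standing in for the image. The `sandboxInfo` type, its fields, the label order, and the sample values are all hypothetical placeholders for illustration; they are not CRI-O's real `sandbox.Sandbox` API, and the thread does not confirm that this is what the PR ended up doing.

```go
package main

import "fmt"

// sandboxInfo is a hypothetical stand-in for the parts of a sandbox that
// matter for labels; it is NOT the real sandbox.Sandbox type.
type sandboxInfo struct {
	ID        string
	Name      string // sandbox name taken from the sandbox metadata
	Namespace string
}

// baseLabelValues returns label values in a fixed (assumed) order:
// id, name, namespace, image. The sandbox name is reused as the image
// value, following the suggestion in the review thread.
func baseLabelValues(sb sandboxInfo) []string {
	return []string{sb.ID, sb.Name, sb.Namespace, sb.Name}
}

func main() {
	// Sample values are made up for the example.
	sb := sandboxInfo{ID: "abc123", Name: "pod-a", Namespace: "default"}
	fmt.Println(baseLabelValues(sb))
}
```

Keeping the value order fixed matters because Prometheus-style label values are matched positionally against the label names declared for the metric.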
Great work here, @Mo-Fatah! I am leaning towards dropping the optimizations where we replace a slice in place. I think there will be an advantage to doing so in the future, but let's keep it simple to start. I would also love to see some unit and integration tests here to verify the contents. If you don't have the bandwidth to add those, let us know!
/hold so I can do a proper review.
d6d83cb to a83d366
@haircommander I simplified the slice updates and added an integration test for the memory metrics as a starter. The test mainly reads the cgroup value and compares it to the value returned from
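The comparison described above can be sketched roughly as follows. This is not the PR's actual test: the helper names, the 5% tolerance, and the sample numbers are assumptions, and a temporary file stands in for a real cgroup v2 file such as `memory.current`. A tolerance is used here because memory usage can change between reading the cgroup file and scraping the metric.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseCgroupValue parses the single unsigned integer that cgroup files
// such as memory.current contain (the value is followed by a newline).
func parseCgroupValue(s string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(s), 10, 64)
}

// readCgroupUint reads and parses one cgroup value file.
func readCgroupUint(path string) (uint64, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseCgroupValue(string(data))
}

// withinTolerance reports whether got differs from want by at most
// tol*want, where tol is a fraction such as 0.05.
func withinTolerance(got, want uint64, tol float64) bool {
	diff := float64(got) - float64(want)
	if diff < 0 {
		diff = -diff
	}
	return diff <= tol*float64(want)
}

func main() {
	// A temporary file stands in for the real cgroup file.
	dir, err := os.MkdirTemp("", "cgroup-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "memory.current")
	if err := os.WriteFile(path, []byte("104857600\n"), 0o644); err != nil {
		panic(err)
	}

	fromCgroup, err := readCgroupUint(path)
	if err != nil {
		panic(err)
	}

	// Hypothetical value, as if scraped from the metrics endpoint.
	reported := uint64(104000000)
	fmt.Println("within tolerance:", withinTolerance(reported, fromCgroup, 0.05))
}
```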
@haircommander: Overrode contexts on behalf of haircommander: ci/prow/ci-e2e-evented-pleg In response to this:
/retest
/retest
- Add collection_period and included_pod_metrics to StatsConfig and shell completions Signed-off-by: Mohamed Abdelfatah <[email protected]> Co-authored-by: Sohan Kunkerkar <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]> Co-authored-by: Sohan Kunkerkar <[email protected]>
- Update StatsServer to collect sandbox metrics in the update cycle Signed-off-by: Mohamed Abdelfatah <[email protected]> Co-authored-by: Sohan Kunkerkar <[email protected]>
- typo fix Signed-off-by: Mohamed Abdelfatah <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]>
- Metrics tests improvement - Clean up Signed-off-by: Mohamed Abdelfatah <[email protected]>
- internal/stats: Use the internal logging package - Tiny refactoring for better error handling - lint, shfmt, shellcheck Signed-off-by: Mohamed Abdelfatah <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]>
Signed-off-by: Mohamed Abdelfatah <[email protected]>
/lgtm 🎉
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: haircommander, kwilczynski, Mo-Fatah, saschagrunert The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest
/test e2e-gcp-ovn
/retest
/override ci/prow/e2e-gcp-ovn
@haircommander: Overrode contexts on behalf of haircommander: ci/prow/e2e-gcp-ovn In response to this:
What type of PR is this?
/kind api-change
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Related to https://issues.redhat.com/browse/OCPNODE-1008
Special notes for your reviewer:
This is a continuation of Sohan's effort in #7256, based on the recent cgroup manager changes in #7658.
Does this PR introduce a user-facing change?