Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add namespace and name into the CRI pod log directory #74441

Merged
merged 2 commits into from
Mar 9, 2019

Conversation

Random-Liu
Copy link
Member

@Random-Liu Random-Liu commented Feb 22, 2019

For #73503.

This PR:

  • Changed pod log directory from /var/log/pods/UID to /var/log/pods/NAMESPACE_NAME_UID.

    • I tried the solution proposed in CRI: Include pod name and namespace in the CRI pod log directory #73503 (comment). It is too complex to implement and maintain. Since we never officially support in-place upgrade, let's only do best effort support.
    • We made several changes for backward compatibility:
      • Change 1: Whenever the container log path is needed, ContainerStatus.GetLogPath() is used to get the actual log path from the container runtime. This works for both containers with old log path and new log path.
      • Change 2: Pod log directory garbage collection handles both old log path and new log path.
    • With these changes, only minor problems left:
      • When getting container log stats, only new container log path is being used, because we want to avoid extra ContainerStatus calls. So if people do in-place upgrade, the log stats for existing containers will become empty.
      • For a newly created container in an existing pod sandbox, if the container runtime uses pod log directory passed in RunPodSandbox instead of CreateContainerRequest, the container log will be in the old pod log directory. In this case:
        • The symlink from /var/log/containers to /var/log/pods/NAMESPACE_NAME_UID won't work, because the container log is actually in /var/log/pods/UID.
        • The container log stats will be empty.
        • To handle this case properly, it is recommended for the container runtime to use the newest SandboxConfig in the CreateContainerRequest instead of caching and using the one passed in RunPodSandbox. Dockershim is handling this correctly, and we'll change containerd to do this as well. /cc @mrunalp
  • Include container log inode usage as part of pod ephemeral storage for CRI stats provider. We were not doing that because cadvisor doesn't return accurate container log inode usage, but we do have that for CRI container runtimes. /cc @dashpole @jingxu97

  • Include fsstats of log files under the pod log directory. Because we don't have log files under the pod log directory today, so this won't affect any existing functionality. Some sandbox runtimes, e.g. kata and gvisor, may have some pod level logs. This change makes it possible them to fsstats of those logs as part of pod ephemeral storage.

/cc @kubernetes/sig-node-pr-reviews

Change CRI pod log directory from `/var/log/pods/UID` to `/var/log/pods/NAMESPACE_NAME_UID`.

It is recommended to drain the node before upgrade, or reboot the node after upgrade.

@Random-Liu Random-Liu added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/feature Categorizes issue or PR as related to a new feature. labels Feb 22, 2019
@Random-Liu Random-Liu added this to the v1.14 milestone Feb 22, 2019
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 22, 2019
@Random-Liu Random-Liu added kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Feb 22, 2019
@k8s-ci-robot k8s-ci-robot added area/kubelet approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 22, 2019
@k8s-ci-robot k8s-ci-robot added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Feb 23, 2019
@Random-Liu
Copy link
Member Author

/test pull-kubernetes-integration

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/gcp and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 1, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 1, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Random-Liu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/test and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 9, 2019
@Random-Liu
Copy link
Member Author

Just rebased. Reapply LGTM based on #74441 (comment)

@Random-Liu Random-Liu added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 9, 2019
@k8s-ci-robot k8s-ci-robot merged commit b150560 into kubernetes:master Mar 9, 2019
@zoetrope zoetrope mentioned this pull request Apr 8, 2019
3 tasks
@soltysh
Copy link
Contributor

soltysh commented May 17, 2019

* Changed pod log directory from `/var/log/pods/UID` to `/var/log/pods/NAMESPACE_NAME_UID`.

This introduced backwards incompatible change, given the limit on names (253 coming from

const DNS1123SubdomainMaxLength int = 253
) which applies both to namespace and name we end up with filename significantly exceeding 255 which is a fact in majority of file systems (https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits). /cc @smarterclayton @bgrant0607 @thockin
The fact this wasn't caught earlier is we're missing a test creating a pod with maximum allowed name.

tredman added a commit to honeycombio/honeycomb-kubernetes-agent that referenced this pull request Jun 12, 2019
Our agent is broken on K8S cluster versions published after this change: kubernetes/kubernetes#74441

Adding yet another log path bandaid to the list of known patterns. It works, but I'd like to find a way to stop doing this. Unfortunately the current K8S API doesn't provide a way to get the log *path* for a pod, just the log. We could start streaming k8s logs from the API, but that's a significant amount of work and we don't have time to do that right now. Here's hoping the log pattern doesn't change again in the next 6 months.
mollystamos123 pushed a commit to honeycombio/honeycomb-kubernetes-agent that referenced this pull request Jun 21, 2019
Our agent is broken on K8S cluster versions published after this change: kubernetes/kubernetes#74441

Adding yet another log path bandaid to the list of known patterns. It works, but I'd like to find a way to stop doing this. Unfortunately the current K8S API doesn't provide a way to get the log *path* for a pod, just the log. We could start streaming k8s logs from the API, but that's a significant amount of work and we don't have time to do that right now. Here's hoping the log pattern doesn't change again in the next 6 months.
@Random-Liu Random-Liu deleted the pod-log-directory branch August 16, 2019 02:07
haircommander added a commit to haircommander/kubernetes that referenced this pull request Feb 15, 2022
in kubernetes#74441,
the namespace and name were added to the pod log location.

However, cAdvisor stats provider wasn't correspondingly updated.

since CRI-O uses cAdvisor stats provider by default, despite being a CRI implementation,
eviction with ephemeral storage and container logs doesn't work as expected, until now!

Signed-off-by: Peter Hunt <[email protected]>
muyangren2 pushed a commit to muyangren2/kubernetes that referenced this pull request Jul 14, 2022
in kubernetes#74441,
the namespace and name were added to the pod log location.

However, cAdvisor stats provider wasn't correspondingly updated.

since CRI-O uses cAdvisor stats provider by default, despite being a CRI implementation,
eviction with ephemeral storage and container logs doesn't work as expected, until now!

Signed-off-by: Peter Hunt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants