race: completion: cannot get cgroup path unless container is running #23334
Comments
Reproducer
terminal 2
Likely another case where we must ignore the "container is stopped" error when looping over all containers.
stats reads from the cgroup, and in order to know the cgroup we check the pid for the cgroup. However, there is a window where the pid has exited and podman has not yet updated its internal state. In this case the code returns ErrCtrStopped, so we should ignore this error as well. Fixes containers#23334 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The reproducer failed quickly for me; with my PR it looks stable now (it has been running for 15 minutes).
Sorry, still happening
Oh sorry, I was looking at podman stats all this time. podman pod stats works slightly differently and doesn't make use of the "all" option, which means the error is not ignored there. I still fixed a valid issue with podman stats at least, but pod stats uses a different code path...
Like commit 55749af but for podman *pod* stats not the normal podman stats. We must ignore ErrCtrStopped here as well as this will happen when the container process exited. Fixes containers#23334 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Like commit 55749af but for podman *pod* stats not the normal podman stats. We must ignore ErrCtrStopped here as well as this will happen when the container process exited. While at it remove a useless argument from the function as it was always nil and restructure the logic flow to make it easier to read. Fixes containers#23334 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Another parallel-system-test flake, seen in f39 root: