-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Fix runc exec vs go1.26 + older kernel #5091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Note that copying of |
|
@kolyshkin so let me see if I'm following correctly. The proposal to add
Am I getting it right? In that case, it seems fine, but it might break again in 1.27 (as they implied this will be removed if |
@lifubang this case is already (implicitly) covered by the existing tests. For example, The failing test part is here: runc/tests/integration/cgroups.bats Lines 109 to 112 in 08072e9
The failure itself (before the commit with the fix) is here: https://github.com/opencontainers/runc/actions/runs/21379582395/job/61543423814 This is the reason why I'm adding Go 1.26 to the CI matrix. I am working on simplifying the test from #5066 (in its current form it's too complicated) so maybe I will add something here, but in general it's not needed. |
You are right, it may stop working in case they will use something in exec.Command that could/should not be copied. Let's hope they are to add the Clone method in this case, and we'll just switch to it (depending on Go version): //go:build go1.27
func cloneCmd(cmd *exec.Cmd) *exec.Cmd {
return cmd.Clone()
}
//go:build !go1.27
func cloneCmd(cmd *exec.Cmd) *exec.Cmd {
cmdCopy = *cmd
return &cmdCopy
}and then, eventually, remove the cloneCmd wrapper and use cmd.Clone directly. |
thaJeztah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change LGTM
are new tests needed? (I see the other PR added some tests, but I haven't looked if they're good)
|
Oh, lol, missed the discussion (thanks browser refresh 😆) |
This is mostly to test whether https://go.dev/cl/728642 results in any test failures in the current CI matrix. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Yes, you are right.
I think it's not necessary to include another unit test, from my perspective, you can remove the last commit. |
@kolyshkin I'd still prefer to have the pre go 1.27 The only potential downside is if more fields are added in 1.26 and we start using them and we forget to update this copy. It seems unlikely, I don't think 1.26 will add fields. This is short lived (ideally, until 1.27 comes out), then we can remove it. So I don't think this risk (that is the only I see) is a concern. What do you think? |
@rata done, PTAL |
|
A few things to note here:
|
Theoretically, exec.Command can set cmd.Err. Practically, this should never happen (Linux, Go <= 1.26, exePath is absolute), but in the unlikely case it does, let's fail early. This is related to the cloneCmd (to be introduced by the following commit) which chooses to not copy the Err field. Theoretically, exec.Command can set Err and so the first call to cmd.Start will fail (since Err != nil), and the second call to cmd.Start may succeed because Err == nil. Yet, this scenario is highly unlikely, but better be safe than sorry. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since [PR 4812], runc exec tries to use clone3 syscall with CLONE_INTO_CGROUP, falling back to the old method if it is not supported. One issue with that approach is, a > Cmd cannot be reused after calling its [Cmd.Start], [Cmd.Run], > [Cmd.Output], or [Cmd.CombinedOutput] methods. (from https://pkg.go.dev/os/exec#Cmd). This is enforced since Go 1.26, see [CL 728642], and so runc exec actually fails in specific scenarios (go1.26 and no CLONE_INTO_CGROUP support). The easiest workaround is to pre-copy the p.cmd structure (copy = *cmd). From the [CL 734200] it looks like it is an acceptable way, but it might break in the future as it also copies the private fields, so let's do a proper field-by-field copy. If the upstream will add cmd.Clone method, we will switch to it. Also, we can probably be fine with a post-copy (once the first Start has failed), but let's be conservative here and do a pre-copy. [PR 4812]: opencontainers#4812 [CL 728642]: https://go.dev/cl/728642 [CL 734200]: https://go.dev/cl/734200 Reported-by: Efim Verzakov <efimverzakov@gmail.com> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
The first commit should be a draft until Go 1.26 is released, but we can patch this up later. Or, I can remove the first commit for now, for a cleaner git history. |
This is an alternative to #5066, fixing runc exec vs go1.26 issue with 1.4.0.
Go 1.26.0 is to be released soon (in February).
Fixes: #5060
Closes: #5066
Since PR 4812, runc exec tries to use
clone3syscall withCLONE_INTO_CGROUP, falling back to the old method if it is notsupported.
One issue with that approach is, a
(from https://pkg.go.dev/os/exec#Cmd).
This is enforced since Go 1.26, see CL 728642, and so runc exec
actually fails in specific scenarios (go1.26 and no CLONE_INTO_CGROUP
support).
The easiest workaround is to pre-copy the p.cmd structure. From the
CL 734200 it looks like it is an acceptable way. If the upstream
will introduce cmd.Clone we'll switch to it.
As for the test case, it looks like adding Go 1.26 to the testing
matrix is sufficient to reveal the issue (some tests involving
runc execfail with Go 1.26 before the fix).