Skip to content

Conversation

@curdbecker
Copy link

@curdbecker curdbecker commented Dec 8, 2025

runc fails to run on Linux kernels without user-namespace support, since the missing setgroups file in procfs triggers a wrapped error and wrapped errors are unfortunately handled incorrectly.

More precisely, when user-namespace support is absent, the missing setgroups file is expected to trigger a ErrNotExist error that is then explicitly ignored. However, newer versions of pathrs introduce error wrapping which entirely hides the wrapped error without proper unwrapping on the receiving end. Therefore, the wrapped ErrNotExist error is treated instead as an unknown error and therefore runc fails to work entirely.

In addition, I also fixed unwrapping for the ErrPermission error in the same file for the sake of at least file-level consistency.
However, it seems like there are quite a lot of other references in runc that then also could require adjusting. Might be a future improvement to consider?

Just short off-topic how I found this:
I wanted to enjoy containers on an embedded cable router with a Linux kernel that needs to remain compatible with quite a lot of pre-built, proprietary, closed-source modules.
While I could eventually freely rebuild and replace the entire kernel, in practice, however these modules would have always become incompatible when trying to change any fundamental kernel data structures, e.g. while adding support for user namespaces.
Luckily, I managed to get network PID namespaces working while remaining compatible to the closed-source modules.... or the project likely would have been dead in the water from the start, since a cable router isn't really a cable router anymore when the cable modem module can't be loaded 😂

Fortunately, everything worked out and now I am quite happy with it.
And the PR maybe helps people in the future that are crazy enough to use containers on a very limited kernel configuration to get started easier :)

@curdbecker curdbecker force-pushed the fix/missing-error-unwrapping-in-init-container branch from a1103e9 to 0ec6bba Compare December 8, 2025 19:23
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch LGTM (indeed os.Is* functions only unwrap os errors, while errors.Is unwrap all errors.

The comment in the source seems excessive to me though; in general we should not use os.Is* functions.

@kolyshkin
Copy link
Contributor

in general we should not use os.Is* functions.

So, I guess, there might be other places in the code which no longer properly detect a particular error.

@curdbecker
Copy link
Author

@kolyshkin Thanks for looking at the PR and approving it 😊

Yeah, I agree the comments are a bit excessive, but I wanted to outline again the relation to what happens inside ‘pathrs’, because it took me quite a while to uncover what was going on - especially with the limited debugging capabilities on the embedded platform… and it looked so benign at the first, second and third glance - especially since my Go knowledge is rather limited.

Yeah, there are definitively other places. I remember doing a quick grep a couple of weeks ago and I found I think dozens of potential sources that could require unwrapping, but, well, I didn’t look further into it yet, since so far things have been running smoothly on my router.

@curdbecker
Copy link
Author

I can also remove the comments though if you’d like. Not particularly attached to them. Might just cite them here again in the conversation, so that it shows up in the search or something?

@cyphar
Copy link
Member

cyphar commented Dec 9, 2025

I agree that the comment is not necessary, it would be better to add a lint to block usage of os.Is* functions (we already have this for os.Create). Though that would require fixing all other usages...

@curdbecker curdbecker force-pushed the fix/missing-error-unwrapping-in-init-container branch from 0ec6bba to 7a95ae0 Compare December 9, 2025 02:31
@curdbecker
Copy link
Author

curdbecker commented Dec 9, 2025

@cyphar Sure, comment is gone. And the linter rule does sound good, so I gave it a try - let's see if I the regex syntax is the one I expected :D

However, I didn't match on os.Is*, because that would also include os.IsPathSeparator - not exactly our enemy right now :)

About fixing the other usages... I did the grep again. Doesn't look too bad. However, there are also usages in vendor packages. Does that matter? I think that's what prevented me from investigating further in the first place.

@curdbecker curdbecker force-pushed the fix/missing-error-unwrapping-in-init-container branch from 7a95ae0 to 959e454 Compare December 9, 2025 02:34
@curdbecker
Copy link
Author

Screenshot 2025-12-09 at 03 44 40

Looks not too bad honestly. grep-ing made it look so much worse with the vendor modules.

But it also discovered one reference in spec.go that the linter doesn't seem to care about. Might be a helper tool?

@curdbecker curdbecker force-pushed the fix/missing-error-unwrapping-in-init-container branch from 959e454 to 9830cbb Compare December 9, 2025 02:48
@curdbecker
Copy link
Author

Ok, so, I replaced now the remaining calls to os.Is* with their errors.Is counterpart as this seems fairly simple and more a mechanical task. Please give it a look.

It still looks quite okay-ish regarding the number of changes and if I understand the Go documentation correctly, it should be safe.

And I also realized my confusion regarding spec.go earlier, since I didn't know that the linter output only shows the first 7 errors... Now grep and the linter agree 😂

@curdbecker curdbecker changed the title Handle wrapped errors correctly in libcontainer/init_linux.go Handle os.Is* wrapped errors correctly Dec 9, 2025
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@curdbecker thanks for implementing all the changes. Now you need to move the linter patch to be the last one in series, as the order should be

  • fix the issue
  • add a linter rule to prevent its reoccurrence.

Other than that, still LGTM

Signed-off-by: Curd Becker <me@curd-becker.de>
@curdbecker curdbecker force-pushed the fix/missing-error-unwrapping-in-init-container branch from 19f9fc3 to 58d24d2 Compare December 11, 2025 02:17
@curdbecker
Copy link
Author

@kolyshkin Ah, yes, sorry, I did it the other way around, since I thought we could simply look at the CI results in order to find out how much of an impact we were talking about. But yeah, makes sense to re-order this. Should be done now :)

And I also squashed the two commits with the actual changes, because it doesn't make any sense to single libcontainer/init_linux.go out anymore when all occurrences should be fixed anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants