-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update copy.DirCopy to leave sockets in the file system #2117
Update copy.DirCopy to leave sockets in the file system #2117
Conversation
0a7988c
to
4c029de
Compare
If possible, I'd like to see this backported to 1.55 (before the shift to go 1.22) so that we could use it in Warewulf on EL 8 and 9. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a first glance this makes no sense to me as proposed: if we don’t close that socket, what is going to close it?
This code needs to work in long-running processes like CRI-O, so leaking file descriptors is not acceptable.
(And, au contraire, if Go’s garbage collector somehow caused the socket to be closed eventually, then that means that the problematic chown/chmod operations could also fail, depending on GC timing.)
It might well be the case that we need to keep the socket open longer, and close it elsewhere; I’m not saying that the current code is correct, I didn’t look into that at the detail, at this point I just took the 1 minute to notice that the PR as currently proposed looks incorrect.
I thought that this was just a difference in semantics between a TCP socket and a Unix domain socket; but turns out Go just mixes close with unlink. What I really want is SetUnlinkOnClose(false). I'll fix it. Thanks for the review. |
4c029de
to
56a00aa
Compare
370a3f9
to
78c2a81
Compare
/approve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with this code but why are we copying socket files in the first place? I don't see a how any user could benefit from a not connected socket file in the target.
So would it not be simpler to skip socket files all together?
I can’t speak to this: before I started investigating I expected the solution would be to skip sockets as well. But when I found code that was explicitly trying to copy them, but just failing to do so properly, I decided I would start by making the existing code work, rather than remove it, since I didn’t know the original intent. in theory, I suppose the existence of the socket could be a sentinel to some other application. I also think the existing socket can be re-opened by later applications at the existing path. |
Maybe but this seems rather unlikely
I do not see how this could work, binding an existing socket file will fail with EADDRINUSE even when there is no other program actively listening there. As such most programs do an explicit unlink() before calling bind(). Given the current version deletes the path again (and then fails to chown) it should be safe to assume that no user can depend on the behavior and I personally see skipping sockets to be much simpler than doing this here. |
I think this was done without thinking too much about its implications, so let's ignore them. Should we do the same with FIFOs? |
In Docker, this exists all the way back in moby/moby#7619 . Note the existence of Line 45 in 4bf3f07
I don’t know that the files would be useful, but if a pull+push changed the filesystem contents, that might be unexpected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Let’s decide whether we need these files at all, first.)
I did take a quick look what gnu relevant source code: https://github.com/coreutils/coreutils/blob/e71b24cedc2ff640f94911298e53b24506f13062/src/copy.c#L3090-L3099 Thus I would say we should prefer a mknod() call here over the current fix. Though this still doesn't answer if we have to keep socket files during copy.
I guess if this is a practical concern then keeping them makes sense. |
AFAICS a tarball cannot handle unix sockets (https://pkg.go.dev/archive/tar#pkg-constants), so that shouldn't be a concern.
Using |
Thanks, TIL. |
@anderbubble what is going on with this PR? |
@rhatdan I'll update it based on the conversation here; I just got pulled into some other things for a bit. |
dc8c8bf
to
fbf5ab0
Compare
- Closes containers#2113 `copy.DirCopy` contains code to create named sockets matching those on the source; however, this operation closes the socket, removing it implicitly from the destination and causing later chown and chmod operations to fail. This commit creates matching file system nodes directly without opening a socket or invoking the auto-removal semantics of a listener. Signed-off-by: Jonathon Anderson <[email protected]>
fbf5ab0
to
6e4298e
Compare
@rhatdan I updated the pr to use mknod rather than opening a dangling socket. I think it's good to go now; but let me know if you have any questions or concerns. Sorry for the delay! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: anderbubble, giuseppe, rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
Will this be automatically backported to 1.55, or do I need to file a separate PR for that? |
it needs to be backported |
Update copy.DirCopy to leave sockets in the file system (cherry picked from commit 960e8dd)
Backport PR at #2159. Let me know if that needs any adjustment. |
Backport pull request #2117 to release-1.55
copy.DirCopy
contains code to create named sockets matching those on the source; however, this operation closes the socket, removing it implicitly from the destination and causing later chown and chmod operations to fail.This commit creates matching file system nodes directly without opening a socket or invoking the auto-removal semantics of a listener.