-
Notifications
You must be signed in to change notification settings - Fork 39
feat: Mac port #183 #300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: Mac port #183 #300
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
possibly the same in BSD
basically the same according to manual
File system watchers (inotify, FSEvents) report canonical paths, but ConfigWatcher was storing paths as provided by the caller. When a path contained symlinks (e.g., /var -> /private/var on macOS), event paths wouldn't match stored paths, causing config changes to be missed. Add canonicalize_path() to resolve symlinks in the existing portion of a path (handling paths where the final components don't exist yet), and apply it when storing paths in the watcher. Include a regression test that creates a symlink and verifies events are received when watching through the symlink.
The worker_ready() test helper signals when the worker is idle, but it was signaling even when there was a pending reload_deadline. This could cause tests to proceed before the debounce timeout fired, leading to flaky test results. Only signal idle when there's no pending work (reload_deadline is None). Update the debounce test to not call worker_ready() between writes, as that now correctly waits for the reload to complete.
This version includes macOS support via ptsname (vs ptsname_r). See: shell-pool/shpool_pty#4
- Forward test_hooks feature from shpool to libshpool crate - Shorten socket paths to avoid macOS 104-byte sun_path limit - Forward HOME and PATH env vars to attach subprocess (macOS typically lacks XDG_RUNTIME_DIR, needs HOME instead) - Use std::env::var directly to avoid unused import warning - Document that tests should run with --test-threads=1 on macOS due to FSEvents timing behavior under concurrent load
macOS returns EINVAL when setting socket timeout on a connection where the peer has already closed (documented in setsockopt(2)). This typically happens with daemon presence probe connections.
| // closed (e.g., a daemon presence probe). This is documented in the | ||
| // macOS setsockopt(2) man page. Treat this the same as a broken pipe. | ||
| #[cfg(target_os = "macos")] | ||
| if let Err(e) = stream.set_read_timeout(Some(consts::SOCK_STREAM_TIMEOUT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block could be simplified by removing the #[cfg(not(target_os = "macos"))] block below and moving the #[cfg(target_os = "macos")] block to only be attached to the if statement.
In fact, since it looks like on linux EINVAL is only returned when the input is malformed, and we're using a safe wrapper that should never malform input, we could probably get away with not even including the cfg annotation. It's probably better to keep it just to be explicit though, since this would do the wrong thing on linux if we somehow did get EINVAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally had the EINVAL check as something which ran on all platforms but I was being particularly careful not to affect other platforms with potentially BSD 4.4/macOS specific stuff, mainly because I couldn't really find a definitive answer on what conditions we would see EINVAL on other platforms (even the macOS man pages leave a lot unanswered questions for me).
I'll take another swing at this section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was being particularly careful not to affect other platforms with potentially BSD 4.4/macOS specific stuff
Yeah, probably a good idea. Thanks for being careful about that kind of thing.
ethanpailes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good to me. The main merge blockers are the lack of safety comments and the fact that I still have to patch this in on my macbook and run the tests.
We should also consider adding support for running the tests under macos in CI. I'm not sure how hard that will be to set up. I think it would be fair to punt on doing that until a followup.
|
|
||
| let mut peer_uid: libc::uid_t = 0; | ||
| let mut peer_gid: libc::gid_t = 0; | ||
| unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a // Safety: comment that explains why this unsafe block is safe. In this case, we probably just need to say that "sock" is a valid fd, and peer_{uid,gid} are valid references to the stack for the duration of the call.
|
|
||
| let mut peer_pid: libc::pid_t = 0; | ||
| let mut len = std::mem::size_of::<libc::pid_t>() as libc::socklen_t; | ||
| unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing about a // Safety comment.
|
I just patched this branch in on my macbook and ran the tests, including the single-threading flag. Unfortunately I saw a bunch of errors. @c22 is this expected? I would be ok with merging even with these errors since it's an improvement over the status quo, but we'll need to add something to the README explaining that while shpool builds on macos it's not yet an officially supported platform since the test suite doesn't pass yet. |
Ah yeah sorry, I should have been more explicit about that. A bunch of integration tests failed for me as well and bounce is still probably a bit flaky on macOS even in single threaded testing. I will diff against my side later today and see if anything stands out between your results. |
Got it. Where you thinking of trying to fix them all before merge or do you want to just get something merged and then fix the remaining tests on mac in followups? Like I said, since all the linux tests are passing I'm happy to merge even with some tests broken on mac. |
|
I checked my end and the exact same tests were failing on my machine as well, so yes, that was expected.
I was thinking of leaving them failed for now as they were also failing/skipped on seruman's builds. With some quick-fixes and workarounds, I was able to get 8 more tests to pass (2 are related to hardcoded fish/zsh paths, 6 are related to aggressive timing, eg. 500ms on Linux -> 2500ms on macOS). 4 more tests are MOTD pager related, I didn't look into them too deeply. 5 of the tests are re-attachment failures, these are a bit more concerning to me, because re-attachment is probably a key part of my use-case on Mac. Fixing that locally required changing the way I'm happy for you guys to take this as-is if you like (with the unsafe comments added), but if you're receptive to another fix, I can incorporate it as part of this change as well, I just probably won't get to it until after the weekend. |
This work is based on the previous work by @seruman in their fork https://github.com/seruman/shpool/tree/macos
I tried to retain as much of the existing work done and base my work on top of that.
seruman's fork had socket timeouts disabled on macOS. I wasn't able to reproduce socket timeouts not working on macOS but I was seeing unhandled EINVAL errors on daemon presence probe connections (where the peer closes immediately). These are now handled gracefully, so I have dropped the changes that disabled socket timeouts.
I also ran into a bug where ConfigWatcher was not using canonicalized paths, but inotify et. al. use canonicalized paths. This likely wasn't being ran into much on Linux environments but becomes very obvious on macOS because /tmp is a symlink.
Timing on macOS is indeed tricky, the fix would involve probably significantly re-working the way test are being done, but the workaround I found was to run them in serial.
I think the rest of the changes were relatively minor.
Here's a summary of the changes:
and the cherry-picked commits from https://github.com/seruman/shpool/tree/macos