-
Notifications
You must be signed in to change notification settings - Fork 31
enable ai-services to run as non-root-user #267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
3a07de7 to
eea51b8
Compare
ff36007 to
13f1f24
Compare
| return fmt.Errorf("current user is not root (EUID: %d)", euid) | ||
| if euid != 0 && os.Getenv("XDG_RUNTIME_DIR") == "" { | ||
| uid := os.Getuid() | ||
| logger.Infoln("running command as %s", uid, logger.VerbosityLevelDebug) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set the userID in XDG_RUNTIME_DIR?. Also a query on why do we need to set this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I checked online, they say not to explicitly set this value tho (I dont know much on this :))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Usually it is automatically set, but I noticed in one LPAR where it was not set and that is why we configure it if not already done
XDG_RUNTIME_DIRpoints to the user specific runtime files.systemctl --userneeds to connect to the user's systemd instance and without this variable, systemctl doesn't know where to find the socket and hence we get "No medium found" error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does setting userID in XDG_RUNTIMR_DIR has any implications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, looks like its not needed if we are enabling linger, that is, loginctl enable-linger user, which is required and will be mentioned in the documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove this from the code and push the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait, my bad, while creating the application we get failed to pull image registry.redhat.io/rhaiis/vllm-spyre-rhel9:3.2.5: reading JSON file "/run/containers/1001/auth.json": open /run/containers/1001/auth.json: permission denied because it is not able to lookup the directory.
So yes, this is required.
Apologies for creating the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
| func userPodmanURI(uid int) string { | ||
| return fmt.Sprintf( | ||
| "unix:///run/user/%d/podman/podman.sock", | ||
| uid, | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a check. Did all the pods deployment worked fine with non root?
Did you test by deploying the RAG application?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this on a 3 spyre card setup, where all pods were in healthy state except vllm as it required 4 cards for instruct and 1 for reranker. I configured instruct to use 2 cards and hence it was unhealthy
| cmd := exec.Command("sudo", "ppc64_cpu", "--smt") | ||
| out, err := cmd.CombinedOutput() | ||
| if err != nil { | ||
| return fmt.Errorf("failed to check current SMT level: %v, output: %s", err, string(out)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are using sudo, will this prompt for the password to be entered by user here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if passwordless user is enabled, then no
else for all sudo operations, it will ask for password
but I think here it might fail, as exec.Command doesn't attach to terminal stdin by default afaik. Let me confirm and post results here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I think here it might fail, as exec.Command doesn't attach to terminal stdin by default afaik.
confirmed this, I have disabled the spinner for smt level, since we might need user input for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to enable sudo user without password so that there wont be any user intervention needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we should do that
Its upto admin on how the user is setup, passwordless or with password
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now create command also requires SUDO_USER right because of setting smt level?.
Wouldnt this defeat the purpose of what we wanted to achieve if I'm correct.
We wanted bootstrap to be run by either root or sudo user and the create by a normal user right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User will anyway have permissions to run as sudo, and since moving it to configure.go cannot be possible as we want to keep this smt value configurable as per application, thats why thought of adding sudo.
CC: @mkumatag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if there is any other alternative
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure. Maybe we can have a default constant value set I guess during bootstrap.
For now, I guess we need to document that only root user and non root users with sudo privileges can run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, SMT will set per application template need
5d7ed99 to
9be37ea
Compare
Signed-off-by: sagarwal-ibm <sagarwal@ibm.com>
9b6d840 to
61702bb
Compare
mkumatag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial comments, overall flow looks really complicated, might need some more time to go through this code and the description.
| cmd := exec.Command("sudo", "ppc64_cpu", "--smt") | ||
| out, err := cmd.CombinedOutput() | ||
| if err != nil { | ||
| return fmt.Errorf("failed to check current SMT level: %v, output: %s", err, string(out)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, SMT will set per application template need
61702bb to
00741bf
Compare
|
please squash and merge the changes post @mayuka-c's approval. |
|
@Shubhamag12 is currently testing it. Once done we can merge |
Context: https://ibm.ent.box.com/file/2102027246213
Description
Problem Statement
Currently, ai-services only supports execution as the root user.
Solution
This PR enables ai-services to run as a non-root user. The tool now supports a hybrid approach:
sudofor one-time system setupUser Setup Requirements
To use ai-services as a non-root user, the following one-time setup is required:
1. Add user to sudo group (wheel)
This grants the user permission to execute specific commands with sudo when needed.
2. Enable systemd user session persistence
This ensures the user's systemd instance persists even when not logged in, which is required for:
Implementation Details
Rootless Podman
/run/user/<UID>/podman/podman.sock(user-specific, not system-wide)Directory Permissions
$SUDO_USEREnvironment Variables
XDG_RUNTIME_DIR: Automatically set if missing (required for user systemd operations)SUDO_USER: Used during bootstrap to determine the actual user for ownership settingsQ&A
Q: Why do we need to add the non-root user to the sudo group?
A: We still need certain commands to be executed with sudo, like
ai-services bootstrap, as it configures and modifies the env which requires root permissions.Q: Which commands would be run without sudo?
A: Except
bootstrap, every other command would be run without sudo.