
Conversation

Collaborator

@kanghui0204 kanghui0204 commented Dec 30, 2025

Hi Team,

I added a remote server/client mechanism for IsaacLab-Arena that allows the IsaacLab environment and the policy model environment to run separately. Alex and I wrote a design document (design doc) for this feature, which includes our discussions about the design. Below is an overview of this PR.

1. Functionality

Previously, the IsaacLab-Arena pipeline ran entirely in a single process:

IsaacLab env → Env policy class (e.g., Gr00tClosedloopPolicy) → local policy model (e.g., Gr00tPolicy), all in one process.

This PR adds remote policy/server–client support and decouples the env policy class from the local policy model. The new pipeline becomes:

Client: IsaacLab env → Env policy class (e.g., ActionChunkingClientPolicy)

Server: VLA/VLN policy model (e.g., Gr00tRemoteServerSidePolicy)

The client and server exchange observations and actions via sockets.
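As an illustration only (the actual wire format lives in isaaclab_arena/remote_policy and may differ), socket exchanges like this are commonly built on length-prefixed JSON messages, so the receiver knows how many bytes to read per message:

```python
import json
import struct

def pack_message(payload: dict) -> bytes:
    """Serialize a payload dict and prefix it with a 4-byte big-endian length."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack(">I", len(body)) + body

def unpack_message(data: bytes) -> dict:
    """Inverse of pack_message: read the length prefix, then decode the body."""
    (length,) = struct.unpack(">I", data[:4])
    body = data[4 : 4 + length]
    return json.loads(body.decode("utf-8"))
```

A client would pack an observation payload, send it over the socket, and unpack the action payload the server sends back.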

The server/client implementation lives in isaaclab_arena/remote_policy inside IsaacLab-Arena. Users can copy this directory and import it on the server side directly, without needing to install it as a separate package.

2. How to use it

On the client side, you still use isaaclab_arena/examples/policy_runner.py, but you now pass a few extra arguments. Example:

/isaac-sim/python.sh isaaclab_arena/examples/policy_runner.py \
  --policy_type isaaclab_arena.policy.action_chunking_client.ActionChunkingClientSidePolicy \
  --remote_host 127.0.0.1 \
  --remote_port 5555 \
  --remote_api_token API_TOKEN_123 \
  --remote_timeout_ms 5000 \
  --num_steps 2000 \
  --num_envs 10 \
  --enable_cameras \
  --headless \
  --remote_kill_on_exit \
  gr1_open_microwave \
  --embodiment gr1_joint

On the server side, there is a Python entry point, isaaclab_arena/remote_policy/remote_policy_server_runner.py. You specify the host/port/etc., select the policy type, and provide the policy config file:

python isaaclab_arena/remote_policy/remote_policy_server_runner.py \
  --host 0.0.0.0 \
  --port 5555 \
  --api_token API_TOKEN_123 \
  --timeout_ms 5000 \
  --policy_type isaaclab_arena_gr00t.policy.gr00t_remote_policy.Gr00tRemoteServerSidePolicy \
  --policy_config_yaml_path /workspace/isaaclab_arena_gr00t/policy/config/gr1_manip_gr00t_closedloop_config.yaml

3. Current example

Right now there is a working example for GR00T. On a given machine, the steps are:

Start the server

bash docker/run_gr00t_server.sh \
  -d /home/scratch.hkang_sw/Isaac/dataset/models \
  --host 0.0.0.0 \
  --port 5555 \
  --api_token API_TOKEN_123 \
  --timeout_ms 5000 \
  --policy_type isaaclab_arena_gr00t.policy.gr00t_remote_policy.Gr00tRemoteServerSidePolicy \
  --policy_config_yaml_path /workspace/isaaclab_arena_gr00t/policy/config/gr1_manip_gr00t_closedloop_config.yaml

This uses the following defaults inside the script:

  • host: 0.0.0.0
  • port: 5555
  • api_token: API_TOKEN_123
  • timeout_ms: 5000
  • policy_type: isaaclab_arena_gr00t.policy.gr00t_remote_policy.Gr00tRemoteServerSidePolicy
  • policy_config_yaml_path: /workspace/isaaclab_arena_gr00t/gr1_manip_gr00t_closedloop_config.yaml

Start the client
Set up the IsaacLab Docker container following
IsaacLab Arena docker documentation(no GR00T installation is needed inside this container).

Inside the container, run:

/isaac-sim/python.sh isaaclab_arena/examples/policy_runner.py \
  --policy_type isaaclab_arena.policy.action_chunking_client.ActionChunkingClientSidePolicy \
  --remote_host 127.0.0.1 \
  --remote_port 5555 \
  --remote_api_token API_TOKEN_123 \
  --remote_timeout_ms 5000 \
  --num_steps 2000 \
  --num_envs 10 \
  --enable_cameras \
  --headless \
  --remote_kill_on_exit \
  gr1_open_microwave \
  --embodiment gr1_joint

With this setup, you can obtain results for the GR1 open microwave task using a remote GR00T policy server. On my machine, I get the following result:

{'success_rate': 0.58, 'revolute_joint_moved_rate': 0.95, 'num_episodes': 200}

4. What remains to be done? Future work

At the moment, documentation for this feature has not been updated. I’d like the team to review this PR first and confirm that the usage and interface look reasonable.

Planned future work includes:

  1. Support for the new VLN task.
  2. Improved communication efficiency. Currently, the server and client communicate over sockets. We can explore more efficient transports and/or observation compression to reduce communication latency.

@alexmillane alexmillane changed the base branch from release/0.1.1 to main January 4, 2026 19:18
@alexmillane alexmillane changed the base branch from main to release/0.1.1 January 4, 2026 19:20
Collaborator

@alexmillane alexmillane left a comment

Thank you for putting this together.

Most of my comments are minor syntactic things that we can easily address.

I have one major comment on the design. At the moment, the server-client split point leaves half of the gr00t policy running on the client and half (the model itself) running on the server. To me it would make sense to move the split point such that the whole policy is launched on the server. That way, in the remote inference case, we remove all gr00t details from the client side. It also simplifies the server side: the server just launches the same policy (called Gr00tClosedloopPolicy) that is launched in the local case.

This diagram explains a simplified version of the current design and the alternative that I'm proposing.

[Image: diagram of the current design and the proposed alternative]

Comment on lines 1 to 44
#!/bin/bash
set -euo pipefail

# Script to install GR00T policy dependencies
# This script is called from the GR00T server Dockerfile

: "${GROOT_DEPS_GROUP:=base}"
: "${WORKDIR:=/workspace}"

echo "Installing GR00T with dependency group: $GROOT_DEPS_GROUP"

# CUDA environment variables for GR00T installation.
# In the PyTorch base image, CUDA is already configured, so we only
# set variables if CUDA_HOME exists.
if [ -d "/usr/local/cuda" ]; then
    export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
    export PATH=${CUDA_HOME}/bin:${PATH}
    export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
fi

echo "CUDA environment variables:"
echo "CUDA_HOME=${CUDA_HOME:-unset}"
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-unset}"

# Install system-level media libraries (no sudo in container)
echo "Installing system-level media libraries..."
apt-get update && apt-get install -y ffmpeg && rm -rf /var/lib/apt/lists/*

# Upgrade packaging tools
echo "Upgrading packaging tools..."
python -m pip install --upgrade setuptools packaging wheel

# Install Isaac-GR00T with the specified dependency group
echo "Installing Isaac-GR00T with dependency group: $GROOT_DEPS_GROUP"
python -m pip install --no-build-isolation --use-pep517 \
    -e "${WORKDIR}/submodules/Isaac-GR00T/[$GROOT_DEPS_GROUP]"

# Install flash-attn (optional, keep same version as Arena Dockerfile)
echo "Installing flash-attn..."
python -m pip install --no-build-isolation --use-pep517 flash-attn==2.7.1.post4 || \
    echo "flash-attn install failed, continue without it"

echo "GR00T dependencies installation completed successfully"
Collaborator

Thanks for putting this together.

This script seems to differ only very slightly from the install_gr00t_deps.sh. Is the only difference the python command used? I.e. python vs /isaac-sim/python.sh.

Could we combine these two scripts into a single script that takes an argument(s)?

Collaborator Author

I merged this into install_gr00t_deps.sh and added an argument to install_gr00t_deps.sh.

Comment on lines 8 to 16
# REQUIRED: host models directory (must be set by user)
if [[ -z "${MODELS_DIR:-}" ]]; then
    echo "ERROR: MODELS_DIR is not set."
    echo "Please export MODELS_DIR to your host models directory, e.g.:"
    echo "    export MODELS_DIR=/path/to/your/models"
    echo "Then run:"
    echo "    bash ./docker/run_gr00t_server.sh"
    exit 1
fi
Collaborator

Suggestion to use the same approach used in run_docker.sh:

Models are by default expected on the host at $HOME/models but this can be changed with the -d flag to the script.

See default and the optional override

Collaborator Author

Please check again: I removed this part and added code similar to run_docker.sh.

--timeout_ms "${TIMEOUT_MS}" \
--policy_type "${POLICY_TYPE}" \
--policy_config_yaml_path "${POLICY_CONFIG_YAML_PATH}"

Collaborator

Not strictly related to this MR, but I'm wondering if we should move all the gr00t related stuff (i.e., all the gr00t docker stuff and the gr00t related tests) out of the core framework and into the isaaclab_arena_gr00t package. Actually, this is almost certainly a good idea. Ideally the core framework would make no mention of gr00t.

It would probably make sense to do this before merging this MR. So we can get everything gr00t related in the right package, before expanding what we can do there.

Collaborator

I have now made the change I suggested above. All gr00t-related code lives in isaaclab_arena_gr00t

num_steps = policy.get_trajectory_length(policy.get_trajectory_index())

elif args.policy_type == "gr00t_closedloop":
from pathlib import Path
Collaborator

Suggestion to move to the top of the file.

We keep some imports out of the top of the file when they require special dependencies that are only conditionally needed. But prefer to put imports at the top.

Collaborator Author

done

Comment on lines 107 to 48
remote_group.add_argument(
    "--remote_api_token",
    type=str,
    default=None,
    help="Optional API token for remote policy server.",
)
Collaborator

Why is an API token (optionally) required? Suggestion to expand the help string here.

Collaborator Author

I added a detailed description, please check.

Comment on lines 57 to 58
# if options is not None:
# payload["options"] = options
Collaborator

Intentional? options is currently unused.

Collaborator Author

I removed options; right now it is not used. When I was developing this, I considered that if some special information needed to be transmitted, an extra options field might be necessary. Since it is not needed at the moment, I have removed it.

Comment on lines 65 to 67
if isinstance(resp, dict) and "action" in resp:
    return resp["action"]
return resp
Collaborator

Why have this optional unwrapping? Can we assume one form of the response? Either the action dict, or a dict containing the action dict? Is there a reason to accept both?

Collaborator Author

also removed

Comment on lines 182 to 189
if not Path(self.model_path).exists():
    warnings.warn(
        "[GR00TConfig] model_path does not exist: "
        f"{self.model_path}. No model checkpoint was found. "
        "If this is the client side of a remote policy, this warning can be ignored. "
        "However, if you are running a local policy or the server side of a remote policy, "
        "the program will fail to load the model and cannot run correctly."
    )
Collaborator

Suggestion to remove this warning altogether. From the message the warning will certainly fire in cases where nothing is wrong. That's pretty confusing for the user, I feel, so I'd suggest that we remove this check here. Perhaps we could move the check to a more appropriate place? For example in the local policy where we know we need a valid model path.

Collaborator Author

OK, I removed it for now; we can add it back once we have fully separated the policy and the client.

Comment on lines 15 to 18
POLICY_REGISTRY: dict[str, str] = {
    # policy_type: "module_path:ClassName"
    "gr00t_closedloop": "isaaclab_arena_gr00t.gr00t_remote_policy:Gr00tRemoteModelPolicy",
}
Collaborator

I'm thinking we should generalize and use our registry class used for Assets, Devices, and Retargetters for Policies too.

But I agree with the idea here: let's have a policy registry to allow users to register policies, to free the core code from directly depending on policy code. I think that this will be critical to getting policy (currently gr00t) code out of the core framework.
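For readers unfamiliar with the registry idea being discussed, a minimal sketch might look like the following. The class and method names here are hypothetical; the actual implementation reuses the machinery used for registering Assets, Devices, and Retargetters.

```python
import importlib

class PolicyRegistry:
    """Minimal string-keyed registry mapping policy names to import targets."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    def register(self, name: str, target: str) -> None:
        # target uses the "module_path:ClassName" form seen in POLICY_REGISTRY above.
        self._entries[name] = target

    def resolve(self, name: str) -> type:
        # Import the module lazily, so the core code never imports policy code directly.
        module_path, class_name = self._entries[name].split(":")
        return getattr(importlib.import_module(module_path), class_name)
```

The key property is that the core framework only ever sees strings; the policy module is imported lazily at resolve time.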

Collaborator Author

I added a registry class in the remote_policy folder, please check.

Collaborator

Thank you for doing that. Unfortunately we've double-done the work 🥲 . I added a new registry as part of moving the gr00t code out of the core package in #316 . I would suggest that you use my registry (which utilizes the same machinery we use for registering assets etc.)

Collaborator

Suggestion to rename to remote_policy_config.py

Collaborator Author

done

@kanghui0204 kanghui0204 changed the base branch from release/0.1.1 to main January 5, 2026 06:28
@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 4 times, most recently from e59b3d6 to 111d093 Compare January 12, 2026 04:32
@kanghui0204
Collaborator Author

Thank you for putting this together.

Most of my comments are minor syntactic things that we can easily address.

I have one major comment on the design. At the moment, the server-client split point leaves half of the gr00t policy running on the client and half (the model itself) running on the server. To me it would make sense to move the split point such that the whole policy is launched on the server. That way, in the remote inference case, we remove all gr00t details from the client side. It also simplifies the server side: the server just launches the same policy (called Gr00tClosedloopPolicy) that is launched in the local case.

This diagram explains a simplified version of the current design and the alternative that I'm proposing.

[Image: diagram of the current design and the proposed alternative]

Hi @alexmillane , I’ve reintroduced the ActionChunkingClientPolicy class to separate the client and server. The client now uses ActionChunkingClientPolicy to record action chunks and no longer needs to import anything from the model. During initialization, however, the server needs to send the client information such as the action dimension, action chunk length, and observation keys. This allows ActionChunkingClientPolicy to initialize the correct tensors and select the appropriate observations to send, without converting all observations to NumPy and sending everything to the server. In addition, in the interaction between client and server, the task_description must be set on the client side; once it is set, the client sends the task_description to the server so the server can store it. Could you help review the current implementation? If it looks good, I can clean up and organize the code.

@kanghui0204
Collaborator Author

[Image] For now, the workflow is as shown in this picture.

self.action_dim = len(self.robot_action_joints_config)
if self.task_mode == TaskMode.G1_LOCOMANIPULATION:
    # GR00T outputs are used for WBC inputs dim. So adding WBC commands to the action dim.
    # WBC commands: navigate_command, base_height_command, torso_orientation_rpy_command
Collaborator

Can you keep those comments?

Collaborator Author

done

from isaaclab_arena_gr00t.utils.robot_joints import JointsAbsPosition


class Gr00tRemoteServerSidePolicy(ServerSidePolicy):
Collaborator

A lot of GR00T core code from gr00t_closedloop_policy.py is duplicated in this class. Can you modify this class so that if there are any changes to the GR00T core logic (input prep, model forward to get predictions, and postprocessing), only one file needs to be modified?

Comment on lines 192 to 203
if env_ids is None:
    env_ids = slice(None)
self.current_action_chunk[env_ids] = 0.0
self.current_action_index[env_ids] = -1
self.env_requires_new_action_chunk[env_ids] = True

if env_ids is None:
    env_ids = torch.arange(self.num_envs, device=self.device, dtype=torch.long)

self.current_action_chunk[env_ids] = 0.0
self.current_action_index[env_ids] = -1
self.env_requires_new_action_chunk[env_ids] = True
Collaborator

Can you remove duplication?
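One way to remove the duplication is to normalize env_ids once and then write each buffer exactly once. A sketch of that idea, using NumPy stand-ins for the torch tensors (ChunkState and its field shapes are illustrative, not the PR's actual class):

```python
import numpy as np

class ChunkState:
    """Stand-in for the policy's per-env chunk buffers (NumPy instead of torch)."""

    def __init__(self, num_envs: int, chunk_len: int, action_dim: int) -> None:
        self.num_envs = num_envs
        self.current_action_chunk = np.zeros((num_envs, chunk_len, action_dim))
        self.current_action_index = np.zeros(num_envs, dtype=np.int64)
        self.env_requires_new_action_chunk = np.zeros(num_envs, dtype=bool)

    def reset(self, env_ids=None) -> None:
        # Normalize env_ids once, then perform each buffer write a single time.
        if env_ids is None:
            env_ids = np.arange(self.num_envs)
        self.current_action_chunk[env_ids] = 0.0
        self.current_action_index[env_ids] = -1
        self.env_requires_new_action_chunk[env_ids] = True
```

The same pattern works with torch.arange on the correct device in the real code.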

Collaborator Author

done

from isaaclab_arena.remote_policy.policy_server import PolicyServer


def get_policy_cls(policy_type: str) -> type["ServerSidePolicy"]:
Collaborator

Is this function copied? Is there any way to keep just one version of the function in the repo?

packed[key_path] = value
return packed

class ActionChunkingClientPolicy(PolicyBase):
Collaborator

I wonder if it makes sense to have ActionChunkingClientPolicy inherit from ClientPolicy, with ClientPolicy inheriting from PolicyBase.
ActionChunking is a specific post-processing method used by major VLAs; there are others, like averaging/smoothing methods, too. The simplest post-processing could just be a pass-through, i.e. doing nothing for single-obs-single-action.
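A minimal sketch of the hierarchy being suggested; all class and method names here are hypothetical, not the PR's actual API:

```python
from abc import ABC, abstractmethod

class PolicyBase(ABC):
    @abstractmethod
    def get_action(self, obs):
        ...

class ClientPolicy(PolicyBase):
    """Handles the networking; post-processing defaults to pass-through."""

    def __init__(self, query_server) -> None:
        # query_server: callable mapping an observation to the raw server output.
        self._query_server = query_server

    def get_action(self, obs):
        # Pass-through: one observation in, one action out.
        return self._query_server(obs)

class ActionChunkingClientPolicy(ClientPolicy):
    """Queries the server for a chunk and consumes it one action at a time."""

    def __init__(self, query_server) -> None:
        super().__init__(query_server)
        self._chunk: list = []

    def get_action(self, obs):
        # Only hit the server when the local chunk is exhausted.
        if not self._chunk:
            self._chunk = list(self._query_server(obs))
        return self._chunk.pop(0)
```

Other consumption strategies (averaging, decay) would be further ClientPolicy subclasses.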

Collaborator Author

Hi @xyao-nv , Let me check if I understand you correctly: you are suggesting a hierarchy like PolicyBase -> ClientPolicy -> post-processing policies (ActionChunking, avg smoothing, and others?). In this design, ClientPolicy would handle the networking part, and each post-processing policy would focus on how to handle the actions, right?

Could you share some code examples for avg smoothing and other post-processing methods? I’d like to take a look, because in the current workflow the server reads the config and sends it to the client, and my current get_init_info only transmits the action chunk length, which might not be sufficient for other post-processing strategies.

Collaborator

xyao-nv commented Jan 13, 2026

Thanks for updating your commits! One major concern is code maintenance and future development. Especially for the GR00T core logic (preprocessing obs, getting predictions, and postprocessing predictions), I observed a significant amount of code duplication, partially because of the server-client setup.

This is very risky and not ideal at all. When we have two versions of the same logic with slight modifications, we double the surface area for bugs and make future updates twice as risky.

To address this, we should aim for a Single Source of Truth (SSOT) by abstracting the common logic and injecting the "slight modifications" as parameters or strategies.

Also, I'd prefer to have a sample test setup as well. As our repo is OSS, every single commit is public, and the community will suffer from poorly-tested features. It also prevents your added feature from being broken by us, mostly me, when upgrading to GN1.6.
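The SSOT idea of injecting the "slight modifications" as strategies can be sketched as follows; every function name here is illustrative, not the PR's actual code:

```python
from typing import Callable

def run_inference_core(
    obs: dict,
    preprocess: Callable[[dict], dict],
    model_forward: Callable[[dict], list],
    postprocess: Callable[[list], list],
) -> list:
    """Single source of truth for the obs -> action pipeline.

    A local policy and a server-side policy would both call this one
    function; only the injected callables differ between the two cases
    (e.g., the server decodes observations from the wire first).
    """
    return postprocess(model_forward(preprocess(obs)))

# Hypothetical stand-ins for the shared pieces:
def scale_obs(obs: dict) -> dict:
    return {k: [v * 0.5 for v in vals] for k, vals in obs.items()}

def dummy_model(obs: dict) -> list:
    return [sum(vals) for vals in obs.values()]

def clip_actions(actions: list) -> list:
    return [max(-1.0, min(1.0, a)) for a in actions]
```

Changing the core logic then only touches run_inference_core (or the injected pieces), never two diverging copies.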

Collaborator Author

kanghui0204 commented Jan 13, 2026

Thanks for updating your commits! One major concern is code maintenance and future development. Especially for the GR00T core logic (preprocessing obs, getting predictions, and postprocessing predictions), I observed a significant amount of code duplication, partially because of the server-client setup.

This is very risky and not ideal at all. When we have two versions of the same logic with slight modifications, we double the surface area for bugs and make future updates twice as risky.

To address this, we should aim for a Single Source of Truth (SSOT) by abstracting the common logic and injecting the "slight modifications" as parameters or strategies.

Also, I'd prefer to have a sample test setup as well. As our repo is OSS, every single commit is public, and the community will suffer from poorly-tested features. It also prevents your added feature from being broken by us, mostly me, when upgrading to GN1.6.

Hi @xyao-nv , thank you very much for your review.

In my understanding, if this PR is merged, then gr00t_closedloop_policy would no longer be needed, since IsaacLab-Arena and the policy would be completely decoupled. This would help keep the IsaacLab side of the code clean. What do you think about that?

Regarding tests, this PR does need to add them. However, the current version is mainly for you to check whether this server/client workflow (the server loading the policy config, the init config being sent from server to client, the task text being sent from client to server, and how observations are transmitted) aligns with IsaacLab-Arena’s logic and future extensibility. If this workflow looks reasonable to you, I will then do a full cleanup of the code, update the documentation, and add unit tests.

Collaborator

xyao-nv commented Jan 14, 2026

Regarding client-side post-processing techniques: not all policies return action chunks. Token-based predictions may output just one action at a time.
E.g. OpenVLA, Molmo, and older models like RT-2.
Even for diffusion policies that return a chunk, it's up to users to decide how they want to consume it: either consuming it one by one, using a weighted average, or applying exponential decay or a Gaussian filter over a rolling window, etc. There are no standards for how to consume it. I think it is a question of factors like how fast/smooth the VLA predicts, how fast the robot reacts, and how fast the environment changes. When deploying VLA models, it essentially becomes a design choice.
When writing the policy client class, I'd prefer that the core framework relax the assumption that there is a standard way of consuming action chunks.
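The consumption choices mentioned above can be written as interchangeable functions the client could plug in. A sketch with illustrative names (none of these are in the PR):

```python
import numpy as np

def consume_one_by_one(chunk: np.ndarray, step: int) -> np.ndarray:
    """Execute the chunk's actions in order, one per env step."""
    return chunk[step]

def consume_weighted_avg(chunk: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Blend the whole chunk into a single action using the given weights."""
    weights = weights / weights.sum()
    return (chunk * weights[:, None]).sum(axis=0)

def consume_exp_decay(chunk: np.ndarray, decay: float = 0.5) -> np.ndarray:
    """Exponentially down-weight later actions in the chunk."""
    weights = decay ** np.arange(len(chunk))
    return consume_weighted_avg(chunk, weights)
```

A pass-through policy would simply skip this layer entirely, which is why the core framework should not bake one strategy in.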

Collaborator

xyao-nv commented Jan 14, 2026

As you are thinking of retiring gr00t_closedloop_policy and switching to the remote setup in one MR, it's better to keep in mind that there is a series of changes coming:

  • by early next week, I hope I can finish upgrading everything to GN1.6
  • by early Feb, we need everything ready to show for the GTC DLI Lab, and we are showing a couple of E2E GR00T task evals.
  • by the end of this week, I am derisking the arena + gr00t deps setup on the DLI Lab dev env. I won't have any bandwidth to enable this awesome client-server arch for that DLI Lab in late Jan, given it probably won't be ready in a week and we have the list of todos.

So I would prefer a 2-phase approach, if you can: adding client-server with policy core logic consolidation & tests & docs in one MR, and retiring our previous implementation in another MR. Does that make sense? Or am I asking too much?

Collaborator

xyao-nv commented Jan 14, 2026

Re this server-client workflow (the server loading the policy config, the init config being sent from server to client, the task text being sent from client to server, and how observations are transmitted): it makes sense to me, especially setting the task description and init config.
I would like to request refactoring for a better design on the ActionChunkingPolicyClass.

One more question: it looks to me like your implementation in remote_policy should be pretty generic. Is there any difference when adding server support for VLN vs. VLA? Are there any specific functions VLN needs?

@kanghui0204
Collaborator Author

As you are thinking of retiring gr00t_closedloop_policy and switching to the remote setup in one MR, it's better to keep in mind that there is a series of changes coming:

  • by early next week, I hope I can finish upgrading everything to GN1.6
  • by early Feb, we need everything ready to show for the GTC DLI Lab, and we are showing a couple of E2E GR00T task evals.
  • by the end of this week, I am derisking the arena + gr00t deps setup on the DLI Lab dev env. I won't have any bandwidth to enable this awesome client-server arch for that DLI Lab in late Jan, given it probably won't be ready in a week and we have the list of todos.

So I would prefer a 2-phase approach, if you can: adding client-server with policy core logic consolidation & tests & docs in one MR, and retiring our previous implementation in another MR. Does that make sense? Or am I asking too much?

OK, I agree with you. I can split it into two phases.

@kanghui0204
Collaborator Author

Re this server-client workflow (the server loading the policy config, the init config being sent from server to client, the task text being sent from client to server, and how observations are transmitted): it makes sense to me, especially setting the task description and init config. I would like to request refactoring for a better design on the ActionChunkingPolicyClass.

One more question: it looks to me like your implementation in remote_policy should be pretty generic. Is there any difference when adding server support for VLN vs. VLA? Are there any specific functions VLN needs?

Actually, the current action post-processing in navila is somewhat similar to action chunking, but instead of generating many actions, it generates a single action that is executed for multiple steps in the Isaac environment, so with my current implementation this can be adjusted on the server side.
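The adjustment described above (one VLN-style action executed for several env steps) could be mapped onto the chunk interface server-side by repeating the action. A small illustrative sketch with a hypothetical helper, not code from the PR:

```python
import numpy as np

def action_to_chunk(action: np.ndarray, repeat: int) -> np.ndarray:
    """Expand one VLN-style action into a chunk of `repeat` identical steps.

    The client can then consume it exactly like a VLA action chunk,
    keeping the client code unchanged between VLN and VLA servers.
    """
    return np.tile(action, (repeat, 1))
```

Under this scheme only the server-side policy knows whether the chunk came from a diffusion head or from repeating a single token-based action.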

@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 2 times, most recently from efac3db to 66dae62 Compare January 19, 2026 00:59
@kanghui0204
Collaborator Author

Hi @alexmillane @xyao-nv , I’ve redesigned the server/client APIs and code. In this new API, we can decouple the client and server code, and it also supports extending other action post-processing methods. Could you help review it?

Collaborator

@alexmillane alexmillane left a comment

Hi @kanghui0204. Great work. This is starting to look really good. I think at this point the design is good. Thanks for your work in improving it.

I have left a bunch of small comments. The slightly larger ones have to do with some of the duplicated logic in the gr00t server-side policy and the action chunking. As @xyao-nv points out, because of some tight deadlines with GTC, we're likely to have to maintain a local gr00t policy alongside the gr00t server. So it would be great if you could think of a way of getting rid of this duplication. What do you think?

Comment on lines +23 to +55
if [[ "$USE_SERVER_ENV" -eq 1 ]]; then
    # Script to install GR00T policy dependencies
    # This script is called from the GR00T server Dockerfile

    # CUDA environment variables for GR00T installation.
    # In the PyTorch base image, CUDA is already configured, so we only
    # set variables if CUDA_HOME exists.
    if [ -d "/usr/local/cuda" ]; then
        export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
        export PATH=${CUDA_HOME}/bin:${PATH}
        export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
    fi

    echo "CUDA environment variables:"
    echo "CUDA_HOME=${CUDA_HOME:-unset}"
    echo "PATH=$PATH"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-unset}"
else
    # Script to install GR00T policy dependencies
    # This script is called from the Dockerfile when INSTALL_GROOT is true

    # Set CUDA environment variables for GR00T installation
    export CUDA_HOME=/usr/local/cuda-12.8
    export PATH=/usr/local/cuda-12.8/bin:${PATH}
    export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:-}
    export TORCH_CUDA_ARCH_LIST=8.0+PTX

    echo "CUDA environment variables set:"
    echo "CUDA_HOME=$CUDA_HOME"
    echo "PATH=$PATH"
    echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
    echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
    echo "CUDA environment variables set:"
    echo "CUDA_HOME=$CUDA_HOME"
    echo "PATH=$PATH"
    echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
    echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
fi
Collaborator

I think we can get rid of some duplication here. Actually, I'm not sure I see any difference between the branches of the conditional.

For example:

  • CUDA_HOME=/usr/local/cuda-12.8 can be replaced with CUDA_HOME=/usr/local/cuda (which is a symlink to /usr/local/cuda-12.8 that always exists).
  • export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:-} can be replaced with export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
  • The echo statements are the same between both branches and can be moved outside the conditional statements.
  • etc.

After these modifications these two branches of the conditional look the same to me? What was the idea here?

Collaborator Author

# Set CUDA environment variables for GR00T installation
export CUDA_HOME=/usr/local/cuda-12.8
export PATH=/usr/local/cuda-12.8/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:-}
export TORCH_CUDA_ARCH_LIST=8.0+PTX

echo "CUDA environment variables set:"
echo "CUDA_HOME=$CUDA_HOME"
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"

Actually, this part of the code comes from the existing GR00T installation process in IsaacLab Arena. It’s indeed a bit different from the standard setup (it uses /usr/local/cuda-12.8), so I didn’t modify it. Because of this special setup, I wasn’t sure if it was required for a specific reason, so I kept a separate branch.

Collaborator

Yea, I don't see the difference between the branches of this if/else condition. The only difference is /usr/local/cuda-12.8 vs /usr/local/cuda. On all systems I've worked on, these two are symlinked to one another. So I assume we can just use /usr/local/cuda.

But in the interests of getting this in, I'm happy to fix this after merging.


ENV PYTHONPATH=${WORKDIR}

ENTRYPOINT ["python", "-u", "-m", "isaaclab_arena.remote_policy.remote_policy_server_runner"]
Collaborator

Cool!

# Other parameters (can also be overridden via environment variables)
HOST="${HOST:-0.0.0.0}"
PORT="${PORT:-5555}"
API_TOKEN="${API_TOKEN:-API_TOKEN_123}"
Collaborator

Not sure I understand this default. Shall we make no API token checking the default? Wdyt?

f"got {self.action_mode.value!r}."
)

proto: ChunkingActionProtocol = self.protocol # type: ignore
Collaborator

Appears unused?

Collaborator Author

removed

def from_args(args: argparse.Namespace) -> "ActionChunkingClientSidePolicy":
"""Create an ActionChunkingClientSidePolicy from CLI arguments."""
remote_config = ClientSidePolicy.build_remote_config_from_args(args)
dummy_config: Any = None # Replace with real config dataclass if needed.
Collaborator

Is this intended? I'm wondering what the purpose of this variable is, and how a user could make use of it?

Collaborator Author (@kanghui0204, Jan 20, 2026)

PolicyBase requires a config to be passed in, but since action chunking doesn’t use one, I pass in None.
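
To make the `dummy_config` discussion concrete, here is a minimal sketch of a config-less client policy constructed from CLI arguments. The simplified bodies below are illustrative stand-ins, not the PR's actual classes; only the names RemotePolicyConfig, build_remote_config_from_args, and ActionChunkingClientSidePolicy come from the PR.

```python
import argparse
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RemotePolicyConfig:
    """Illustrative stand-in for the PR's remote connection config."""
    host: str
    port: int


def build_remote_config_from_args(args: argparse.Namespace) -> RemotePolicyConfig:
    # Assumed simplification of ClientSidePolicy.build_remote_config_from_args.
    return RemotePolicyConfig(host=args.remote_host, port=args.remote_port)


class ActionChunkingClientSidePolicy:
    def __init__(self, config: Optional[Any], remote_config: RemotePolicyConfig):
        # Action chunking does not use a policy config, so the base-class
        # `config` slot is simply filled with None by the caller.
        self.config = config
        self.remote_config = remote_config


args = argparse.Namespace(remote_host="127.0.0.1", remote_port=5555)
policy = ActionChunkingClientSidePolicy(
    config=None, remote_config=build_remote_config_from_args(args)
)
assert policy.config is None
assert policy.remote_config.port == 5555
```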

Comment on lines 13 to 14
from .message_serializer import MessageSerializer
from .remote_policy_config import RemotePolicyConfig
Collaborator

Is there a reason to use relative imports here? Perhaps this is because of the only partial install we're doing of isaaclab_arena in the server case?

I would recommend against the partial imports. I suggest we instead install the full (python) package (even though it will likely not pass tests due to missing dependencies on the server).

Collaborator Author

Changed to absolute imports.

Collaborator

I'm wondering about the split of files between isaaclab_arena/policy and isaaclab_arena/remote_policy. There are a couple of files, which seem related to remote inference, that are in this folder.

Could we move them to the remote_policy folder? Wdyt?

Collaborator Author

Do you mean the two files action_chunking_client.py and client_side_policy.py? I actually see the corresponding ClientSidePolicy and ActionChunkingClientSidePolicy (together with ReplayActionPolicy) as client-side policies, so I placed them in the policy folder.

There is also another concern: both ClientSidePolicy and ActionChunkingClientSidePolicy inherit from PolicyBase. If we move them into the remote_policy folder, the remote policy code would need to import PolicyBase from the policy package, which introduces an unnecessary dependency. What do you think?

Collaborator

I see your point. There are dependencies between these two folders. What do you think about just making remote_policy a subfolder of policy?

What do you think about that?

Collaborator Author

Hi @alexmillane, remote_policy does not depend on anything inside the policy folder.
In contrast, the init in the policy folder does from .replay_action_policy import *, and replay_action_policy imports IsaacLab code.

When the server side starts, it needs to copy remote_policy into its environment.
If it instead copies the whole policy folder, it will also bring in other code, which can cause the program to crash.

For this reason, I feel that remote_policy should not be placed inside policy. What do you think?

print(f"[PolicyServer] binding on {bind_addr}")
self._socket.bind(bind_addr)
self._api_token = api_token
self._serializer = MessageSerializer
Collaborator

nit: storing a class with static methods like this appears a bit strange. I'd suggest just calling MessageSerializer.serialize(msg), because that clearly communicates that these are static methods.

Collaborator Author

Done
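
As a minimal sketch of the suggestion, here is an illustrative JSON-based stand-in for MessageSerializer (the real class also handles numpy arrays and binary blobs; the body below is an assumption for illustration):

```python
import json


class MessageSerializer:
    """Illustrative stand-in for the serializer discussed in this thread."""

    @staticmethod
    def to_bytes(msg: dict) -> bytes:
        return json.dumps(msg).encode("utf-8")

    @staticmethod
    def from_bytes(data: bytes) -> dict:
        return json.loads(data.decode("utf-8"))


# Calling through the class name makes the static nature obvious, versus
# stashing the class in an instance attribute like self._serializer:
payload = MessageSerializer.to_bytes({"endpoint": "get_action"})
assert MessageSerializer.from_bytes(payload) == {"endpoint": "get_action"}
```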

Comment on lines 164 to 153
if not self._validate_token(request):
self._socket.send(
self._serializer.to_bytes({"error": "Unauthorized: invalid api_token"})
)
continue
Collaborator

Were you inspired by this API token design from somewhere else?

What's the intention? Because we transmit the API token unencrypted we don't get any security, so I'm wondering what the intention is?

Collaborator Author

The API token is not intended to provide real security. The goal is just to avoid accidental connections or mixed-up processes on the same network (i.e. a lightweight guardrail), not to protect against an active attacker, since the token is sent in plaintext.

Collaborator

Ok let's leave it as is.
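
A minimal sketch of the guardrail described above. validate_token is an illustrative stand-in for the server's _validate_token; treating None as "checking disabled" matches the earlier suggestion of making no API token the default:

```python
from typing import Optional


def validate_token(request: dict, expected_token: Optional[str]) -> bool:
    """Accept the request when token checking is disabled, or when the
    client-supplied token matches. This is a guardrail against accidental
    cross-connections, not security: the token travels in plaintext."""
    if expected_token is None:
        return True
    return request.get("api_token") == expected_token


assert validate_token({"api_token": "API_TOKEN_123"}, "API_TOKEN_123")
assert not validate_token({"endpoint": "get_action"}, "API_TOKEN_123")
assert validate_token({}, None)  # no token configured -> no checking
```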

)
continue

endpoint = request.get("endpoint", "get_action")
Collaborator

nit: we have a default here. I would have thought that all requests must include an endpoint. I would suggest enforcing that by not having a default. (and probably assert "endpoint" in request).

Collaborator Author

Done
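
The resulting behavior can be sketched as follows (dispatch and the handler table are illustrative, not the PR's actual server loop): a missing endpoint is treated as a client bug and surfaced immediately, while an unknown endpoint gets an error reply.

```python
def dispatch(request: dict, handlers: dict) -> dict:
    """Route a request to a handler, requiring an explicit endpoint."""
    # No silent default to "get_action": every request must name its endpoint.
    assert "endpoint" in request, "request must include an 'endpoint' field"
    endpoint = request["endpoint"]
    if endpoint not in handlers:
        return {"error": f"Unknown endpoint: {endpoint!r}"}
    return handlers[endpoint](request)


handlers = {"get_action": lambda req: {"action": [0.0, 0.0]}}
assert dispatch({"endpoint": "get_action"}, handlers) == {"action": [0.0, 0.0]}
assert "error" in dispatch({"endpoint": "reset"}, handlers)
```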

Collaborator Author (@kanghui0204)

Hi @alexmillane, I’ve updated some of the code based on your comments. Could you please take another look, especially at the action protocol part? There are also a few comments I didn’t change; I added my reasoning there. Could you check whether it seems reasonable?

The next step is to merge the Gr00t remote and local code, and then add docs and tests.

Collaborator @alexmillane left a comment

I'm fine with the changes!

Comment on lines +23 to +55
if [[ "$USE_SERVER_ENV" -eq 1 ]]; then
# Script to install GR00T policy dependencies
# This script is called from the GR00T server Dockerfile

# CUDA environment variables for GR00T installation.
# In the PyTorch base image, CUDA is already configured, so we only
# set variables if CUDA_HOME exists.
if [ -d "/usr/local/cuda" ]; then
export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
fi

echo "CUDA environment variables:"
echo "CUDA_HOME=${CUDA_HOME:-unset}"
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-unset}"
else
# Script to install GR00T policy dependencies
# This script is called from the Dockerfile when INSTALL_GROOT is true

# Set CUDA environment variables for GR00T installation
export CUDA_HOME=/usr/local/cuda-12.8
export PATH=/usr/local/cuda-12.8/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:-}
export TORCH_CUDA_ARCH_LIST=8.0+PTX

echo "CUDA environment variables set:"
echo "CUDA_HOME=$CUDA_HOME"
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
echo "CUDA environment variables set:"
echo "CUDA_HOME=$CUDA_HOME"
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
fi
Collaborator

Yea, I don't see the difference between the branches of this if/else condition. The only difference is /usr/local/cuda-12.8 vs /usr/local/cuda. On all systems I've worked on, these two are symlinked to one another. So I assume we can just use /usr/local/cuda.

But in the interests of getting this in, I'm happy to fix this after merging.

Comment on lines 41 to 50
if self.requested_action_mode is ActionMode.NONE:
raise NotImplementedError(
f"{self.__class__.__name__} must set 'requested_action_mode', "
f"can't be {ActionMode.NONE}."
)
if self.protocol_cls is None:
raise NotImplementedError(
f"{self.__class__.__name__} must set 'protocol_cls' "
f"to a concrete ActionProtocol subclass."
)
Collaborator

looks good to me.
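
The quoted validation enforces a subclass contract that can be sketched as follows. ActionMode and ChunkingActionProtocol are names from the PR; the base class and subclass bodies below are simplified assumptions for illustration.

```python
from enum import Enum


class ActionMode(Enum):
    NONE = "none"
    CHUNK = "chunk"


class ChunkingActionProtocol:
    """Illustrative protocol class."""


class ServerSidePolicyBase:
    # Subclasses must override both attributes; the checks below turn a
    # forgotten override into an immediate, descriptive error at init time.
    requested_action_mode = ActionMode.NONE
    protocol_cls = None

    def __init__(self):
        if self.requested_action_mode is ActionMode.NONE:
            raise NotImplementedError(
                f"{self.__class__.__name__} must set 'requested_action_mode'."
            )
        if self.protocol_cls is None:
            raise NotImplementedError(
                f"{self.__class__.__name__} must set 'protocol_cls'."
            )


class ChunkingPolicy(ServerSidePolicyBase):
    requested_action_mode = ActionMode.CHUNK
    protocol_cls = ChunkingActionProtocol


policy = ChunkingPolicy()  # passes validation
try:
    ServerSidePolicyBase()
except NotImplementedError:
    pass
else:
    raise AssertionError("base class should fail validation")
```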

Comment on lines 43 to 59
"""Decode tagged structures created in _encode_custom."""
if not isinstance(obj, dict):
return obj

# numpy array
if "__ndarray_class__" in obj:
return np.load(io.BytesIO(obj["as_npy"]), allow_pickle=False)

# generic binary blob
if "__blob_class__" in obj:
return {
"mime": obj.get("mime"),
"data": obj.get("as_bytes"),
}

# other tagged types can be added here
return obj
Collaborator

That makes sense to me. Thank you for clarifying
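
For context, the encoder that produces the tagged structure decoded above might look like the following round-trip sketch. encode_custom/decode_custom are hypothetical counterparts to the PR's _encode_custom/_decode_custom, and only the numpy-array tagging path is shown:

```python
import io

import numpy as np


def encode_custom(obj):
    """Tag a numpy array so the decoder can reconstruct it from .npy bytes."""
    if isinstance(obj, np.ndarray):
        buf = io.BytesIO()
        np.save(buf, obj, allow_pickle=False)
        return {"__ndarray_class__": True, "as_npy": buf.getvalue()}
    return obj


def decode_custom(obj):
    """Inverse of encode_custom for the numpy tagging path."""
    if isinstance(obj, dict) and "__ndarray_class__" in obj:
        return np.load(io.BytesIO(obj["as_npy"]), allow_pickle=False)
    return obj


arr = np.arange(6, dtype=np.float32).reshape(2, 3)
round_tripped = decode_custom(encode_custom(arr))
assert np.array_equal(arr, round_tripped)
assert decode_custom({"plain": 1}) == {"plain": 1}  # non-tagged dicts pass through
```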


@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 2 times, most recently from 7e0333b to 8cba6be Compare January 23, 2026 16:17
Collaborator Author

kanghui0204 commented Jan 23, 2026

Hi @alexmillane @xyao-nv, regarding merging the Gr00t code, I have factored out an isaaclab_arena_gr00t/policy/gr00t_core.py file to collect the shared logic, and both Gr00tClosedloopPolicy and Gr00tRemoteServerSidePolicy now depend on it.
Could you please take a look and let me know if this structure works for you?

I also added documentation for the remote policy; could you review that as well?

Finally, how should I trigger this PR’s CI, and how is your CI scheduled?
For this kind of server/client execution pattern, what is the recommended way for me to integrate it into the CI?

Right now this PR doesn’t add any new tests to the CI, but it still can’t trigger the CI.
It seems to fail with the error:
“Error: The template is not valid. .github/workflows/ci.yml (Line: 103, Col: 19): Unexpected value ''”, even though I haven’t modified this file. Is this happening because of something specific to my forked repository, or could it be related to permissions?

Collaborator

xyao-nv commented Jan 27, 2026

GR00TN1.6 upgrades are merged. Sorry for holding you up; please rebase on main. #334
