[Project page] [Paper] [Hardware Guide] [Data Collection Instruction] [SLAM repo] [SLAM docker]
Cheng Chi1,2, Zhenjia Xu1,2, Chuer Pan1, Eric Cousineau3, Benjamin Burchfiel3, Siyuan Feng3,
Russ Tedrake3, Shuran Song1,2
1Stanford University, 2Columbia University, 3Toyota Research Institute
Supported Platforms: macOS (Apple Silicon recommended), Linux (Ubuntu 22.04+).
We provide a helper script to install the required system dependencies (ffmpeg, exiftool, uv) and set up the environment.
$ bash setup_deps.sh

This script will:
- Install uv (if missing).
- Install ffmpeg and exiftool (via Homebrew on macOS, or check availability on Linux).
- Create a virtual environment and sync core dependencies.
$ source .venv/bin/activate
(umi-workspace) $

To install heavy training libraries (Torch GPU, Diffusion Policy, Gym, MuJoCo) for simulation or training:
(umi-workspace) $ uv sync --extra train
(umi-workspace) $ uv pip install -e packages/diffusion_policy

Note: This might require additional system dependencies depending on your OS (e.g. libosmesa6-dev on Linux).
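To quickly confirm the system dependencies are actually on your PATH before moving on, you can run a small check like the one below (a minimal sketch; the setup script remains the source of truth for what gets installed):

```python
# Sanity check: confirm the tools installed by setup_deps.sh are on PATH.
import shutil

for tool in ["ffmpeg", "exiftool", "uv"]:
    path = shutil.which(tool)
    print(f"{tool}: {path if path else 'NOT FOUND'}")
```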
Download example data
(umi-workspace) $ wget --recursive --no-parent --no-host-directories --cut-dirs=2 --relative --reject="index.html*" https://real.stanford.edu/umi/data/example_demo_session/

Run SLAM pipeline
(umi-workspace) $ python run_slam_pipeline.py example_demo_session
...
Found following cameras:
camera_serial
C3441328164125 5
Name: count, dtype: int64
Assigned camera_idx: right=0; left=1; non_gripper=2,3...
camera_serial gripper_hw_idx example_vid
camera_idx
0 C3441328164125 0 demo_C3441328164125_2024.01.10_10.57.34.882133
99% of raw data are used.
defaultdict(<function main.<locals>.<lambda> at 0x7f471feb2310>, {})
n_dropped_demos 0

For this dataset, 99% of the data is usable (successful SLAM), with 0 demonstrations dropped. If your dataset has a low SLAM success rate, double-check that you carefully followed our data collection instructions.
Despite our significant effort on robustness improvement, ORB_SLAM3 is still the most fragile part of the UMI pipeline. If you are an expert in SLAM, please consider contributing to our fork of ORB_SLAM3, which is specifically optimized for the UMI workflow.
Generate dataset for training.
(umi-workspace) $ python scripts_slam_pipeline/07_generate_replay_buffer.py -o example_demo_session/dataset.zarr.zip example_demo_session

Requires training dependencies installed (uv sync --extra train).
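If you want to inspect the generated dataset before training, a short script like the following works. This is a minimal sketch assuming the ReplayBuffer-style Zarr layout (a top-level data group plus meta/episode_ends) and zarr 2.x; adjust key names if your version of the pipeline differs:

```python
# Inspect the packed training dataset (assumes zarr 2.x and the
# ReplayBuffer layout with "data" and "meta/episode_ends" groups).
import zarr

store = zarr.ZipStore("example_demo_session/dataset.zarr.zip", mode="r")
root = zarr.open_group(store, mode="r")

# Print every array in the data group with its shape and dtype.
for name, arr in root["data"].arrays():
    print(f"{name}: shape={arr.shape}, dtype={arr.dtype}")

episode_ends = root["meta"]["episode_ends"][:]
print(f"episodes: {len(episode_ends)}, total steps: {episode_ends[-1]}")

store.close()
```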
Single-GPU training. Tested to work on an RTX 3090 (24 GB).
(umi-workspace) $ python train.py --config-name=train_diffusion_unet_timm_umi_workspace task.dataset_path=example_demo_session/dataset.zarr.zip

Multi-GPU training.
(umi-workspace) $ accelerate launch --num_processes <ngpus> train.py --config-name=train_diffusion_unet_timm_umi_workspace task.dataset_path=example_demo_session/dataset.zarr.zip

Download the in-the-wild cup arrangement dataset (processed).
(umi-workspace) $ wget https://real.stanford.edu/umi/data/zarr_datasets/cup_in_the_wild.zarr.zip

Multi-GPU training.
(umi-workspace) $ accelerate launch --num_processes <ngpus> train.py --config-name=train_diffusion_unet_timm_umi_workspace task.dataset_path=cup_in_the_wild.zarr.zip

In this section, we demonstrate our real-world deployment/evaluation system with the cup arrangement policy. While this policy setup only requires a single arm and camera, our system supports up to two arms and an unlimited number of cameras.
- Build deployment hardware according to our Hardware Guide.
- Setup UR5 with teach pendant:
- Obtain IP address and update eval_robots_config.yaml/robots/robot_ip (see the sanity-check sketch after this setup list).
- In Installation > Payload
- Set mass to 1.81 kg
- Set center of gravity (CX, CY, CZ) to (2, -6, 37) mm.
- TCP will be set automatically by the eval script.
- On UR5e, switch control mode to remote.
If you are using a Franka, follow these instructions.
- Setup WSG50 gripper with web interface:
- Obtain IP address and update eval_robots_config.yaml/grippers/gripper_ip.
- In Settings > Command Interface
- Disable "Use text based Interface"
- Enable CRC
- In Scripting > File Manager
- Upload cmd_measure.lua
- In Settings > System
- Enable Startup Script
- Select /user/cmd_measure.lua you just uploaded.
- Setup GoPro:
- Install GoPro Labs firmware.
- Set date and time.
- Scan the following QR code for clean HDMI output

- Setup 3Dconnexion SpaceMouse:
- Install libspnav: sudo apt install libspnav-dev spacenavd
- Start spacenavd: sudo systemctl start spacenavd
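Before launching the eval script, you can sanity-check the pieces configured above with a short script like this. It is a minimal sketch that assumes the robots/grippers sections of example/eval_robots_config.yaml are lists of entries containing robot_ip/gripper_ip (as the steps above imply) and that spacenavd exposes its default socket at /var/run/spnav.sock; adjust paths and keys to match your setup:

```python
# Post-setup sanity check (sketch): print configured IPs and verify spacenavd.
import os
import yaml  # PyYAML

with open("example/eval_robots_config.yaml") as f:
    cfg = yaml.safe_load(f)

def entries(section):
    # Accept either a list of entries or a single mapping.
    return section if isinstance(section, list) else [section]

for robot in entries(cfg.get("robots", [])):
    print("robot_ip:", robot.get("robot_ip"))
for gripper in entries(cfg.get("grippers", [])):
    print("gripper_ip:", gripper.get("gripper_ip"))

# spacenavd creates this socket when it is running (default path).
sock = "/var/run/spnav.sock"
print("spacenavd:", "running" if os.path.exists(sock) else "socket not found")
```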
Our in-the-wild cup arrangement policy is trained on the distribution of "espresso cup with saucer" products on Amazon, with data collected across 30 different locations around Stanford. We created an Amazon shopping list of all cups used for training. We also published the processed Zarr dataset and a pre-trained checkpoint (finetuned CLIP ViT-L backbone).
Download pre-trained checkpoint.
(umi)$ wget https://real.stanford.edu/umi/data/pretrained_models/cup_wild_vit_l_1img.ckpt

Grant permission to the HDMI capture card.
(umi)$ sudo chmod -R 777 /dev/bus/usb
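To confirm the capture card is visible before launching the eval script, you can list the V4L2 video devices (a quick sketch; device numbering varies by machine and capture card):

```python
# List video capture devices; the HDMI capture card should appear as /dev/video*.
import glob
print(sorted(glob.glob("/dev/video*")))
```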
Launch eval script.
(umi)$ python eval_real.py --robot_config=example/eval_robots_config.yaml -i cup_wild_vit_l_1img.ckpt -o data/eval_cup_wild_example

After the script starts, use your spacemouse to control the robot and the gripper (spacemouse buttons). Press C to start the policy. Press S to stop.
If everything is set up correctly, your robot should be able to rotate the cup and place it onto the saucer, anywhere 🎉
Known issue
Please follow umi-on-legs for hardware modification and umi-arx for detailed policy deployment instructions.
This repository is released under the MIT license. See LICENSE for additional details.
- Our GoPro SLAM pipeline is adapted from Steffen Urban's fork of ORB_SLAM3.
- We used Steffen Urban's OpenImuCameraCalibrator for camera and IMU calibration.
- The UMI gripper's core mechanism is adapted from Push/Pull Gripper by John Mulac.
- UMI's soft finger is adapted from Alex Alspach's original design at TRI.


