
video2robot

End-to-end pipeline: Video (or Prompt) → Human Pose Extraction → Robot Motion Conversion

Demo

  • fullvideo.mp4: demo video
  • backflip.mp4: backflip
  • new_jeans.mp4: dance motion

Pipeline

[Prompt] → Veo → [Video] → PromptHMR → [SMPL-X] → GMR → [Robot Motion]
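The same flow can be driven from Python through the package modules. The sketch below is hypothetical: the class and method names are inferred from the project layout (video/veo_client.py, pose/extractor.py, robot/retargeter.py), not taken from the actual API.

# Hypothetical Python API sketch; actual names in video2robot may differ.
from video2robot.video.veo_client import VeoClient    # [Prompt] -> [Video]
from video2robot.pose.extractor import PoseExtractor  # [Video] -> [SMPL-X]
from video2robot.robot.retargeter import Retargeter   # [SMPL-X] -> [Robot Motion]

video_path = VeoClient().generate("The subject walks forward with four steps.")
smplx_path = PoseExtractor().extract(video_path)                   # PromptHMR
motion_path = Retargeter(robot="unitree_g1").retarget(smplx_path)  # GMR
print(motion_path)  # e.g. data/video_001/robot_motion.pkl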

Project Structure

video2robot/
├── video2robot/            # Main package
│   ├── config.py           # Configuration management
│   ├── pipeline.py         # (Optional) Python API pipeline
│   ├── cli.py              # Console entrypoint for installation
│   ├── video/              # Video generation/processing
│   │   └── veo_client.py   # Google Veo API
│   ├── pose/               # Pose extraction (PromptHMR wrapper)
│   │   └── extractor.py
│   └── robot/              # Robot conversion (GMR wrapper)
│       └── retargeter.py
│
├── scripts/                # CLI scripts
│   ├── run_pipeline.py     # Full pipeline
│   ├── generate_video.py   # Veo video generation
│   ├── extract_pose.py     # Pose extraction
│   ├── convert_to_robot.py # Robot conversion
│   └── visualize.py        # Result visualization
│
├── configs/                # Configuration files
├── data/                   # Data (gitignored)
│
└── third_party/            # External dependencies (submodules)
    ├── PromptHMR/          # Pose extraction model
    └── GMR/                # Motion retargeting

Installation

This project requires two conda environments: gmr and phmr.

# Clone repo (with submodules)
git clone --recursive https://github.com/AIM-Intelligence/video2robot.git
cd video2robot

# Or initialize submodules after cloning
git submodule update --init --recursive

1. GMR Environment (Robot Retargeting)

conda create -n gmr python=3.10 -y
conda activate gmr
pip install -e .

For details, see GMR README.

2. PromptHMR Environment (Pose Extraction)

For Blackwell GPU (sm_120) users:

conda create -n phmr python=3.11 -y
conda activate phmr
cd third_party/PromptHMR
bash scripts/install_blackwell.sh

For other GPUs (Ampere, Hopper, etc.):

conda create -n phmr python=3.10 -y
conda activate phmr
cd third_party/PromptHMR
pip install -e .

For details, see PromptHMR README.

Usage

Note: The scripts automatically switch to the appropriate conda environment (gmr or phmr) as needed. Just make sure both environments are installed; there is no need to activate them manually. (A sketch of how this switching can work follows the commands below.)

# Full pipeline (action → robot motion); BASE_PROMPT is applied automatically
python scripts/run_pipeline.py --action "Action sequence:
The subject walks forward with four steps."

# Use Sora
python scripts/run_pipeline.py --action "..." --provider sora

# Start from existing video (video.mp4 → robot motion)
python scripts/run_pipeline.py --video /path/to/video.mp4

# Resume from existing project
python scripts/run_pipeline.py --project data/video_001

# Run individual steps
python scripts/generate_video.py --action "Action sequence: The subject walks forward."
python scripts/extract_pose.py --project data/video_001
python scripts/convert_to_robot.py --project data/video_001

# Visualization (auto env switching)
python scripts/visualize.py --project data/video_001
python scripts/visualize.py --project data/video_001 --pose
python scripts/visualize.py --project data/video_001 --robot
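Under the hood, this kind of switching is typically done with conda run, which executes a command inside a named environment without activating it in the shell. A minimal illustration of the idea (the helper below is a sketch, not the project's actual implementation):

# Sketch of per-step environment switching via `conda run`; illustrative only.
import subprocess

def run_in_env(env: str, script: str, *args: str) -> None:
    # `conda run -n <env>` runs the command inside that conda environment.
    subprocess.run(["conda", "run", "-n", env, "python", script, *args], check=True)

run_in_env("phmr", "scripts/extract_pose.py", "--project", "data/video_001")
run_in_env("gmr", "scripts/convert_to_robot.py", "--project", "data/video_001")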

Web UI

# Run server (from video2robot root)
uvicorn web.app:app --host 0.0.0.0 --port 8000

# Access in browser
# http://localhost:8000

Features:

  • Automatic pipeline: prompt input → video generation → pose extraction → robot conversion
  • Video upload support
  • Veo/Sora model selection
  • 3D visualization (viser)
  • Synchronized video and 3D playback
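The web.app:app target passed to uvicorn means web/app.py must expose an ASGI application object named app. A minimal FastAPI skeleton of that shape, shown only to illustrate the layout (the real app also wires in the pipeline and viser visualization):

# Minimal FastAPI skeleton matching the `web.app:app` import path; illustrative only.
from pathlib import Path
from fastapi import FastAPI, UploadFile

app = FastAPI(title="video2robot")

@app.post("/upload")
async def upload_video(file: UploadFile) -> dict:
    # Save the uploaded video under data/; the real app would then run the pipeline.
    dest = Path("data") / (file.filename or "upload.mp4")
    dest.parent.mkdir(exist_ok=True)
    dest.write_bytes(await file.read())
    return {"saved": str(dest)}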

Environment Setup

# Create .env file
cp .env.example .env

# Set API key
echo "GOOGLE_API_KEY=your-api-key" >> .env

Supported Robots

Robot        ID           DOF
Unitree G1   unitree_g1   29
Unitree H1   unitree_h1   19
Booster T1   booster_t1   23

See the GMR README for the full list of supported robots.

Output Format

# robot_motion.pkl
{
    "fps": 30.0,
    "robot_type": "unitree_g1",
    "num_frames": 240,
    "root_pos": np.ndarray,    # (N, 3)
    "root_rot": np.ndarray,    # (N, 4) quaternion xyzw
    "dof_pos": np.ndarray,     # (N, DOF)
}
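robot_motion.pkl is a standard Python pickle, so it can be inspected directly (the path below is an example):

# Load and inspect a retargeted motion file.
import pickle

with open("data/video_001/robot_motion.pkl", "rb") as f:
    motion = pickle.load(f)

print(motion["robot_type"], motion["fps"], motion["num_frames"])
print(motion["root_pos"].shape)  # (N, 3) root position per frame
print(motion["root_rot"].shape)  # (N, 4) root orientation, xyzw quaternion
print(motion["dof_pos"].shape)   # (N, DOF) joint positions per frame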

TODO

  • lastFrame (start/end frame interpolation), Veo 3.1 only

    • Start image + end image → generate a video that smoothly connects the two
    • Useful for "Pose A → Pose B" robot motion videos
  • referenceImages (reference images), Veo 3.1 only

    • Up to 3 reference images to keep the character/style consistent
    • Generate videos of a specific character performing the actions

Acknowledgements

This project builds upon the following excellent open source projects:

  • PromptHMR: 3D human mesh recovery from video
  • GMR: General Motion Retargeting framework

License

This project depends on third-party libraries with their own licenses:

  • GMR: MIT License
  • PromptHMR: Non-Commercial Scientific Research Use Only

Please review both licenses before use.

The core video2robot code is MIT-licensed, but using this repository end-to-end (including PromptHMR) inherits PromptHMR's Non-Commercial Scientific Research Use Only restriction. Commercial use requires permission from the PromptHMR authors.
