3 changes: 2 additions & 1 deletion .gitignore
@@ -1 +1,2 @@
__pycache__
__pycache__
*.ply
70 changes: 70 additions & 0 deletions AGENTS.md
@@ -0,0 +1,70 @@
# AGENTS

Quick orientation and cluster-specific setup for this `sam-3d-objects` fork.

## Repo overview
- Model: SAM 3D Objects (single image -> 3D geometry/texture/layout).
- Primary docs: `README.md`, `doc/setup.md`, `SAM3D_SETUP_NOTES.md`.
- Cluster helpers live in `repro/` (scripts for reproducible runs on this cluster).

## Cluster requirements
- Linux platform `linux-64`.
- NVIDIA GPU with >= 32 GB VRAM (A6000 preferred).
- Build/install on a GPU node to avoid PyTorch3D CPU-only builds.

## Recommended Slurm allocation
```
salloc -p a6000 --gres=gpu:1 --cpus-per-task=8 --mem=32G --time=02:00:00
srun --pty bash
```

## Environment setup (mamba)
```
cd /path/to/sam-3d-objects

mamba env create -f environments/default.yml
mamba activate sam3d-objects

export PIP_EXTRA_INDEX_URL="https://pypi.ngc.nvidia.com https://download.pytorch.org/whl/cu121"
pip install -e '.[dev]'
pip install -e '.[p3d]'

export PIP_FIND_LINKS="https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu121.html"
pip install -e '.[inference]'

./patching/hydra
```
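
To confirm the PyTorch3D build actually has CUDA kernels (the CPU-only failure mode noted above), a quick check is to run a small GPU op; this is a sketch that assumes the `sam3d-objects` env is active on a GPU node and uses `knn_points` only as a convenient probe.

```
# Sketch: verify PyTorch3D was built with GPU support.
import torch
from pytorch3d.ops import knn_points

pts = torch.randn(1, 128, 3, device="cuda")
# Raises "Not compiled with GPU support" if the build is CPU-only.
knn = knn_points(pts, pts, K=4)
print("pytorch3d CUDA op OK:", knn.dists.shape)
```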

## Hugging Face checkpoints
Access is required for `facebook/sam-3d-objects`.
```
pip install 'huggingface-hub[cli]<1.0'
hf auth login

TAG=hf
hf download \
--repo-type model \
--local-dir checkpoints/${TAG}-download \
--max-workers 1 \
facebook/sam-3d-objects
mv checkpoints/${TAG}-download/checkpoints checkpoints/${TAG}
rm -rf checkpoints/${TAG}-download
```
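
If the CLI route is inconvenient, the same checkpoints can be fetched from Python with `huggingface_hub`; a minimal sketch mirroring the `checkpoints/hf` layout above:

```
# Sketch: download the checkpoints via the Python API instead of the CLI.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="facebook/sam-3d-objects",
    local_dir="checkpoints/hf-download",
    max_workers=1,
)
# As in the shell flow above, move checkpoints/hf-download/checkpoints
# to checkpoints/hf afterwards.
```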

## Sanity checks
```
nvidia-smi
mamba info | rg "platform|platforms"

python - <<'PY'
import torch
print("cuda:", torch.cuda.is_available())
if torch.cuda.is_available():
print(torch.cuda.get_device_name(0))
PY
```

## Quick run
```
python demo.py
```
34 changes: 34 additions & 0 deletions README.md
@@ -29,6 +29,40 @@ SAM 3D Objects is one part of SAM 3D, a pair of models for object and human mesh

Follow the [setup](doc/setup.md) steps before running the following.

## Slurm quickstart (cluster navigation)

This project is often run on a Slurm cluster. Here are the core concepts and the most common commands.

**Concepts**
- Controller: the login/head node where you submit work and run Slurm commands (`sinfo`, `squeue`).
- Node: a compute machine (e.g. `gpu01`); jobs run here.
- Partition: a queue of nodes with shared policies (e.g. `defq`, `a6000`).
- Job/step: a scheduled unit of work (`sbatch` for batch jobs, `srun` for steps).
- GRES/TRES: generic and trackable resources, e.g. GPUs (`gres/gpu=1`), plus CPU/memory accounting.

**Find resources**
- Nodes and state: `sinfo -N -l`
- Node details (GPUs/CPU/RAM): `scontrol show node gpu01`
- Your jobs: `squeue -u $USER`
- Watch your queue: `watch -n 2 "squeue -u $USER -o '%.18i %.9P %.20j %.8T %.10M %.6D %R'"`

**Run work**
- Interactive shell on a node: `srun -N 1 -n 1 -c 4 --mem=16G --pty bash`
- Run a command on a specific node: `srun -w gpu01 hostname`
- Request GPUs (required for `nvidia-smi` to see devices):
`srun -w gpu01 --gres=gpu:1 nvidia-smi -L`
- Batch job (script):
`sbatch path/to/job.sh`

**Control jobs**
- Cancel job: `scancel <jobid>`
- Inspect job: `scontrol show job <jobid>`

**Resource flags (common)**
- CPUs: `-c 8` or `--cpus-per-task=8`
- Memory: `--mem=64G` or `--mem-per-cpu=4G`
- GPUs: `--gres=gpu:1` (or `--gpus-per-task=1` if configured)
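
Once inside an allocation, it is worth confirming from Python that the requested GPUs are actually visible; a minimal sketch (assumes PyTorch is installed in the active environment):

```
# Sketch: confirm the Slurm allocation from inside a job.
import os
import torch

print("job id:", os.environ.get("SLURM_JOB_ID"))  # set by Slurm inside a job
print("visible:", os.environ.get("CUDA_VISIBLE_DEVICES"))
print("torch sees:", torch.cuda.device_count(), "GPU(s)")
```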

## Single or Multi-Object 3D Generation

SAM 3D Objects can convert masked objects in an image into 3D models with pose, shape, texture, and layout. SAM 3D is designed to be robust in challenging natural images, handling small objects, occlusions, unusual poses, and other difficult situations encountered in uncurated natural scenes like this kids' room:
93 changes: 93 additions & 0 deletions SAM3D_SETUP_NOTES.md
@@ -0,0 +1,93 @@
# SAM 3D Objects - Cluster Setup Notes

This document captures the findings and a complete setup flow for `sam-3d-objects` on this Slurm cluster using mamba.

## Repository Location

- Repo path: `$REPO_ROOT`

## Prerequisites (from `doc/setup.md`)

- Linux 64-bit (mamba platform `linux-64`).
- NVIDIA GPU with at least 32 GB VRAM.
- Build on a GPU node to avoid PyTorch3D "Not compiled with GPU support" errors.

## Slurm Findings

Partitions observed:

- `defq` (nodes `gpu01-08`)
- `a6000` (node `gpu09`)

GPU resources for `a6000`:

- `gpu09` has `gres=gpu:4` and is in partition `a6000`.
- Use this partition to satisfy the >= 32 GB VRAM requirement (A6000 is typically 48 GB).

## Recommended Interactive Allocation

```
salloc -p a6000 --gres=gpu:1 --cpus-per-task=8 --mem=32G --time=02:00:00
srun --pty bash
```

## Environment Setup (mamba)

From `doc/setup.md`:

```
cd $REPO_ROOT

mamba env create -f environments/default.yml
mamba activate sam3d-objects

export PIP_EXTRA_INDEX_URL="https://pypi.ngc.nvidia.com https://download.pytorch.org/whl/cu121"
pip install -e '.[dev]'
pip install -e '.[p3d]'

export PIP_FIND_LINKS="https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu121.html"
pip install -e '.[inference]'

./patching/hydra
```
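
The `[inference]` extra resolves kaolin against the wheel index exported above; a quick import check that the CUDA-matched build was picked up (a sketch, run inside the activated env):

```
# Sketch: confirm kaolin installed against the expected torch/CUDA build.
import kaolin
import torch

print("kaolin:", kaolin.__version__)
print("torch:", torch.__version__, "cuda:", torch.version.cuda)  # wheel index targets 2.5.1 / cu121
```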

## GPU and Platform Verification

```
nvidia-smi
mamba info | rg "platform|platforms"
```

Expected:

- GPU present and visible in `nvidia-smi`.
- `platform : linux-64` in `mamba info`.

## Hugging Face Checkpoints

Access required for `facebook/sam-3d-objects`.

```
pip install 'huggingface-hub[cli]<1.0'
hf auth login

TAG=hf
hf download \
--repo-type model \
--local-dir checkpoints/${TAG}-download \
--max-workers 1 \
facebook/sam-3d-objects
mv checkpoints/${TAG}-download/checkpoints checkpoints/${TAG}
rm -rf checkpoints/${TAG}-download
```

## Sanity Check (CUDA)

```
python - <<'PY'
import torch
print("cuda:", torch.cuda.is_available())
if torch.cuda.is_available():
print(torch.cuda.get_device_name(0))
PY
```
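
To check the >= 32 GB VRAM prerequisite directly, the probe can be extended (a sketch):

```
# Sketch: verify the >= 32 GB VRAM requirement on the allocated GPU.
import torch

props = torch.cuda.get_device_properties(0)
vram_gb = props.total_memory / 1024**3
print(f"{props.name}: {vram_gb:.1f} GB")
assert vram_gb >= 32, "GPU has less than the required 32 GB VRAM"
```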
1 change: 1 addition & 0 deletions demo.py
@@ -1,3 +1,4 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
import sys

# import inference code
3 changes: 3 additions & 0 deletions download_model.py
@@ -0,0 +1,3 @@
from huggingface_hub import hf_hub_download

path = hf_hub_download("facebook/sam-3d-objects", "pipeline.yaml")  # resolves into the local HF cache; also verifies gated access
35 changes: 33 additions & 2 deletions notebook/demo_3db_mesh_alignment.ipynb
@@ -44,6 +44,37 @@
"os.makedirs(output_dir, exist_ok=True)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0. Inference and Save SAM 3D Objects"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Please inference SAM 3D Objects Repo with https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb\n",
"# The above notebook will apply the generated layout to the generated objects, and same them as ply. \n",
"# Then, this cell will load the posed SAM 3D Objects and transform them into the OpenGL coordinate system, which is the same system as SAM 3D Body. \n",
"import numpy as np\n",
"import open3d as o3d\n",
"\n",
"# Load PLY file\n",
"input_path = 'gaussians/human_object_posed.ply'\n",
"output_path = 'meshes/human_object/3Dfy_results/0.ply'\n",
"mesh = o3d.io.read_point_cloud(input_path)\n",
"points = np.asarray(mesh.points)\n",
"\n",
"# Transform to OpenGL coordinate system. \n",
"points[:, [0, 2]] *= -1 # flip x and z\n",
"mesh.points = o3d.utility.Vector3dVector(points)\n",
"o3d.io.write_point_cloud(output_path, mesh)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -115,7 +146,7 @@
"from mesh_alignment import visualize_meshes_interactive\n",
"\n",
"aligned_mesh_path = f\"{PATH}/meshes/human_object/aligned_meshes/human_aligned.ply\"\n",
"dfy_mesh_path = f\"{PATH}/meshes/human_object/3Dfy_results/0.glb\"\n",
"dfy_mesh_path = f\"{PATH}/meshes/human_object/3Dfy_results/0.ply\"\n",
"\n",
"demo, combined_glb_path = visualize_meshes_interactive(\n",
" aligned_mesh_path=aligned_mesh_path,\n",
@@ -127,7 +158,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "sam3d_objects-3dfy",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
6 changes: 4 additions & 2 deletions notebook/demo_multi_object.ipynb
@@ -95,8 +95,10 @@
"outputs": [],
"source": [
"scene_gs = make_scene(*outputs)\n",
"scene_gs = ready_gaussian_for_video_rendering(scene_gs)\n",
"# export posed gaussian splatting (as point cloud)\n",
"scene_gs.save_ply(f\"{PATH}/gaussians/{IMAGE_NAME}_posed.ply\")\n",
"\n",
"scene_gs = ready_gaussian_for_video_rendering(scene_gs)\n",
"# export gaussian splatting (as point cloud)\n",
"scene_gs.save_ply(f\"{PATH}/gaussians/multi/{IMAGE_NAME}.ply\")\n",
"\n",
@@ -140,7 +142,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "sam3d-objects",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
Binary file modified notebook/images/human_object/image.png
41 changes: 35 additions & 6 deletions notebook/mesh_alignment.py
@@ -298,7 +298,7 @@ def visualize_meshes_interactive(aligned_mesh_path, dfy_mesh_path, output_dir=No

Args:
aligned_mesh_path: Path to aligned mesh PLY file
dfy_mesh_path: Path to 3Dfy GLB file
dfy_mesh_path: Path to 3Dfy Object file
output_dir: Directory to save combined GLB file (defaults to same dir as aligned_mesh_path)
share: Whether to create a public shareable link (default: True)
height: Height of the 3D viewer in pixels (default: 600)
@@ -307,6 +307,7 @@
tuple: (demo, combined_glb_path) - Gradio demo object and path to combined GLB file
"""
import gradio as gr
import numpy as np

print("Loading meshes for interactive visualization...")

@@ -315,17 +316,17 @@
aligned_mesh = trimesh.load(aligned_mesh_path)
print(f"Loaded aligned mesh: {len(aligned_mesh.vertices)} vertices")

# Load 3Dfy mesh (GLB - handle scene structure)
# Load 3Dfy mesh (PLY)
dfy_scene = trimesh.load(dfy_mesh_path)

if hasattr(dfy_scene, 'dump'): # It's a scene
if hasattr(dfy_scene, 'dump'):
dfy_meshes = [geom for geom in dfy_scene.geometry.values() if hasattr(geom, 'vertices')]
if len(dfy_meshes) == 1:
dfy_mesh = dfy_meshes[0]
elif len(dfy_meshes) > 1:
dfy_mesh = trimesh.util.concatenate(dfy_meshes)
else:
raise ValueError("No valid meshes in GLB file")
raise ValueError("No valid meshes in PLY file")
else:
dfy_mesh = dfy_scene

@@ -348,14 +349,42 @@
output_dir = os.path.dirname(aligned_mesh_path)
os.makedirs(output_dir, exist_ok=True)

# Save combined PLY by concatenating both meshes
combined_ply_path = os.path.join(output_dir, 'combined_scene.ply')

# Combine the geometries for PLY output
if isinstance(dfy_mesh, trimesh.points.PointCloud):
# Convert point cloud to vertices-only mesh for combination
dfy_vertices = dfy_mesh.vertices
human_vertices = aligned_mesh.vertices

# Combine vertices from both
all_vertices = np.vstack([human_vertices, dfy_vertices])

# Create colors: red for human, blue for object
human_colors = np.array([[255, 0, 0, 200]] * len(human_vertices))
object_colors = np.array([[0, 0, 255, 200]] * len(dfy_vertices))
all_colors = np.vstack([human_colors, object_colors])

# Create combined point cloud
combined_cloud = trimesh.points.PointCloud(vertices=all_vertices, colors=all_colors)
combined_cloud.export(combined_ply_path)
else:
# Both are meshes, use scene export
scene.export(combined_ply_path)

print(f"Exported combined scene to: {combined_ply_path}")

# Also save GLB for Gradio viewer (NOTE: GLB may not show point cloud object properly)
combined_glb_path = os.path.join(output_dir, 'combined_scene.glb')
scene.export(combined_glb_path)
print(f"Exported combined scene to: {combined_glb_path}")
print(f"Exported GLB for Gradio viewer to: {combined_glb_path}")
print("NOTE: Use PLY for complete data, GLB is for Gradio visualization only")

# Create interactive Gradio viewer
with gr.Blocks() as demo:
gr.Markdown("# 3D Mesh Alignment Visualization")
gr.Markdown("**Red**: SAM 3D Body Aligned Human | **Blue**: 3Dfy Object")
gr.Markdown("**Red**: SAM 3D Body Aligned Human | **Blue**: SAM 3D Object")
gr.Model3D(
value=combined_glb_path,
label="Combined 3D Scene (Interactive)",