54 changes: 54 additions & 0 deletions README.md
@@ -40,6 +40,7 @@ _Our method introduces a novel differentiable mesh extraction framework that ope

- ⬛ Implement a simple training viewer using the <a href="https://github.com/graphdeco-inria/graphdecoviewer">GraphDeco viewer</a>.
- ⬛ Add the mesh-based rendering evaluation scripts in `./milo/eval/mesh_nvs`.
- ✅ Add DTU training and evaluation scripts.
- ✅ Add low-res and very-low-res training for lightweight output meshes (under 50MB and under 20MB).
- ✅ Add T&T evaluation scripts in `./milo/eval/tnt/`.
- ✅ Add Blender add-on (for mesh-based editing and animation) to the repo.
@@ -244,6 +245,9 @@ with `--sampling_factor 0.1`, for instance.

Please refer to the <a href="https://depth-anything-v2.github.io/">DepthAnythingV2</a> repo to download the `vitl` checkpoint required for Depth-Order regularization. Then, move the checkpoint file to `./submodules/Depth-Anything-V2/checkpoints/`.

You can also use the `train_regular_densification.py` script instead of `train.py` to replace the fast densification from Mini-Splatting2 with a more traditional densification strategy for Gaussians, as used in [Gaussian Opacity Fields](https://github.com/autonomousvision/gaussian-opacity-fields/tree/main) and [RaDe-GS](https://baowenz.github.io/radegs/).
By default, this script uses its own config file, set via `--mesh_config default_regular_densification`.

### Example Commands

Basic training for indoor scenes with logging:
@@ -271,6 +275,11 @@ Training with depth-order regularization:
python train.py -s <PATH TO COLMAP DATASET> -m <OUTPUT_DIR> --imp_metric indoor --rasterizer radegs --depth_order --depth_order_config strong --log_interval 200 --data_device cpu
```

Training with a traditional, slower densification strategy for Gaussians:
```bash
python train_regular_densification.py -s <PATH TO COLMAP DATASET> -m <OUTPUT_DIR> --imp_metric indoor --rasterizer radegs --log_interval 200 --data_device cpu
```

</details>

## 3. Extracting a Mesh after Optimization
@@ -329,6 +338,19 @@ python mesh_extract_integration.py \

The mesh will be saved at either `<MODEL_DIR>/mesh_integration_sdf.ply` or `<MODEL_DIR>/mesh_depth_fusion_sdf.ply` depending on the SDF computation method.
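The method-to-filename mapping above can be resolved programmatically; a minimal sketch (the helper name is ours, only the two filenames come from the text above):

```python
from pathlib import Path

def integration_mesh_path(model_dir: str, sdf_method: str) -> Path:
    """Map the SDF computation method to the mesh file written after extraction."""
    names = {
        "integration": "mesh_integration_sdf.ply",
        "depth_fusion": "mesh_depth_fusion_sdf.ply",
    }
    return Path(model_dir) / names[sdf_method]
```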

### 3.3. Use regular, non-scalable TSDF

We also provide a script to extract a mesh using traditional TSDF fusion on a regular voxel grid. This script is heavily inspired by the awesome work [2D Gaussian Splatting](https://github.com/hbb1/2d-gaussian-splatting). Note that this extraction process does not scale to unbounded real scenes with background geometry.
```bash
python mesh_extract_regular_tsdf.py \
-s <PATH TO COLMAP DATASET> \
-m <MODEL DIR> \
--rasterizer radegs \
--mesh_res 1024
```

The mesh will be saved at `<MODEL_DIR>/mesh_regular_tsdf_res<MESH RES>.ply`. A cleaned version of the mesh will be saved at `<MODEL_DIR>/mesh_regular_tsdf_res<MESH RES>_post.ply`, following 2DGS's postprocessing.
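The naming convention above can be sketched as a small helper (the function name is ours; the filename patterns come from the text):

```python
from pathlib import Path

def regular_tsdf_mesh_paths(model_dir: str, mesh_res: int):
    """Return the (raw, cleaned) mesh paths written by mesh_extract_regular_tsdf.py."""
    raw = Path(model_dir) / f"mesh_regular_tsdf_res{mesh_res}.ply"
    post = Path(model_dir) / f"mesh_regular_tsdf_res{mesh_res}_post.ply"
    return raw, post
```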

</details>

## 4. Using our differentiable Gaussians-to-Mesh pipeline in your own 3DGS project
@@ -562,6 +584,8 @@ If you get artifacts in the rendering, you can try to play with the various foll
<summary>Click here to see content.</summary>
<br>

### Tanks and Temples

For evaluation, please start by downloading [our COLMAP runs for the Tanks and Temples dataset](https://drive.google.com/drive/folders/1Bf7DM2DFtQe4J63bEFLceEycNf4qTcqm?usp=sharing), and make sure to move all COLMAP scene directories (Barn, Caterpillar, _etc._) inside the same directory.

Then, please download the ground-truth point clouds, camera poses, alignments, and cropfiles from the [Tanks and Temples dataset](https://www.tanksandtemples.org/download/). The ground truth dataset should be organized as:
@@ -637,6 +661,36 @@ python render.py \
python metrics.py -m <path to trained model> # Compute error metrics on renderings
```

### DTU

MILo is designed for maximum scalability to allow for the reconstruction of full scenes, including background elements. We optimized our method and hyperparameters to strike a balance between performance and scalability.

However, we also evaluate MILo on small object-centric scenes from the DTU dataset, to verify that our mesh-in-the-loop regularization does not hurt performance in highly controlled scenarios.

For these smaller scenes, the aggressive densification strategy from Mini-Splatting2 is unnecessary. Instead, we use the traditional progressive densification strategy proposed in [GOF](https://github.com/autonomousvision/gaussian-opacity-fields/tree/main) and [RaDe-GS](https://baowenz.github.io/radegs/), which is better suited for highly controlled scenarios.

Similarly, since DTU scans focus on small objects of interest without background reconstruction, we employ a regular grid for mesh extraction after training (similar to GOF and RaDe-GS) rather than our scalable extraction method.

We use the preprocessed DTU dataset from [2D GS](https://github.com/hbb1/2d-gaussian-splatting) for training. Please refer to the corresponding repo for downloading instructions.
Evaluation scripts are adapted from [GOF](https://github.com/autonomousvision/gaussian-opacity-fields/tree/main) and [RaDe-GS](https://baowenz.github.io/radegs/).

Please run the following commands to evaluate MILo on a single DTU scan:
```bash
# Training with regular densification
python train_regular_densification.py -s <PATH TO DTU SCAN> -m <OUTPUT DIRECTORY> -r 2 --rasterizer radegs --imp_metric indoor --mesh_config default_dtu --decoupled_appearance --log_interval 200

# Mesh extraction
python ./eval/dtu/mesh_extract_dtu.py -s <PATH TO DTU SCAN> -m <OUTPUT DIRECTORY> -r 2 --rasterizer radegs

# Evaluation
python ./eval/dtu/evaluate_dtu_mesh.py -s <PATH TO DTU SCAN> -m <OUTPUT DIRECTORY> -r 2 --DTU <PATH TO GT DTU DATA> --scan_id <SCAN ID>
```
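The evaluation script writes its metrics to a `results.json` file in the output directory (see `eval.py` below for the exact keys); a minimal sketch for reading them back (the helper name is ours):

```python
import json
from pathlib import Path

def read_dtu_metrics(vis_out_dir: str) -> dict:
    """Load the accuracy (mean_d2s), completeness (mean_s2d) and overall
    Chamfer metrics written by the DTU evaluation script."""
    with open(Path(vis_out_dir) / "results.json") as fp:
        return json.load(fp)
```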

You can also run the following script to evaluate on the full DTU dataset:
```bash
python scripts/evaluate_dtu.py --data_dir <PATH TO DIRECTORY CONTAINING DTU COLMAP SCENES> --gt_dir <PATH TO GT DTU DATA> --rasterizer radegs --log_interval 200
```
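To aggregate per-scan results into a dataset average, here is a sketch assuming each scan directory contains a `results.json` as produced by `eval.py` (the directory layout is an assumption):

```python
import json
from pathlib import Path

def average_overall_chamfer(results_dir: str) -> float:
    """Average the 'overall' Chamfer distance across all per-scan results.json files."""
    scores = [
        json.loads(p.read_text())["overall"]
        for p in sorted(Path(results_dir).glob("*/results.json"))
    ]
    return sum(scores) / len(scores)
```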

</details>

## 8. Acknowledgements
62 changes: 62 additions & 0 deletions milo/configs/mesh/default_dtu.yaml
@@ -0,0 +1,62 @@
# Regularization schedule
start_iter: 20_001
mesh_update_interval: 1
stop_iter: 30_000

# Depth loss weight
use_depth_loss: true
depth_weight: 0.01 # 0.05
depth_ratio: 1.0 # 0.6
mesh_depth_loss_type: "log" # "log"

# Normal loss weight
use_normal_loss: true # true
normal_weight: 0.01 # 0.05
use_depth_normal: true # true

# Delaunay computation
delaunay_reset_interval: 500 # 500
n_max_points_in_delaunay: 5_400_000 # -1 for no limit
delaunay_sampling_method: "surface" # "random", "surface" or "surface+opacity"
filter_large_edges: true
collapse_large_edges: false

# Rasterization
use_scalable_renderer: false

# SDF computation
sdf_reset_interval: 500 # 500
sdf_default_isosurface: 0.5 # 0.5
transform_sdf_to_linear_space: false # must remain false
min_occupancy_value: 0.0000000001 # 0.0000000001

# > For Integrate
use_ema: true # true
alpha_ema: 0.4 # 0.4

# > For TSDF
trunc_margin: 0.002 # 0.002

# > For learnable
occupancy_mode: "occupancy_shift" # "occupancy_shift" or "density_shift"
# > Gaussian centers regularization
enforce_occupied_centers: true # true
occupied_centers_weight: 0.001 # 0.005
# > Occupancy labels loss
use_occupancy_labels_loss: true
reset_occupancy_labels_every: 200 # 200
occupancy_labels_loss_weight: 0.001 # 0.005
# > SDF reset
fix_set_of_learnable_sdfs: true
learnable_sdf_reset_mode: "ema" # "none" or "ema"
learnable_sdf_reset_stop_iter: 25_001
learnable_sdf_reset_alpha_ema: 0.4 # 0.4

method_to_reset_sdf: "depth_fusion" # "integration" or "depth_fusion"
n_binary_steps_to_reset_sdf: 0 # Use 0 to disable binary search
sdf_reset_linearization_n_steps: 20
sdf_reset_linearization_enforce_std: 0.5 # 0.5
depth_fusion_reset_tolerance: 0.1

# Foreground Culling
radius_culling: -1.0 # -1.0 for no culling
167 changes: 167 additions & 0 deletions milo/eval/dtu/eval.py
@@ -0,0 +1,167 @@
# adapted from https://github.com/jzhangbs/DTUeval-python
import numpy as np
import open3d as o3d
import sklearn.neighbors as skln
from tqdm import tqdm
from scipy.io import loadmat
import multiprocessing as mp
import argparse

def sample_single_tri(input_):
    n1, n2, v1, v2, tri_vert = input_
    c = np.mgrid[:n1 + 1, :n2 + 1]
    c += 0.5
    c[0] /= max(n1, 1e-7)
    c[1] /= max(n2, 1e-7)
    c = np.transpose(c, (1, 2, 0))
    k = c[c.sum(axis=-1) < 1]  # m2
    q = v1 * k[:, :1] + v2 * k[:, 1:] + tri_vert
    return q

def write_vis_pcd(file, points, colors):
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors)
    o3d.io.write_point_cloud(file, pcd)

if __name__ == '__main__':
    mp.freeze_support()

    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, default='data_in.ply')
    parser.add_argument('--scan', type=int, default=1)
    parser.add_argument('--mode', type=str, default='mesh', choices=['mesh', 'pcd'])
    parser.add_argument('--dataset_dir', type=str, default='.')
    parser.add_argument('--vis_out_dir', type=str, default='.')
    parser.add_argument('--downsample_density', type=float, default=0.2)
    parser.add_argument('--patch_size', type=float, default=60)
    parser.add_argument('--max_dist', type=float, default=20)
    parser.add_argument('--visualize_threshold', type=float, default=10)
    args = parser.parse_args()

    thresh = args.downsample_density
    if args.mode == 'mesh':
        pbar = tqdm(total=9)
        pbar.set_description('read data mesh')
        data_mesh = o3d.io.read_triangle_mesh(args.data)

        vertices = np.asarray(data_mesh.vertices)
        triangles = np.asarray(data_mesh.triangles)
        tri_vert = vertices[triangles]

        pbar.update(1)
        pbar.set_description('sample pcd from mesh')
        v1 = tri_vert[:, 1] - tri_vert[:, 0]
        v2 = tri_vert[:, 2] - tri_vert[:, 0]
        l1 = np.linalg.norm(v1, axis=-1, keepdims=True)
        l2 = np.linalg.norm(v2, axis=-1, keepdims=True)
        area2 = np.linalg.norm(np.cross(v1, v2), axis=-1, keepdims=True)
        non_zero_area = (area2 > 0)[:, 0]
        l1, l2, area2, v1, v2, tri_vert = [
            arr[non_zero_area] for arr in [l1, l2, area2, v1, v2, tri_vert]
        ]
        thr = thresh * np.sqrt(l1 * l2 / area2)
        n1 = np.floor(l1 / thr)
        n2 = np.floor(l2 / thr)

        with mp.Pool() as mp_pool:
            new_pts = mp_pool.map(
                sample_single_tri,
                ((n1[i, 0], n2[i, 0], v1[i:i + 1], v2[i:i + 1], tri_vert[i:i + 1, 0]) for i in range(len(n1))),
                chunksize=1024,
            )

        new_pts = np.concatenate(new_pts, axis=0)
        data_pcd = np.concatenate([vertices, new_pts], axis=0)

    elif args.mode == 'pcd':
        pbar = tqdm(total=8)
        pbar.set_description('read data pcd')
        data_pcd_o3d = o3d.io.read_point_cloud(args.data)
        data_pcd = np.asarray(data_pcd_o3d.points)

    pbar.update(1)
    pbar.set_description('random shuffle pcd index')
    shuffle_rng = np.random.default_rng()
    shuffle_rng.shuffle(data_pcd, axis=0)

    pbar.update(1)
    pbar.set_description('downsample pcd')
    nn_engine = skln.NearestNeighbors(n_neighbors=1, radius=thresh, algorithm='kd_tree', n_jobs=-1)
    nn_engine.fit(data_pcd)
    rnn_idxs = nn_engine.radius_neighbors(data_pcd, radius=thresh, return_distance=False)
    mask = np.ones(data_pcd.shape[0], dtype=np.bool_)
    for curr, idxs in enumerate(rnn_idxs):
        if mask[curr]:
            mask[idxs] = 0
            mask[curr] = 1
    data_down = data_pcd[mask]

    pbar.update(1)
    pbar.set_description('masking data pcd')
    obs_mask_file = loadmat(f'{args.dataset_dir}/ObsMask/ObsMask{args.scan}_10.mat')
    ObsMask, BB, Res = [obs_mask_file[attr] for attr in ['ObsMask', 'BB', 'Res']]
    BB = BB.astype(np.float32)

    patch = args.patch_size
    inbound = ((data_down >= BB[:1] - patch) & (data_down < BB[1:] + patch * 2)).sum(axis=-1) == 3
    data_in = data_down[inbound]

    data_grid = np.around((data_in - BB[:1]) / Res).astype(np.int32)
    grid_inbound = ((data_grid >= 0) & (data_grid < np.expand_dims(ObsMask.shape, 0))).sum(axis=-1) == 3
    data_grid_in = data_grid[grid_inbound]
    in_obs = ObsMask[data_grid_in[:, 0], data_grid_in[:, 1], data_grid_in[:, 2]].astype(np.bool_)
    data_in_obs = data_in[grid_inbound][in_obs]

    pbar.update(1)
    pbar.set_description('read STL pcd')
    stl_pcd = o3d.io.read_point_cloud(f'{args.dataset_dir}/Points/stl/stl{args.scan:03}_total.ply')
    stl = np.asarray(stl_pcd.points)

    pbar.update(1)
    pbar.set_description('compute data2stl')
    nn_engine.fit(stl)
    dist_d2s, idx_d2s = nn_engine.kneighbors(data_in_obs, n_neighbors=1, return_distance=True)
    max_dist = args.max_dist
    mean_d2s = dist_d2s[dist_d2s < max_dist].mean()

    pbar.update(1)
    pbar.set_description('compute stl2data')
    ground_plane = loadmat(f'{args.dataset_dir}/ObsMask/Plane{args.scan}.mat')['P']

    stl_hom = np.concatenate([stl, np.ones_like(stl[:, :1])], -1)
    above = (ground_plane.reshape((1, 4)) * stl_hom).sum(-1) > 0
    stl_above = stl[above]

    nn_engine.fit(data_in)
    dist_s2d, idx_s2d = nn_engine.kneighbors(stl_above, n_neighbors=1, return_distance=True)
    mean_s2d = dist_s2d[dist_s2d < max_dist].mean()

    pbar.update(1)
    pbar.set_description('visualize error')
    vis_dist = args.visualize_threshold
    R = np.array([[1, 0, 0]], dtype=np.float64)
    G = np.array([[0, 1, 0]], dtype=np.float64)
    B = np.array([[0, 0, 1]], dtype=np.float64)
    W = np.array([[1, 1, 1]], dtype=np.float64)
    data_color = np.tile(B, (data_down.shape[0], 1))
    data_alpha = dist_d2s.clip(max=vis_dist) / vis_dist
    data_color[np.where(inbound)[0][grid_inbound][in_obs]] = R * data_alpha + W * (1 - data_alpha)
    data_color[np.where(inbound)[0][grid_inbound][in_obs][dist_d2s[:, 0] >= max_dist]] = G
    write_vis_pcd(f'{args.vis_out_dir}/vis_{args.scan:03}_d2s.ply', data_down, data_color)
    stl_color = np.tile(B, (stl.shape[0], 1))
    stl_alpha = dist_s2d.clip(max=vis_dist) / vis_dist
    stl_color[np.where(above)[0]] = R * stl_alpha + W * (1 - stl_alpha)
    stl_color[np.where(above)[0][dist_s2d[:, 0] >= max_dist]] = G
    write_vis_pcd(f'{args.vis_out_dir}/vis_{args.scan:03}_s2d.ply', stl, stl_color)

    pbar.update(1)
    pbar.set_description('done')
    pbar.close()
    over_all = (mean_d2s + mean_s2d) / 2
    print(mean_d2s, mean_s2d, over_all)

    import json
    with open(f'{args.vis_out_dir}/results.json', 'w') as fp:
        json.dump({
            'mean_d2s': mean_d2s,
            'mean_s2d': mean_s2d,
            'overall': over_all,
        }, fp, indent=True)
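The downsampling step in the script keeps one point per `downsample_density` radius by greedily culling neighbors (via a KD-tree). A standalone pure-Python sketch of that logic, for illustration only:

```python
import math

def radius_downsample(points, radius):
    """Greedily keep a point only if no previously kept point lies within `radius`,
    mirroring the mask-based culling loop in eval.py (which uses a KD-tree instead)."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= radius for q in kept):
            kept.append(p)
    return kept

# e.g. radius_downsample([(0, 0), (0.05, 0), (1, 0)], 0.2) keeps (0, 0) and (1, 0)
```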
