Linfei Li · Lin Zhang* · Zhong Wang · Ying Shen
The simplest way to install all dependencies is with Anaconda and pip, following the steps below:
conda create -n gs3lam python==3.10
conda activate gs3lam
conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
# install Gaussian Rasterization
pip install submodules/gaussian-semantic-rasterization
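Before moving on, a quick sanity check (a minimal sketch, not an official script) can confirm that PyTorch sees the CUDA 11.7 toolkit:

# Minimal environment sanity check (illustrative, not part of GS3LAM).
import torch
print(torch.__version__)          # expected: 1.13.1+cu117
print(torch.version.cuda)         # expected: 11.7
print(torch.cuda.is_available())  # expected: True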
DATAROOT is ./data by default. Please change the basedir path in the scene-specific config files if your datasets are stored elsewhere on your machine.

The original Replica dataset does not contain semantic labels; we obtained them from vMAP. You can download our generated semantic Replica dataset from here, then place the data in the ./data/Replica folder.
Note: if you directly use the Replica dataset provided by vMAP, please modify the Replica dataloader and the png_depth_scale parameter in the config files.
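For illustration, such a config edit might look like the sketch below; the dict-style layout and field names other than png_depth_scale are assumptions, so check the actual files under configs/ for the real structure:

# configs/Replica/office0.py (illustrative sketch; layout is an assumption)
config = dict(
    basedir="./data/Replica",  # change this if DATAROOT is not ./data
    png_depth_scale=6553.5,    # common value for Replica depth PNGs; vMAP data may differ
)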
The TUM-RGBD dataset does not provide ground-truth semantic labels, so it is not one of our primary evaluation datasets. However, to evaluate the effectiveness of GS3LAM on it, we use pseudo-semantic labels generated by DEVA, which you can download from here. Unfortunately, existing semantic segmentation models struggle to maintain inter-frame semantic consistency on long sequences, so we only tested on the freiburg1_desk sequence.
Please follow the data downloading procedure on the ScanNet website, and extract color/depth frames from the .sens file using this code.
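For reference, the extraction can be scripted; the sketch below assumes the interface of the SensReader's reader.py from the linked code, so verify the flags against the version you download:

# Illustrative .sens extraction via ScanNet's SensReader (flags are assumptions).
import subprocess
subprocess.run([
    "python", "reader.py",
    "--filename", "scene0000_00.sens",
    "--output_path", "data/scannet/scene0000_00/frames",
    "--export_color_images", "--export_depth_images",
    "--export_poses", "--export_intrinsics",
], check=True)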
Directory structure of ScanNet:
DATAROOT
└── scannet
    └── scene0000_00
        └── frames
            ├── color
            │   ├── 0.jpg
            │   ├── 1.jpg
            │   └── ...
            ├── depth
            │   ├── 0.png
            │   ├── 1.png
            │   └── ...
            ├── label-filt
            │   ├── 0.png
            │   ├── 1.png
            │   └── ...
            ├── intrinsic
            └── pose
                ├── 0.txt
                ├── 1.txt
                └── ...
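A small sketch to sanity-check an extracted scene against the layout above (paths follow the tree shown; this is not an official script):

# Check that an extracted ScanNet scene matches the expected layout (illustrative).
import os
frames = "data/scannet/scene0000_00/frames"
for sub in ("color", "depth", "label-filt", "intrinsic", "pose"):
    path = os.path.join(frames, sub)
    count = len(os.listdir(path)) if os.path.isdir(path) else None
    print(f"{sub}: {'missing' if count is None else f'{count} files'}")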
We use the following sequences:
scene0000_00
scene0059_00
scene0106_00
scene0169_00
scene0181_00
scene0207_00
To run GS3LAM on the freiburg1_desk scene, run the following command:
python run.py configs/Tum/tum_fr1.py

To run GS3LAM on the office0 scene, run the following command:
python run.py configs/Replica/office0.py

To run GS3LAM on all Replica scenes, run the following command:
bash scripts/eval_full_replica.sh

To run GS3LAM on the scene0059_00 scene, run the following command:
python run.py configs/Scannet/scene0059_00.py

To run GS3LAM on all ScanNet scenes, run the following command:
bash scripts/eval_full_scannet.bash
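The convenience scripts simply iterate over the per-scene configs; if you prefer to drive the runs from Python, a rough equivalent looks like this (it assumes the Replica config files are named after the scenes, which is true at least for office0.py):

# Run GS3LAM over all Replica scenes (illustrative equivalent of the eval script).
import subprocess
scenes = ["room0", "room1", "room2", "office0",
          "office1", "office2", "office3", "office4"]
for scene in scenes:
    subprocess.run(["python", "run.py", f"configs/Replica/{scene}.py"], check=True)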
- Define the SEED and SCENE_NUM environment variables (a sketch of how they might be consumed appears after this list).
# SEED is the random seed used during training; it should be consistent with the training configuration.
export SEED=1
# SCENE_NUM is the index of the data sequence in the following lists.
# Replica: ["room0", "room1", "room2","office0", "office1", "office2", "office3", "office4"]
# Scannet: ["scene0059_00", "scene0106_00", "scene0169_00", "scene0181_00", "scene0207_00", "scene0000_00"]
export SCENE_NUM=0

- Online reconstruction.
# optional mode: [color, depth, centers, sem, sem_color, sem_feature]
python visualizer/online_recon.py --mode color --logdir path/to/the/log

- Offline reconstruction.
# optional mode: [color, depth, centers, sem, sem_color, sem_feature]
python visualizer/offline_recon.py --mode sem_color --logdir path/to/the/log

- Export Mesh.
# optional mode: [color, sem]
python visualizer/export_mesh.py --mode color --logdir path/to/the/log
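As a rough sketch of how SEED and SCENE_NUM could map to a concrete run (the actual resolution logic lives in the visualizer scripts; the list ordering follows the comments above):

# Illustrative resolution of SEED/SCENE_NUM into a scene name (not the actual visualizer code).
import os
replica = ["room0", "room1", "room2", "office0",
           "office1", "office2", "office3", "office4"]
seed = int(os.environ.get("SEED", "1"))
scene = replica[int(os.environ.get("SCENE_NUM", "0"))]
print(f"seed={seed}, scene={scene}")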
To draw Fig. 2 of the paper, which illustrates the relationship between optimization iterations, rendering quality, and camera trajectories, run:

python visualizer/plot_opt_bias.py --logdir path/to/the/log

We thank the authors of the following repositories for their open-source code:
As our work heavily relies on SplaTAM, we kindly ask that you adhere to the guidelines set forth in SplaTAM's LICENSE.
If you find our paper and code useful for your research, please use the following BibTeX entry.
@inproceedings{li2024gs3lam,
author = {Li, Linfei and Zhang, Lin and Wang, Zhong and Shen, Ying},
title = {GS3LAM: Gaussian Semantic Splatting SLAM},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
pages = {3019--3027},
numpages = {9},
location = {Melbourne VIC, Australia},
series = {MM '24}
}