CoDiff: Conditional Diffusion Model for Collaborative 3D Object Detection

This work builds on the previously published RoCo paper and attempts to use diffusion models to address noise (pose errors and time delays) in collaborative perception. It was carried out in July 2024.

Training Process Documentation
These notes record the training process so that I do not forget it.
The main modified files are:
- point_pillar_baseline_multiscale.py
- diffusion_fuse.py (uses single-vehicle features as the condition that guides the diffusion process in generating ensemble features)
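
The conditioning idea described for diffusion_fuse.py can be illustrated with a rough sketch. This is not the code in this repository: it assumes a standard DDPM-style forward process with a linear beta schedule, flattens features into plain Python lists instead of BEV feature maps, and conditions by simple concatenation. The names `make_alpha_bars`, `add_noise`, and `training_example` are hypothetical and exist only for this example.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps

def training_example(fused_feature, ego_feature, alpha_bars, rng):
    """Build one (input, timestep, target) triple for a conditional noise predictor."""
    t = rng.randrange(len(alpha_bars))
    x_t, eps = add_noise(fused_feature, t, alpha_bars, rng)
    # Assumption: the single-vehicle (ego) feature is concatenated to the noisy
    # fused feature as the condition; a network would be trained to predict eps.
    model_input = x_t + ego_feature
    return model_input, t, eps
```

In this sketch the fused (multi-vehicle) feature is the sample being noised, and the single-vehicle feature is passed through unchanged, so a denoiser always sees a clean condition next to the noisy target.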
Because the required feature map size differs across datasets, the code is quite messy, and at the moment it is probably only readable to me.
Nevertheless, the results are promising and even outperform RoCo. However, diffusion-generated features perform poorly in high-noise scenarios, likely because the model learns from suboptimal samples and therefore produces inferior outputs.

Repository contents:
- fuse_modules
- models
- README.md
- name: dairv2x_CoDiff_New.yaml
