- This work builds upon the previously published RoCo paper, attempting to use diffusion models to address noise issues (pose errors and time delays) in collaborative perception. It was conducted in July 2024.
-
Documenting the training process here so I don't forget it.
The main modified files are:
- point_pillar_baseline_multiscale.py
- diffusion_fuse.py (uses single-vehicle features as the condition that guides the diffusion process to generate the ensemble features)
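
To make the conditioning idea concrete, here is a minimal sketch of a DDPM-style reverse (sampling) loop in which a single-vehicle feature steers generation. It is not the code in diffusion_fuse.py. Everything in it is an assumption made for illustration: the names `make_schedule`, `toy_noise_predictor`, and `sample_conditioned`, the linear beta schedule, the step count, and the flat-list "feature map". In particular, `toy_noise_predictor` is a hand-written stand-in for a trained network; it simply treats the condition as its estimate of the clean feature, so the loop runs without any training.

```python
import math
import random


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (illustrative values, not taken from the repo)."""
    betas = [
        beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        for i in range(num_steps)
    ]
    alphas = [1.0 - b for b in betas]
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a  # cumulative product of alphas
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, t, cond, alpha_bars):
    """Hypothetical stand-in for a learned eps_theta(x_t, t, cond).

    Assumes the clean feature equals `cond` and solves
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for eps.
    A real model would be a trained network instead.
    """
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [(x - signal * c) / noise for x, c in zip(x_t, cond)]


def sample_conditioned(cond, num_steps=50, seed=0):
    """Start from Gaussian noise and denoise step by step, guided by `cond`."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in cond]
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, t, cond, alpha_bars)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        # Standard DDPM posterior mean computed from the predicted noise.
        mean = [(xi - coef * e) / math.sqrt(alphas[t]) for xi, e in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x


if __name__ == "__main__":
    single_vehicle_feature = [0.5, -1.0, 2.0, 0.0]  # made-up values
    print(sample_conditioned(single_vehicle_feature))
```

Because the toy predictor trusts the condition completely, the sample ends up at the condition itself. With a trained network, the condition would only bias the denoising toward features consistent with the single-vehicle view.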
-
Because each dataset requires a different feature map size, the code is quite messy, and realistically only I can follow it.
-
Nevertheless, the results are promising and even outperform RoCo. However, the diffusion-generated features perform poorly in high-noise scenarios, likely because the model learns from suboptimal samples and therefore produces inferior outputs.

