Description
Hello, thank you for this great repository.
I’m tracking cameras and reconstructing a scene with this repository. I have ground-truth poses and intrinsics for each image. When I run inference with images only, the point cloud looks reasonable but is incorrectly scaled. Adding intrinsics doesn’t change the result, while adding poses improves the scale; however, the point cloud then becomes incoherent and unrecognizable. To find where this problem comes from, I compared the input and predicted poses and intrinsics.
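In case it helps, this is roughly how I quantify the input/output pose difference — a minimal sketch assuming both sets are 4x4 camera-to-world matrices expressed in the same world frame (which may itself be wrong; that is part of my question). The file names are placeholders:

```python
import numpy as np

def pose_errors(pred_poses, gt_poses):
    """Per-image rotation error (degrees) and translation error between
    two sets of 4x4 camera-to-world matrices."""
    rot_deg, trans = [], []
    for P, G in zip(pred_poses, gt_poses):
        R_rel = P[:3, :3].T @ G[:3, :3]
        # Angle of the relative rotation, recovered from its trace.
        cos_a = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
        rot_deg.append(np.degrees(np.arccos(cos_a)))
        trans.append(np.linalg.norm(P[:3, 3] - G[:3, 3]))
    return np.array(rot_deg), np.array(trans)

# Placeholder file names; mine come from the model output and my tracker.
pred = np.load("predicted_poses.npy")  # (N, 4, 4)
gt = np.load("measured_poses.npy")     # (N, 4, 4)
r, t = pose_errors(pred, gt)
print(f"rotation error: {r.mean():.2f} deg, translation error: {t.mean():.3f}")
```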
I plotted the poses in 3D (the sketch after the three cases below shows how I compare the two trajectories):
With images only: the point cloud is scaled incorrectly (4x too large). Predicted poses are in green, measured ones in blue.
With images plus intrinsics: the result is almost identical to the image-only case.
With images, intrinsics, and poses: the scale improves but is still incorrect, and the point cloud is noisy and doesn’t match the scene. It looks as if the partial point cloud from each input image duplicates the scanned object at different but nearby positions. My issue seems close to #33.
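For reference, this is the sketch I use for the plots and for estimating the scale gap — again assuming camera-to-world poses, with placeholder file names. The scale is the least-squares factor mapping predicted camera centers onto measured ones (the scale term of a Umeyama-style similarity alignment):

```python
import numpy as np
import matplotlib.pyplot as plt

def relative_scale(pred_centers, gt_centers):
    """Least-squares scale mapping predicted camera centers onto
    measured ones, after removing each set's centroid."""
    p = pred_centers - pred_centers.mean(axis=0)
    g = gt_centers - gt_centers.mean(axis=0)
    return np.sqrt((g ** 2).sum() / (p ** 2).sum())

# Camera centers are the translation column of camera-to-world poses.
pred = np.load("predicted_poses.npy")[:, :3, 3]  # (N, 3)
gt = np.load("measured_poses.npy")[:, :3, 3]     # (N, 3)

print("scale measured/predicted:", relative_scale(pred, gt))  # ~0.25 in my case

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(*pred.T, c="green", label="predicted")
ax.scatter(*gt.T, c="blue", label="measured")
ax.legend()
plt.show()
```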
About intrinsics:
The predicted intrinsics (fx, fy, cx, cy) are significantly different from the provided values, and the difference is not fully explained by the resize of the images from (2048x1536) to (518x392), as discussed in issue #38.
One of the differences is that my input cx and cy are not centered, while the predicted principal point is. I’ve seen a similar problem in issue #67, but it offered no solution in my case.
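To rule out a bookkeeping mistake on my side, here is how I rescale my intrinsics to the network resolution when comparing them — a minimal sketch assuming a plain per-axis resize (I am not sure whether the pipeline also crops or pads, or which pixel-center convention it uses). The input values are placeholders:

```python
import numpy as np

def rescale_intrinsics(K, src_wh, dst_wh, pixel_center=False):
    """Rescale a 3x3 intrinsics matrix for a plain per-axis image resize."""
    sx = dst_wh[0] / src_wh[0]   # 518 / 2048
    sy = dst_wh[1] / src_wh[1]   # 392 / 1536 (note: not equal to sx)
    K = K.astype(float).copy()
    K[0, 0] *= sx
    K[1, 1] *= sy
    if pixel_center:
        # Alternative convention where pixel centers sit at half-integers.
        K[0, 2] = (K[0, 2] + 0.5) * sx - 0.5
        K[1, 2] = (K[1, 2] + 0.5) * sy - 0.5
    else:
        K[0, 2] *= sx
        K[1, 2] *= sy
    return K

# Placeholder values; my real cx, cy are noticeably off-center.
K_in = np.array([[1700.0,    0.0, 1080.0],
                 [   0.0, 1700.0,  730.0],
                 [   0.0,    0.0,    1.0]])
print(rescale_intrinsics(K_in, (2048, 1536), (518, 392)))
```

Note that the two aspect ratios differ slightly (2048/1536 vs 518/392), so sx and sy are not equal; I don’t know if that accounts for part of the discrepancy.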
Does anybody know where my errors come from? Is it normal for the predicted poses and intrinsics to differ that much from the inputs? Has anybody managed to get a satisfying result with poses and intrinsics as inputs?
Thank you for your help!