@Edmund-a7

Brief:

  • Add H and W parameters in vggt.py, eval_utils.py, aggregator.py, and attention.py, enabling the model to handle different image sizes automatically.
  • Add inference.py

Hi! I've made a couple of improvements to the project.

Problem:

  1. The repository was missing an inference.py script to run the model on custom datasets.
  2. In the attention mechanism, the token height and width were hardcoded. This isn't ideal, as they should adapt to the processed image dimensions.

Solution:

  1. I've added a new inference.py script.
  2. I've modified the attention code to compute the token grid height and width from the image size instead of hardcoding them (specifically, W / 14 and H / 14 in attention.py:161).
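
For reviewers, the idea in item 2 can be sketched roughly as follows. This is only an illustration, not the code in attention.py: the helper name `token_grid` is hypothetical, the patch size of 14 is taken from the W / 14 and H / 14 above, and rejecting sizes that are not multiples of 14 is an assumption made here for clarity.

```python
PATCH_SIZE = 14  # from the "W / 14 and H / 14" mentioned in this description


def token_grid(height: int, width: int, patch_size: int = PATCH_SIZE) -> tuple[int, int]:
    """Return (rows, cols) of the patch-token grid for an image of the given size.

    Hypothetical helper for illustration; the real change lives in attention.py.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    # Assumption: inputs are already resized/cropped to multiples of the patch size.
    if height % patch_size or width % patch_size:
        raise ValueError(
            f"image size {height}x{width} is not a multiple of patch size {patch_size}"
        )
    return height // patch_size, width // patch_size


if __name__ == "__main__":
    rows, cols = token_grid(294, 518)
    print(f"token grid: {rows} x {cols} ({rows * cols} patch tokens per frame)")
```

With this shape, the attention code can take H and W from the processed images rather than relying on a fixed grid.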

I have tested the code successfully. This makes the model more robust and easier to use for inference. Please let me know if you have any feedback!

mystorm16 and others added 4 commits September 5, 2025 14:25
- Add H and W in vggt.py, eval_utils.py, aggregator.py, and attention.py, enabling the model to handle different image sizes automatically.
- Add inference.py
@mystorm16
Owner

Hi @Edmund-a7, thank you for your work on FastVGGT. We’ve updated the evaluation for the custom dataset, including the related point cloud and pose visualizations. Following your suggestion, we also modified the hardcoded height and width. Please give it a try!
