Hi @FlashVSR-Authors,
First of all, thank you for open-sourcing such impressive work! The quality and speed of FlashVSR are truly remarkable.
I have been working on an enhanced implementation based on this repository to make it more deployment-friendly and feature-rich for production use. I've created FlashVSR-Pro, and I'd like to share it with the community for anyone who might need these features.
Repository: https://github.com/LujiaJin/FlashVSR-Pro
Key Features & Enhancements:
- 🐳 One-Click Docker Setup: A fully configured Dockerfile that automates the environment setup, specifically handling the complex compilation of the Block-Sparse-Attention backend.
- 💾 Low VRAM Support: Implemented Tiled Inference for both the DiT and the VAE. This allows high-resolution inference on consumer GPUs (e.g., RTX 4090/3090) without OOM (see the tiling sketch after this list).
- 🧩 Unified Inference Script: Replaced the separate scripts (`full`, `tiny`, `tiny-long`) with a single, robust `infer.py` that handles all modes.
- 🎵 Audio Preservation: Automatically transfers the audio track from the input video to the output.
- ⚡ Hardware Acceleration: Added NVENC hardware video encoding support and a Zero-Copy pipeline, significantly reducing CPU load and I/O bottlenecks during saving (see the muxing sketch below).
- 📏 Precision Alignment: Fixed frame count and resolution mismatches via smart padding to ensure the output duration exactly matches the input duration (see the padding sketch below).
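
For anyone curious how the tiled path works in principle, here is a minimal sketch of overlap-and-blend spatial tiling. The `tiled_forward` helper, the tile/overlap sizes, and the assumption that the module preserves resolution are illustrative only, not the actual FlashVSR-Pro API; a real tiled VAE/DiT path also has to handle scale factors and temporal tiling.

```python
import torch

def tiled_forward(model, frames, tile=256, overlap=32):
    """Run `model` on overlapping spatial tiles and blend the results.

    frames: (B, C, H, W) tensor. Assumes `model` returns a tensor with the
    same spatial size as its input (e.g., a resolution-preserving stage).
    """
    _, _, h, w = frames.shape
    out = torch.zeros_like(frames)
    weight = torch.zeros(1, 1, h, w, device=frames.device, dtype=frames.dtype)
    stride = tile - overlap
    for top in range(0, h, stride):
        for left in range(0, w, stride):
            bottom, right = min(top + tile, h), min(left + tile, w)
            with torch.no_grad():
                patch_out = model(frames[:, :, top:bottom, left:right])
            out[:, :, top:bottom, left:right] += patch_out
            weight[:, :, top:bottom, left:right] += 1.0
    # simple averaging over overlaps; feathered weights reduce seams further
    return out / weight
```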
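
Audio preservation and NVENC encoding are typically combined in a single ffmpeg muxing step. The sketch below is a hedged illustration (the `mux_with_source_audio` helper, paths, and codec settings are assumptions, not the exact pipeline in the repository): it re-encodes the upscaled video with `h264_nvenc` and copies the original audio stream unchanged.

```python
import subprocess

def mux_with_source_audio(upscaled_video, source_video, output_path):
    """Re-mux `upscaled_video` with the audio stream of `source_video`,
    encoding the video with NVIDIA's h264_nvenc hardware encoder."""
    cmd = [
        "ffmpeg", "-y",
        "-i", upscaled_video,   # video to keep
        "-i", source_video,     # original file, used only for its audio
        "-map", "0:v:0",        # video stream from the first input
        "-map", "1:a:0?",       # audio stream from the second, if present
        "-c:v", "h264_nvenc",   # NVENC hardware video encoding
        "-c:a", "copy",         # keep the original audio untouched
        "-shortest",            # stop at the shorter of the two streams
        output_path,
    ]
    subprocess.run(cmd, check=True)
```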
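
The precision-alignment idea can be summarized as: pad the frame sequence to whatever length the model requires, remember the original length, and trim after inference so the output duration matches the input exactly. The block size and padding strategy below are illustrative assumptions, not the exact logic in the repository.

```python
import torch

def pad_frames(frames, block=4):
    """Pad (T, C, H, W) frames along time to a multiple of `block`;
    return the padded tensor and the original length for later trimming."""
    t = frames.shape[0]
    pad = (-t) % block
    if pad:
        # repeat the last frame so the original duration can be restored
        frames = torch.cat([frames, frames[-1:].repeat(pad, 1, 1, 1)], dim=0)
    return frames, t

# after inference: output = output[:original_t] to match the input duration
```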
I hope this "Pro" version can help users who are struggling with environment setup (especially the CUDA kernels) or VRAM limitations.
Feel free to check it out or reference it if you find it useful!
Acknowledgement:
Special thanks to @lihaoyun6 and their work on FlashVSR_plus, which provided valuable inspiration for the Audio Preservation and Tiled DiT Inference features implemented in this Pro version.
Best regards,
LujiaJin