Full-stack web application (React + Flask) for multimodal video captioning. Deploys the MixCap model (BLIP-2 + Wav2Vec2) to generate video descriptions for end users.
Updated Jan 21, 2026 - JavaScript