    Repositories list

    • AnyTouch2

      Public
      [ICLR 2026] AnyTouch 2: General Optical Tactile Representation Learning For Dynamic Tactile Perception
      Python
      21600 Updated Feb 13, 2026
    • GAP

      Public
      [ICLR 2026] When would Vision-Proprioception Policies Fail in Robotic Manipulation?
      Python
      0100 Updated Feb 11, 2026
    • A curated list of balanced multimodal learning methods.
      515920 Updated Feb 3, 2026
    • HTML
      1100 Updated Jan 29, 2026
    • MIBench

      Public
      0000 Updated Jan 25, 2026
    • LFAV

      Public
      Towards Long Form Audio-visual Video Understanding
      Python
      01510 Updated Jan 16, 2026
    • AnyTouch

      Public
      The repo for "AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors", ICLR 2025
      Python
      88320 Updated Jan 13, 2026
    • MokA

      Public
      MokA: Multimodal Low-Rank Adaptation for MLLMs
      Python
      480130 Updated Dec 30, 2025
    • Crab

      Public
      [CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
      Python
      28040 Updated Dec 24, 2025
    • This is the repo for "Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition", CVPR 2025.
      Python
      52050 Updated Dec 22, 2025
    • JavaScript
      0000 Updated Oct 26, 2025
    • Python
      31210 Updated Oct 26, 2025
    • Ref-AVS

      Public
      The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
      Python
      25000 Updated Oct 12, 2025
    • JavaScript
      0000 Updated Sep 25, 2025
    • The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)
      Python
      23309360 Updated Sep 22, 2025
    • MGIPF

      Public
      The repo for "MGIPF: Multi-Granularity Interest Prediction Framework for Personalized Recommendation", SIGIR 2025
      Python
      1200 Updated Jul 26, 2025
    • WCAE

      Public
      Python
      0000 Updated Jul 1, 2025
    • MS-Bot

      Public
      The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)
      Python
      31910 Updated Jun 25, 2025
    • Official repo for ICML 2025 paper "RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer"
      Python
      21440 Updated Jun 21, 2025
    • A Python implementation of Certifiable Robust Multi-modal Training
      Python
      01900 Updated Jun 21, 2025
    • [CVPR2025] Code Release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
      Python
      02020 Updated Jun 17, 2025
    • The official repo for "Efficient Quantification of Multimodal Interaction at Sample Level", ICML 2025
      Python
      1710 Updated Jun 5, 2025
    • Python
      01410 Updated Apr 30, 2025
    • The official repo for "Can Textual Semantics Mitigate Sounding Object Segmentation Preference?", ECCV 2024
      Python
      0610 Updated Mar 1, 2025
    • Python
      03940 Updated Feb 23, 2025
    • A curated list of audio-visual learning methods and datasets.
      2028510 Updated Dec 3, 2024
    • The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024
      Python
      45970 Updated Nov 5, 2024
    • TSPM

      Public
      Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
      Python
      11650 Updated Oct 25, 2024
    • The repo for "KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance", CoRL 2024
      Python
      1900 Updated Oct 17, 2024
    • The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
      Python
      21810 Updated Oct 11, 2024