Popular repositories

  1. Cherry_LLM Public

    [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

    Python, 416 stars, 27 forks

  2. Reflection_Tuning Public

    [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

    Python, 366 stars, 30 forks

  3. HallusionBench Public

    [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

    Python, 325 stars, 9 forks

  4. Superfiltering Public

    [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

    Python, 188 stars, 16 forks

  5. MoE-Embedding Public

    [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"

    Python, 90 stars, 11 forks

  6. MiP-Overthinking Public

    [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?

    Python, 37 stars, 1 fork

Repositories

Showing 10 of 22 repositories
  • FaSTAR Public

    [ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

    Jupyter Notebook 29 BSD-3-Clause 2 0 0 Updated Feb 6, 2026
  • TSRBench Public

    TSRBench: A Comprehensive Multi-task Multi-modal Time Series Reasoning Benchmark for Generalist Models

    Python 11 0 0 0 Updated Jan 30, 2026
  • VREX Public

    V-REX: Benchmarking Exploratory Visual Reasoning via Chain-of-Questions

    Python 7 MIT 0 0 0 Updated Dec 15, 2025
  • RoMA Public

    Code for "Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs"

    Jupyter Notebook 16 MIT 3 2 0 Updated Nov 6, 2025
  • ChartAlignBench Public

    Code for "ChartAB: A Benchmark for Chart Grounding & Dense Alignment"

    Jupyter Notebook 5 Apache-2.0 1 0 0 Updated Nov 4, 2025
  • HallusionBench Public

    [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

    Python 325 BSD-3-Clause 9 1 0 Updated Oct 14, 2025
  • RuleR Public

    [NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling

    Python 14 1 1 0 Updated Sep 27, 2025
  • Mosaic-IT Public

    [ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning

    Python 20 4 0 0 Updated Sep 27, 2025
  • ColorBench Public

    [NeurIPS'25] ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

    Python 30 Apache-2.0 0 0 0 Updated Sep 27, 2025
  • DisCL Public

    [ICCV 2025] Diffusion Curriculum (DisCL)

    Jupyter Notebook 17 0 3 0 Updated Sep 26, 2025
