wanggzf/OmicsBench

OmicsBench: The First Reasoning Benchmark for Multi-omics Sequences

OmicsBench is a benchmark for evaluating the scientific reasoning capabilities of Large Language Models (LLMs) on multi-omics sequence analysis. Unlike traditional benchmarks that report only black-box classification and regression metrics, OmicsBench requires models to provide traceable evidence chains, bridging the gap between prediction and genuine biological understanding.

🚀 Overview

Multi-omics sequences (DNA, RNA, proteins) encode complex biological mechanisms that are essential for understanding disease, designing therapeutics, and automating scientific discovery. While LLMs have shown promise in these areas, existing evaluations often fail to distinguish shortcut learning (statistical pattern matching) from true scientific reasoning.

OmicsBench addresses this gap with:

  • 1,160 expert-validated questions.
  • Six biologically coherent tasks.
  • Coverage of the central dogma: DNA regulation, RNA processing, and protein function.
  • Evaluation of traceable reasoning chains against instance-specific rubrics.
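
To make "instance-specific rubrics" concrete, here is a minimal sketch of how a reasoning chain might be scored against a rubric attached to one question. This is not the OmicsBench scoring code: `RubricItem`, `score_chain`, the keyword matching, and the example rubric are all hypothetical, and a real evaluation would likely use an expert or model judge rather than substring checks.

```python
from dataclasses import dataclass


@dataclass
class RubricItem:
    """One criterion of a (hypothetical) per-question rubric."""
    description: str        # what the reasoning chain should establish
    keywords: tuple         # surface cues standing in for a real judge
    weight: float = 1.0     # relative importance of this criterion


def score_chain(chain: str, rubric: list) -> float:
    """Return the weighted fraction of rubric items the chain satisfies."""
    total = sum(item.weight for item in rubric)
    if total == 0:
        return 0.0
    text = chain.lower()
    earned = sum(
        item.weight
        for item in rubric
        if all(keyword.lower() in text for keyword in item.keywords)
    )
    return earned / total


# Illustrative rubric for a single promoter-analysis question (assumed content).
rubric = [
    RubricItem("Identifies the TATA box", ("tata",), weight=2.0),
    RubricItem("Links the motif to promoter activity", ("promoter",)),
    RubricItem("Comments on CpG content", ("cpg",)),
]
chain = (
    "The sequence contains a TATA box upstream of the start site, "
    "consistent with a core promoter."
)
score = score_chain(chain, rubric)  # two of three items met: 3.0 / 4.0 = 0.75
```

Because the rubric travels with the question, two questions from the same task can reward different pieces of evidence, which is the property a single task-level metric cannot capture.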

🧬 Tasks

OmicsBench is organized along the sequential logic of multi-omics information processing:

1. DNA Regulation

  • Identifying epigenetic marks
  • Promoter region analysis
  • Transcription factor binding sites

2. RNA Processing

  • Characterizing RNA modifications
  • Non-coding RNA analysis

3. Protein Function

  • Annotating enzyme functions
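
The task layout above can be written down as a small lookup table, which is handy when grouping results by stage. The dictionary name `TASKS`, the helper `count_tasks`, and the exact task labels are illustrative choices, not identifiers from the repository.

```python
# Stage -> tasks, following the list above (labels are paraphrased).
TASKS = {
    "DNA regulation": [
        "epigenetic mark identification",
        "promoter region analysis",
        "transcription factor binding site analysis",
    ],
    "RNA processing": [
        "RNA modification characterization",
        "non-coding RNA analysis",
    ],
    "Protein function": [
        "enzyme function annotation",
    ],
}


def count_tasks(tasks: dict) -> int:
    """Total number of tasks across all stages."""
    return sum(len(names) for names in tasks.values())


n_tasks = count_tasks(TASKS)  # 3 + 2 + 1 = 6, matching the six tasks in the overview
```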

🛠️ Methodology

To produce high-quality reasoning traces at scale, OmicsBench uses a multi-agent synthesis framework. Tool-augmented bio-agents query biological databases, perform sequence alignments, and retrieve literature evidence to automatically curate reasoning chains.

All questions and solutions undergo a rigorous two-tier validation process:

  1. Machine-based checks.
  2. Expert reviews.
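
The two tiers above can be pictured as a cheap automated filter followed by a human queue. The sketch below is only an assumption about what "machine-based checks" could look like: the item fields (`question`, `answer`, `reasoning`, `molecule`, `sequence`), the alphabets, and the function names are hypothetical and not taken from the OmicsBench code.

```python
# Allowed residue alphabets per molecule type (an assumed, simplified set).
ALPHABETS = {
    "dna": set("ACGTN"),
    "rna": set("ACGUN"),
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
}


def machine_check(item: dict) -> list:
    """Tier 1: return a list of problems found by automated checks."""
    problems = []
    for field in ("question", "answer", "reasoning"):
        if not item.get(field, "").strip():
            problems.append(f"missing {field}")
    alphabet = ALPHABETS.get(item.get("molecule"))
    sequence = item.get("sequence", "").upper()
    if alphabet is None:
        problems.append("unknown molecule type")
    elif not sequence or set(sequence) - alphabet:
        problems.append("invalid sequence")
    return problems


def validate(items: list) -> tuple:
    """Split items into those ready for expert review (tier 2) and rejects."""
    for_review, rejected = [], []
    for item in items:
        problems = machine_check(item)
        if problems:
            rejected.append((item, problems))
        else:
            for_review.append(item)
    return for_review, rejected


good = {"question": "Is this a promoter?", "answer": "Yes",
        "reasoning": "TATA box present.", "molecule": "dna",
        "sequence": "TATAAAGGC"}
bad = dict(good, sequence="TATAXAGGC")  # 'X' is not a DNA symbol
for_review, rejected = validate([good, bad])
```

Only items that pass the automated tier reach `for_review`, so expert time is spent on questions that are at least well-formed.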

Languages

  • Python 100.0%