OmicsBench: The First Reasoning Benchmark for Multi-omics Sequences

OmicsBench is a benchmark for evaluating the scientific reasoning capabilities of Large Language Models (LLMs) in multi-omics sequence analysis. Unlike traditional benchmarks, which report only black-box classification and regression metrics, OmicsBench requires models to provide traceable evidence chains, bridging the gap between prediction and genuine biological understanding.

🚀 Overview

Multi-omics sequences (DNA, RNA, proteins) encode complex biological mechanisms that are essential for understanding disease, designing therapeutics, and automating scientific discovery. While LLMs have shown promise in these areas, existing evaluations often fail to distinguish shortcut learning (statistical pattern matching) from true scientific reasoning.

OmicsBench addresses this by:

Comprising 1,160 expert-validated questions.
Covering six biologically coherent tasks.
Spanning the central dogma: DNA regulation, RNA processing, and protein function.
Evaluating traceable reasoning chains using instance-specific rubrics.
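
To make the idea of instance-specific rubrics more concrete, here is a minimal sketch of how one question and its rubric might be represented and scored. The class and field names (`Question`, `RubricItem`, `keywords`, `weight`) and the keyword-matching rule are illustrative assumptions, not the OmicsBench schema or the logic in `eval.py`; a real grader would likely use an LLM judge or expert review rather than substring matching.

```python
from dataclasses import dataclass, field


@dataclass
class RubricItem:
    """One piece of evidence the reasoning chain should mention (hypothetical schema)."""
    description: str
    keywords: list[str]
    weight: float = 1.0


@dataclass
class Question:
    """A benchmark item paired with its own rubric (hypothetical schema)."""
    question_id: str
    task: str
    sequence: str
    rubric: list[RubricItem] = field(default_factory=list)


def score_reasoning(question: Question, reasoning: str) -> float:
    """Return the weighted fraction of rubric items covered by the reasoning text.

    Assumption: an item counts as covered if any of its keywords appears
    (case-insensitively) in the reasoning.
    """
    total = sum(item.weight for item in question.rubric)
    if total == 0:
        return 0.0
    text = reasoning.lower()
    earned = sum(
        item.weight
        for item in question.rubric
        if any(kw.lower() in text for kw in item.keywords)
    )
    return earned / total


example = Question(
    question_id="demo-001",
    task="promoter region analysis",
    sequence="GCGCTATAAAAGGCGC",  # toy sequence, not taken from the benchmark
    rubric=[
        RubricItem("Identifies the TATA box motif", ["tata box", "tataaa"], weight=2.0),
        RubricItem("Relates the motif to transcription initiation", ["transcription initiation"]),
    ],
)

answer = "The sequence contains a TATA box (TATAAA), a core promoter element."
print(score_reasoning(example, answer))
```

Because each question carries its own rubric, the score reflects whether the specific evidence for that instance was cited, not only whether the final label was correct.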

🧬 Tasks

OmicsBench is organized along the sequential logic of multi-omics information processing:

1. DNA Regulation

Identifying epigenetic marks
Analyzing promoter regions
Identifying transcription factor binding sites

2. RNA Processing

Characterizing RNA modifications
Analyzing non-coding RNAs

3. Protein Function

Annotating enzyme functions
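
The grouping above can be written down as a small lookup table, which also confirms that the three stages add up to six tasks. This is only a sketch: the dictionary layout, the shortened task labels, and the `stage_of` helper are assumptions made for illustration and are not part of the repository.

```python
# Short task labels derived from the list above; the layout is an assumption.
TASKS_BY_STAGE = {
    "DNA regulation": [
        "epigenetic marks",
        "promoter regions",
        "transcription factor binding sites",
    ],
    "RNA processing": [
        "RNA modifications",
        "non-coding RNA",
    ],
    "protein function": [
        "enzyme function",
    ],
}


def stage_of(task: str) -> str:
    """Return the central-dogma stage that a task label belongs to."""
    for stage, tasks in TASKS_BY_STAGE.items():
        if task in tasks:
            return stage
    raise KeyError(f"unknown task: {task}")


total_tasks = sum(len(tasks) for tasks in TASKS_BY_STAGE.values())
print(total_tasks)                 # 6
print(stage_of("non-coding RNA"))  # RNA processing
```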

🛠️ Methodology

To produce high-quality reasoning traces at scale, OmicsBench uses a multi-agent synthesis framework. Tool-augmented bio-agents query biological databases, perform sequence alignments, and retrieve literature evidence to curate reasoning chains automatically.
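
The following sketch shows one way tool outputs could be collected into a traceable chain, with each step recording which tool produced which finding. Everything here is hypothetical: the three tool functions are stubs that return placeholder strings, and the names `EvidenceStep`, `TOOLS`, and `build_reasoning_chain` are not taken from the OmicsBench code. A real agent would call an actual database, aligner, or literature search in their place.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvidenceStep:
    """One traceable step: which tool produced which finding."""
    tool: str
    finding: str


# Hypothetical stand-ins for real tools. Each takes a sequence and returns a
# textual finding; none of them contacts an external service.
def query_database(sequence: str) -> str:
    return f"database placeholder for a sequence of length {len(sequence)}"


def align_sequence(sequence: str) -> str:
    return "alignment placeholder: no real alignment was run"


def retrieve_literature(sequence: str) -> str:
    return "literature placeholder: no real search was run"


# Insertion order defines the order of steps in the chain.
TOOLS: dict[str, Callable[[str], str]] = {
    "database": query_database,
    "alignment": align_sequence,
    "literature": retrieve_literature,
}


def build_reasoning_chain(sequence: str) -> list[EvidenceStep]:
    """Run every tool in order and record its output as one evidence step."""
    return [
        EvidenceStep(tool=name, finding=tool(sequence))
        for name, tool in TOOLS.items()
    ]


for step in build_reasoning_chain("ATGCGTACGT"):
    print(f"[{step.tool}] {step.finding}")
```

Keeping the tool name next to each finding is what makes the chain auditable: a reviewer can see where every claim came from.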

All questions and solutions undergo a rigorous two-tier validation process:

Machine-based checks.
Expert reviews.
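
A two-tier process like this can be sketched as an automatic filter followed by a human decision. The specific checks below (valid sequence alphabet, non-empty reasoning chain) and the rule that only machine-approved items reach an expert are assumptions chosen for illustration; the source does not specify which checks OmicsBench runs. The `expert_approves` callable stands in for a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable

# Allowed characters per molecule type (N = unknown nucleotide).
VALID_ALPHABETS = {
    "DNA": set("ACGTN"),
    "RNA": set("ACGUN"),
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
}


@dataclass
class Candidate:
    """A question with its sequence and proposed reasoning chain (hypothetical schema)."""
    question: str
    sequence: str
    molecule: str
    reasoning_steps: list[str]


def machine_check(item: Candidate) -> list[str]:
    """Tier 1: return the problems found by automatic checks (empty if none)."""
    problems = []
    alphabet = VALID_ALPHABETS.get(item.molecule)
    if alphabet is None:
        problems.append(f"unknown molecule type: {item.molecule}")
    elif not item.sequence or not set(item.sequence.upper()) <= alphabet:
        problems.append("sequence is empty or contains invalid characters")
    if not item.reasoning_steps:
        problems.append("reasoning chain is empty")
    return problems


def validate(
    items: list[Candidate],
    expert_approves: Callable[[Candidate], bool],
) -> list[Candidate]:
    """Tier 2 (expert review) runs only on items that pass tier 1."""
    accepted = []
    for item in items:
        if machine_check(item):
            continue  # rejected automatically, never reaches an expert
        if expert_approves(item):
            accepted.append(item)
    return accepted


candidates = [
    Candidate("Is this a promoter?", "GCGCTATAAAAGGCGC", "DNA", ["step 1", "step 2"]),
    Candidate("Is this a promoter?", "ATGX", "DNA", ["step 1", "step 2"]),  # invalid character
    Candidate("Which modification?", "AUGGCU", "RNA", []),                 # no reasoning
]
accepted = validate(candidates, expert_approves=lambda item: len(item.reasoning_steps) >= 2)
print(len(accepted))
```

Running the cheap automatic checks first means expert time is spent only on candidates that are at least well-formed.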

Repository: wanggzf/OmicsBench
