refactor(sunjx): refactor dataset and reward module #13

Jiaxuan-Sun · 2025-12-31T07:11:42Z

1. Dataset Module Refactoring (`lightrft/datasets/`)

Modified:

- `__init__.py`: Refactored imports with unified interfaces and improved optional dependency handling

Added:

- `config.py`: `DatasetConfig` class
  - Unified configuration for train/eval/pretrain datasets
  - Auto-normalization of `data_path` and `data_probs` (supports string/list)
  - Factory methods: `for_train()`, `for_eval()`, `for_pretrain()`
  - Parameter validation
- `loader.py`: `DatasetLoader` class
  - Unified loading interface for train/eval/pretrain datasets
  - Automatic handling of `blending_datasets` parameters
  - Support for `PromptDatasetVL` and `SFTDatasetVL`
  - Consistent logging
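
For reviewers, the normalization and factory-method behavior listed above can be pictured roughly as follows. This is a minimal sketch, not the code in the diff: the class name `DatasetConfigSketch`, the `split` field, the comma-separated string format, and the uniform default for `data_probs` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


def _to_list(value, cast):
    """Accept a comma-separated string or a list; return a list of `cast` values."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [item.strip() for item in value.split(",") if item.strip()]
    return [cast(item) for item in value]


@dataclass
class DatasetConfigSketch:
    """Hypothetical stand-in for DatasetConfig; field names are assumptions."""
    data_path: Union[str, List[str]]
    data_probs: Union[str, List[float], None] = None
    split: str = "train"

    def __post_init__(self):
        # Normalize both fields to lists regardless of how they were passed.
        self.data_path = _to_list(self.data_path, str)
        self.data_probs = _to_list(self.data_probs, float)
        if not self.data_path:
            raise ValueError("data_path must name at least one dataset")
        # Assumed default: sample uniformly when no probabilities are given.
        if not self.data_probs:
            self.data_probs = [1.0 / len(self.data_path)] * len(self.data_path)
        if len(self.data_probs) != len(self.data_path):
            raise ValueError("data_probs must have one entry per data_path")

    @classmethod
    def for_train(cls, data_path, data_probs=None):
        return cls(data_path, data_probs, split="train")

    @classmethod
    def for_eval(cls, data_path, data_probs=None):
        return cls(data_path, data_probs, split="eval")
```

With this sketch, `DatasetConfigSketch.for_train("a.jsonl, b.jsonl", "0.7,0.3")` and `DatasetConfigSketch.for_train(["a.jsonl", "b.jsonl"], [0.7, 0.3])` produce the same normalized configuration.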

2. Reward Module (`lightrft/reward/`)

Added:

- `__init__.py`: Module entry point with unified exports
- `base.py`: `BaseReward` abstract base class
  - Unified `compute()` method signature
  - Consistent return format: `(rewards, metrics)`
- `rule.py`: `RuleReward` class
  - Rule-based reward implementation
  - Format checking (e.g., `<think>` tags, `\boxed{}` notation)
  - Accuracy verification using mathruler grader
  - Registry pattern for custom rule types
  - Built-in rules: `default`, `geo3k_*`, `gsm8k_*`
- `model.py`: Reward model implementations
  - `SingleRewardModel`: Single reward model wrapper with auto load/offload
  - `MultiRewardModel`: Multiple reward model ensemble with recipe-based aggregation
  - Supports standard PyTorch models and custom engines (e.g., SGLang)
- `manager.py`: `RewardManager` class
  - Unified manager for all reward types
  - Auto-selection of reward implementation (rule/single/multi)
  - `from_config()` factory method
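
To make the `(rewards, metrics)` contract and the rule registry concrete, here is a minimal sketch of how the pieces could fit together. It is not the code in the diff: the names `RULE_REGISTRY`, `register_rule`, `boxed_exact`, `BaseRewardSketch`, and `RuleRewardSketch` are hypothetical, the `compute()` arguments are assumed, plain Python lists stand in for whatever tensor type the real module returns, and exact string matching stands in for the mathruler grader.

```python
import re
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

# Hypothetical registry mapping a rule name to a scoring function.
RULE_REGISTRY: Dict[str, Callable[[str, str], float]] = {}


def register_rule(name: str):
    """Decorator that registers a scoring function under `name`."""
    def decorator(fn):
        RULE_REGISTRY[name] = fn
        return fn
    return decorator


@register_rule("boxed_exact")
def boxed_exact(response: str, reference: str) -> float:
    """Return 1.0 if the last boxed answer equals the reference, else 0.0."""
    answers = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return 1.0 if answers and answers[-1].strip() == reference.strip() else 0.0


class BaseRewardSketch(ABC):
    """Assumed shape of the base class: one compute() signature for all rewards."""

    @abstractmethod
    def compute(
        self, responses: List[str], references: List[str]
    ) -> Tuple[List[float], Dict[str, float]]:
        ...


class RuleRewardSketch(BaseRewardSketch):
    def __init__(self, rule: str):
        if rule not in RULE_REGISTRY:
            raise KeyError(f"unknown rule type: {rule}")
        self.rule_fn = RULE_REGISTRY[rule]

    def compute(self, responses, references):
        # One reward per response, plus aggregate metrics for logging.
        rewards = [self.rule_fn(r, ref) for r, ref in zip(responses, references)]
        metrics = {"mean_reward": sum(rewards) / len(rewards) if rewards else 0.0}
        return rewards, metrics
```

Under these assumptions, a custom rule is added by decorating a new function with `@register_rule("my_rule")`, and callers only ever depend on `compute()` returning the `(rewards, metrics)` pair.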

puyuan1996 · 2026-01-04T09:45:42Z

lightrft/reward/model.py

+    SingleRewardModel: Wrapper for single reward model
+    MultiRewardModel: Ensemble of multiple reward models with aggregation
+
+Author: lightrft Team


Please remove the Author line.

puyuan1996 · 2026-01-04T09:47:29Z

lightrft/datasets/__init__.py

+Classes:
+    DatasetConfig: Configuration class for dataset loading
+    DatasetLoader: Unified loader for all dataset types
+"""


We need to update the gsm8k and geo3k examples (`examples/gsm8k_geo3k/train_colocate.py`) to use the refactored dataset and reward APIs introduced here.

puyuan1996 · 2026-01-04T09:47:39Z

lightrft/datasets/__init__.py

@@ -1,21 +1,54 @@
+"""
+Dataset Module for LightRLHF


Jiaxuan-Sun added 2 commits December 31, 2025 14:58

refactor(sunjx): refactor dataset and reward module

773a1ee

Remove unnecessary code

513789d

puyuan1996 added the `refactor` label (cleanup, formatting, or restructuring of existing code) Jan 4, 2026

puyuan1996 requested changes Jan 4, 2026


puyuan1996 mentioned this pull request Jan 5, 2026

Roadmap for LightRFT v0.1.1 #19

Open

refactor(sunjx): update geo3k to use refactored dataset and reward APIs

875600d
