@Jiaxuan-Sun

1. Dataset Module Refactoring (lightrft/datasets/)

Modified:

  • __init__.py: Refactored imports with unified interfaces and improved optional dependency handling

Added:

  • config.py: DatasetConfig class

    • Unified configuration for train/eval/pretrain datasets
    • Auto-normalization of data_path and data_probs (supports string/list)
    • Factory methods: for_train(), for_eval(), for_pretrain()
    • Parameter validation
  • loader.py: DatasetLoader class

    • Unified loading interface for train/eval/pretrain datasets
    • Automatic handling of blending_datasets parameters
    • Support for PromptDatasetVL and SFTDatasetVL
    • Consistent logging
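
The normalization and factory-method behavior listed for config.py could look roughly like the sketch below. This is not the code from this PR: the class name `DatasetConfigSketch`, the `split` field, the comma-separated string format, and the uniform default for `data_probs` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


def _to_list(value: Union[str, List[str]]) -> List[str]:
    """Accept 'a,b' or ['a', 'b'] and return a clean list of strings.

    Assumption: strings are comma-separated; the real format may differ.
    """
    if isinstance(value, str):
        return [item.strip() for item in value.split(",") if item.strip()]
    return [str(item).strip() for item in value]


@dataclass
class DatasetConfigSketch:
    """Hypothetical stand-in for DatasetConfig (field names are assumed)."""

    data_path: Union[str, List[str]]
    data_probs: Union[str, List[float], None] = None
    split: str = "train"

    def __post_init__(self) -> None:
        # Normalize data_path to a list of strings.
        self.data_path = _to_list(self.data_path)
        if not self.data_path:
            raise ValueError("data_path must name at least one dataset")

        # Normalize data_probs to a list of floats (uniform if omitted).
        if self.data_probs is None:
            count = len(self.data_path)
            self.data_probs = [1.0 / count] * count
        elif isinstance(self.data_probs, str):
            self.data_probs = [float(p) for p in _to_list(self.data_probs)]
        else:
            self.data_probs = [float(p) for p in self.data_probs]

        # Parameter validation.
        if len(self.data_probs) != len(self.data_path):
            raise ValueError("data_probs and data_path must have the same length")
        if any(p < 0 for p in self.data_probs):
            raise ValueError("data_probs must be non-negative")

    @classmethod
    def for_train(cls, data_path, data_probs: Optional[object] = None):
        return cls(data_path, data_probs, split="train")

    @classmethod
    def for_eval(cls, data_path, data_probs: Optional[object] = None):
        return cls(data_path, data_probs, split="eval")

    @classmethod
    def for_pretrain(cls, data_path, data_probs: Optional[object] = None):
        return cls(data_path, data_probs, split="pretrain")
```

With this sketch, `DatasetConfigSketch.for_train("a.jsonl, b.jsonl", "0.7,0.3")` and `DatasetConfigSketch.for_train(["a.jsonl", "b.jsonl"], [0.7, 0.3])` produce the same normalized configuration, which is the point of accepting both strings and lists.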

2. Reward Module (lightrft/reward/)

Added:

  • __init__.py: Module entry point with unified exports

  • base.py: BaseReward abstract base class

    • Unified compute() method signature
    • Consistent return format: (rewards, metrics)
  • rule.py: RuleReward class

    • Rule-based reward implementation
    • Format checking (e.g., <think> tags, \boxed{} notation)
    • Accuracy verification using mathruler grader
    • Registry pattern for custom rule types
    • Built-in rules: default, geo3k_*, gsm8k_*
  • model.py: Reward model implementations

    • SingleRewardModel: Single reward model wrapper with auto load/offload
    • MultiRewardModel: Multiple reward model ensemble with recipe-based aggregation
    • Supports standard PyTorch models and custom engines (e.g., SGLang)
  • manager.py: RewardManager class

    • Unified manager for all reward types
    • Auto-selection of reward implementation (rule/single/multi)
    • from_config() factory method
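
As a rough illustration of the `compute()` contract, the rule registry, and the format checks described above, here is a self-contained sketch. It is not the code from this PR: the names `BaseRewardSketch`, `RuleRewardSketch`, `register_rule`, and `RULE_REGISTRY` are hypothetical, rewards are plain Python lists rather than tensors, the 0.5/0.5 weighting is arbitrary, and exact string matching stands in for the mathruler grader.

```python
import re
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

RuleFn = Callable[[str, str], float]
RULE_REGISTRY: Dict[str, RuleFn] = {}  # hypothetical registry: rule name -> function


def register_rule(name: str) -> Callable[[RuleFn], RuleFn]:
    """Decorator that registers a rule function under the given name."""
    def decorator(fn: RuleFn) -> RuleFn:
        RULE_REGISTRY[name] = fn
        return fn
    return decorator


THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
BOXED_RE = re.compile(r"\\boxed\{([^{}]*)\}")


@register_rule("default")
def default_rule(response: str, reference: str) -> float:
    """0.5 for format (think block plus boxed answer), 0.5 for a correct answer.

    Assumption: exact string match replaces the real grader here.
    """
    boxed = BOXED_RE.search(response)
    has_format = THINK_RE.search(response) is not None and boxed is not None
    correct = boxed is not None and boxed.group(1).strip() == reference.strip()
    return 0.5 * float(has_format) + 0.5 * float(correct)


class BaseRewardSketch(ABC):
    """Every reward type exposes the same compute() signature."""

    @abstractmethod
    def compute(
        self, responses: List[str], references: List[str]
    ) -> Tuple[List[float], Dict[str, float]]:
        """Return (rewards, metrics)."""


class RuleRewardSketch(BaseRewardSketch):
    def __init__(self, rule: str = "default") -> None:
        if rule not in RULE_REGISTRY:
            raise KeyError(f"unknown rule type: {rule!r}")
        self._rule = RULE_REGISTRY[rule]

    def compute(self, responses, references):
        rewards = [self._rule(r, ref) for r, ref in zip(responses, references)]
        mean = sum(rewards) / len(rewards) if rewards else 0.0
        return rewards, {"reward_mean": mean}
```

A custom rule type would be added by decorating another function with `@register_rule("my_rule")` and constructing `RuleRewardSketch("my_rule")`. Callers only depend on the `(rewards, metrics)` return shape.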

@puyuan1996 added the "refactor" label (Cleanup, formatting, or restructuring of existing code.) on Jan 4, 2026
SingleRewardModel: Wrapper for single reward model
MultiRewardModel: Ensemble of multiple reward models with aggregation
Author: lightrft Team
Collaborator

Please remove the Author line.

Classes:
DatasetConfig: Configuration class for dataset loading
DatasetLoader: Unified loader for all dataset types
"""
Collaborator

@puyuan1996 puyuan1996 Jan 4, 2026

We need to update the examples in gsm8k and geo3k (examples/gsm8k_geo3k/train_colocate.py) to use the refactored dataset and reward APIs introduced here.

@@ -1,21 +1,54 @@
"""
Dataset Module for LightRLHF
Collaborator

This should read LightRFT, not LightRLHF.
