diff --git a/README.md b/README.md
index b9224b0..0d4cb54 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
- Logo
+ Logo
 # SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts
@@ -15,9 +15,9 @@
-**_¹ [Bingshuai Liu](https://bingshuailiu.github.io), [Ante Wang](), ¹ [Zijun Min](),_**
+**_¹ [Bingshuai Liu](https://bingshuailiu.github.io), ¹ Ante Wang, ¹ Zijun Min,_**
-**_² [Liang Yao](), ² [Haibo Zhang](), ² [Anxiang Zeng](), ¹ *[Jinsong Su]()_**
+**_² Liang Yao, ² Haibo Zhang, ² Anxiang Zeng, ¹ *Jinsong Su_**
diff --git a/requirements-npu.txt b/requirements-npu.txt
index 7d03869..78d204c 100644
--- a/requirements-npu.txt
+++ b/requirements-npu.txt
@@ -10,7 +10,7 @@ peft
 pyarrow>=15.0.0
 pybind11
 pylatexenc
-tensordict>=0.8.0,<=0.9.1,!=0.9.0
+tensordict>=0.8.0,!=0.9.0,<=0.10.0
 transformers==4.52.4
 ray==2.46.0
 wandb
diff --git a/requirements.txt b/requirements.txt
index 31459e6..0b7d2d2 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -14,7 +14,7 @@ pybind11
 pylatexenc
 pre-commit
 ray[default]
-tensordict>=0.8.0,<=0.9.1,!=0.9.0
+tensordict>=0.8.0,!=0.9.0,<=0.10.0
 torchdata
 transformers
 # vllm==0.8.4
diff --git a/requirements_sglang.txt b/requirements_sglang.txt
index ce9e7d5..9b8749c 100644
--- a/requirements_sglang.txt
+++ b/requirements_sglang.txt
@@ -12,7 +12,7 @@ pyarrow>=19.0.0
 pybind11
 pylatexenc
 ray[default]>=2.10
-tensordict>=0.8.0,<=0.9.1,!=0.9.0
+tensordict>=0.8.0,!=0.9.0,<=0.10.0
 torchdata
 torchvision
 transformers
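
Review note: all three requirements files make the same change to the `tensordict` pin, raising the upper bound from `<=0.9.1` to `<=0.10.0` while still excluding `0.9.0`. The sketch below checks what that means for a few candidate versions. It is a simplified, stdlib-only illustration, not pip's resolver: the helpers `parse` and `satisfies` are hypothetical, they handle only plain dotted releases with the two-character operators that appear in this diff, and the candidate list is illustrative rather than a statement about which releases are published.

```python
from __future__ import annotations

import operator

# Operators that appear in the pins touched by this diff (plus == for completeness).
OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne, "==": operator.eq}


def parse(version: str) -> tuple[int, ...]:
    """Turn a plain dotted release such as '0.9.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Return True if `version` meets every comma-separated clause in `spec`."""
    v = parse(version)
    for clause in spec.split(","):
        clause = clause.strip()
        op, bound = clause[:2], parse(clause[2:])
        # Pad the shorter tuple with zeros so that '0.10' compares equal to '0.10.0'.
        width = max(len(v), len(bound))
        lhs = v + (0,) * (width - len(v))
        rhs = bound + (0,) * (width - len(bound))
        if not OPS[op](lhs, rhs):
            return False
    return True


OLD = ">=0.8.0,<=0.9.1,!=0.9.0"   # pin removed by this diff
NEW = ">=0.8.0,!=0.9.0,<=0.10.0"  # pin added by this diff

for candidate in ["0.8.0", "0.9.0", "0.9.1", "0.10.0", "0.10.1"]:
    print(candidate, satisfies(candidate, OLD), satisfies(candidate, NEW))
```

Under these assumptions, the only candidate whose status changes is `0.10.0`, which the old pin rejects and the new pin accepts. Components are compared as integers, so `0.10.0` sorts above `0.9.1`; a plain string comparison would order them the other way.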