diff --git a/README.md b/README.md
index b9224b0..0d4cb54 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
- Logo
+ Logo
 # SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts
@@ -15,9 +15,9 @@
-**_¹ [Bingshuai Liu](https://bingshuailiu.github.io), [Ante Wang](), ¹ [Zijun Min](),_**
+**_¹ [Bingshuai Liu](https://bingshuailiu.github.io), ¹ Ante Wang, ¹ Zijun Min,_**
-**_² [Liang Yao](), ² [Haibo Zhang](), ² [Anxiang Zeng](), ¹ *[Jinsong Su]()_**
+**_² Liang Yao, ² Haibo Zhang, ² Anxiang Zeng, ¹ *Jinsong Su_**
diff --git a/requirements-npu.txt b/requirements-npu.txt
index 7d03869..f08c738 100644
--- a/requirements-npu.txt
+++ b/requirements-npu.txt
@@ -11,7 +11,7 @@ pyarrow>=15.0.0
 pybind11
 pylatexenc
 tensordict>=0.8.0,<=0.9.1,!=0.9.0
-transformers==4.52.4
+transformers==4.56.1
 ray==2.46.0
 wandb
 mathruler