Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples

Official code for "Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples". Also check out our [Project Page].

Training & Inference

FoR formulates multi-step reasoning tasks as a flow and trains in three steps:

Design a reward $R(s_n)$ over terminal states for each task.
Collect trajectories with the local search technique.
Train the LLM policy $P_{F}$ with the trajectory balance loss.
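
To make the last step concrete, here is a minimal sketch of the trajectory balance objective as it is usually stated in the GFlowNet literature: the squared difference between $\log Z + \sum_t \log P_F(s_{t+1} \mid s_t)$ and $\log R(s_n) + \sum_t \log P_B(s_t \mid s_{t+1})$. This is an illustration, not the repository's implementation. The function name `trajectory_balance_loss`, its arguments, and the default of treating the backward policy $P_B$ as 1 are assumptions made for this example; the actual code operates on LLM token log-probabilities and may differ in detail.

```python
import math
from typing import Optional, Sequence


def trajectory_balance_loss(
    log_z: float,
    log_pf_steps: Sequence[float],
    log_reward: float,
    log_pb_steps: Optional[Sequence[float]] = None,
) -> float:
    """Squared trajectory balance residual for one trajectory s_0 -> ... -> s_n.

    Hypothetical helper for illustration only; not part of the FoR codebase.
    """
    # Assumption: if no backward policy is given, treat P_B as 1 at every step
    # (each state has a single parent), so its log-probabilities are all 0.
    if log_pb_steps is None:
        log_pb_steps = [0.0] * len(log_pf_steps)
    residual = log_z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)
    return residual ** 2


# Toy trajectory: two steps, each taken with probability 0.5, and log Z = 0.
steps = [math.log(0.5), math.log(0.5)]

# The reward matches the flow reaching the terminal state, so the loss is ~0.
balanced = trajectory_balance_loss(0.0, steps, math.log(0.25))

# The reward is larger than the flow reaching it, so the loss is positive.
unbalanced = trajectory_balance_loss(0.0, steps, math.log(1.0))
```

When the loss is zero for every trajectory, the policy samples terminal states with probability proportional to their reward, which is the property that lets a flow-based objective yield diverse solutions rather than a single high-reward one.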

Code

1) Clone this repository

git clone https://github.com/Yu-Fangxu/FoR.git

2) Prepare the environment

We recommend conda for setting up a reproducible experiment environment. The repository includes environment.yaml for creating a working environment, along with an install script:

bash install.sh

3) Choose one of the six tasks to run

cd BlocksWorld   # or Game24, prontoqa, 1D-ARC, "Rubik's_Cube", GSM8K

See each branch for more detailed instructions.

Citation

@inproceedings{yuflow,
  title={Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples},
  author={Yu, Fangxu and Jiang, Lai and Kang, Haoqiang and Hao, Shibo and Qin, Lianhui},
  booktitle={Forty-second International Conference on Machine Learning}
}
