Action2vec guided Policy Optimization

About The Project

This repository contains the code for my final project for the lecture Advanced Topics in Deep Reinforcement Learning.

Visualization of the Approach

Built With

CleanRL

Getting Started

Installation

Download the project:

 git clone git@github.com:Peer222/adrl_project.git
 cd adrl_project

Conda

Create a conda environment:

conda create -n adrl_project python=3.10
conda activate adrl_project
pip install -r requirements.txt

Data

This project uses the Minari expert dataset for the AdroitHandDoor-v1 environment from gymnasium-robotics.

Minari downloads the dataset automatically when action_model/train.py is executed. If no internet connection is available at run time, pre-download the dataset with the following command:

minari download door-expert-v2

Usage

First, log in to your wandb account by running:

wandb login

All scripts can also be run without wandb tracking by adding the option --no-track.

You can run experiments using the action sequence model:

python action_model/train.py --wandb_entity `your_wandb_entity` --wandb_project_name `your_wandb_project`

You can also run the SAC extensions and the baseline, for example:

python algorithms/sac_ca_extended_obs.py --action_model_dir `path/to/saved/action_models` --wandb_entity `your_wandb_entity` --wandb_project_name `your_wandb_project`

For more information, run the scripts with the --help flag.

If you want to reproduce the results, you can execute the following bash scripts (with slurm):

sbatch scripts/action_model_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_baseline_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_extended_obs_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_latent_actions_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_latent_actions_extended_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
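
To check the submissions before sending them to the cluster, the five commands above can be generated from a single list. This is a minimal sketch: the helper name build_sbatch_commands is hypothetical and not part of the repository, the script paths are copied from this README, and the example only prints the commands instead of calling sbatch.

```python
import shlex

# Script paths as listed in this README.
EXPERIMENT_SCRIPTS = [
    "scripts/action_model_experiment.sh",
    "scripts/sac_baseline_experiment.sh",
    "scripts/sac_extended_obs_experiment.sh",
    "scripts/sac_latent_actions_experiment.sh",
    "scripts/sac_latent_actions_extended_experiment.sh",
]


def build_sbatch_commands(entity, project, mode="online"):
    """Return one sbatch argument list per experiment script (hypothetical helper)."""
    if mode not in ("offline", "online"):
        raise ValueError(f"mode must be 'offline' or 'online', got {mode!r}")
    return [["sbatch", script, entity, project, mode] for script in EXPERIMENT_SCRIPTS]


if __name__ == "__main__":
    # Dry run: print the commands so they can be inspected or pasted into a shell.
    for cmd in build_sbatch_commands("your_wandb_entity", "your_wandb_project", "offline"):
        print(shlex.join(cmd))
```

Each returned list could be passed to subprocess.run on a machine where sbatch is available.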

If wandb mode is set to offline, you have to manually upload the run statistics to the wandb server:

wandb sync `offline-run-identifier`

To download the run data again and store it as CSV files, run:

./scripts/get_wandb_action_run_data.sh `your_wandb_entity` `your_wandb_project`
./scripts/get_wandb_run_data.sh `your_wandb_entity` `your_wandb_project`

You can create the plots by running:

./scripts/plot_action_model.sh
./scripts/plot_algorithm_multi.sh
./scripts/plot_algorithm_single.sh
