
Peer222/adrl_project

 
 


Action2vec guided Policy Optimization

About The Project

This repository contains the code for my final project for the lecture Advanced Topics in Deep Reinforcement Learning.

Visualization of the Approach

[Figure: model structure]

Built With

gymnasium-robotics
minari

Getting Started

Installation

Download the project:

git clone git@github.com:Peer222/adrl_project.git
cd adrl_project

Conda

Create a conda environment:

conda create -n adrl_project python=3.10
conda activate adrl_project
pip install -r requirements.txt

Data

The project uses the Minari expert dataset for the AdroitHandDoor-v1 environment from gymnasium-robotics.

The dataset is downloaded automatically by minari when action_model/train.py is executed. If no internet connection is available during execution, the dataset can also be pre-downloaded with the following command:

minari download door-expert-v2
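
The same pre-download can also be done from Python. The following is only a minimal sketch. It assumes that the minari version pinned in requirements.txt provides list_local_datasets, download_dataset and load_dataset, and that it still accepts the dataset ID door-expert-v2 (newer Minari releases may expect a differently formatted ID). The helper name ensure_dataset is hypothetical and not part of this repository.

```python
def ensure_dataset(dataset_id: str = "door-expert-v2"):
    """Return the Minari dataset, downloading it first if it is not cached locally.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Imported lazily so this snippet can be imported without minari installed.
    import minari

    # list_local_datasets() returns a dict keyed by dataset ID.
    if dataset_id not in minari.list_local_datasets():
        minari.download_dataset(dataset_id)
    return minari.load_dataset(dataset_id)


# Example usage (requires minari and, on first call, an internet connection):
#   dataset = ensure_dataset()
#   print(dataset.total_episodes)
```

Calling the helper once while online should leave the dataset in the local Minari cache, so that later runs of action_model/train.py do not need a connection.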

Usage

First, log in to your wandb account by running:

wandb login

Alternatively, you can run any script without wandb tracking by adding the --no-track option.

You can run experiments using the action sequence model:

python action_model/train.py --wandb_entity `your_wandb_entity` --wandb_project_name `your_wandb_project`

You can also run the SAC baseline and its extensions, e.g.:

python algorithms/sac_ca_extended_obs.py --action_model_dir `path/to/saved/action_models` --wandb_entity `your_wandb_entity` --wandb_project_name `your_wandb_project`

For more information, run any script with the --help flag.
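
To illustrate how the tracking options above interact, here is a minimal sketch of a command-line parser with the same flag names. It is an assumption for illustration only: the repository's scripts may parse their arguments differently, and build_parser and wandb_settings are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this README."""
    parser = argparse.ArgumentParser(description="Illustrative training CLI")
    # BooleanOptionalAction (Python 3.9+) registers both --track and --no-track.
    parser.add_argument(
        "--track",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="log the run to wandb",
    )
    parser.add_argument("--wandb_entity", type=str, default=None)
    parser.add_argument("--wandb_project_name", type=str, default=None)
    return parser


def wandb_settings(args: argparse.Namespace):
    """Return keyword arguments for wandb.init, or None if tracking is off."""
    if not args.track:
        return None
    return {"entity": args.wandb_entity, "project": args.wandb_project_name}
```

With this layout, passing --no-track yields None, so a script can skip wandb initialization entirely. Otherwise the returned dictionary could be passed on as wandb.init(**settings).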

To reproduce the results, submit the following bash scripts with Slurm. The third argument sets the wandb mode (offline or online):

sbatch scripts/action_model_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_baseline_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_extended_obs_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_latent_actions_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
sbatch scripts/sac_latent_actions_extended_experiment.sh `your_wandb_entity` `your_wandb_project` `offline or online`
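
If you prefer to generate the submission commands rather than typing them, the following sketch builds the five sbatch lines listed above. The script paths are taken from this README. The function name sbatch_commands and the mode validation are assumptions added for illustration.

```python
import shlex

# Experiment scripts as listed in this README.
EXPERIMENT_SCRIPTS = [
    "scripts/action_model_experiment.sh",
    "scripts/sac_baseline_experiment.sh",
    "scripts/sac_extended_obs_experiment.sh",
    "scripts/sac_latent_actions_experiment.sh",
    "scripts/sac_latent_actions_extended_experiment.sh",
]


def sbatch_commands(entity: str, project: str, mode: str = "online") -> list[str]:
    """Build one shell-quoted sbatch command line per experiment script."""
    if mode not in ("offline", "online"):
        raise ValueError(f"mode must be 'offline' or 'online', got {mode!r}")
    return [
        shlex.join(["sbatch", script, entity, project, mode])
        for script in EXPERIMENT_SCRIPTS
    ]


# Example usage: print the commands for review before submitting them.
#   for command in sbatch_commands("my_entity", "my_project", "offline"):
#       print(command)
```

Printing the commands first, instead of executing them directly, makes it easy to check the entity, project, and mode before anything is queued.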

If wandb mode is set to offline, you have to manually upload the run statistics to the wandb server:

wandb sync `offline-run-identifier`
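
When many offline runs have accumulated, the sync commands can be generated instead of typed one by one. The sketch below assumes that wandb's default layout is in use, that is, a wandb/ directory in the working directory containing one offline-run-* folder per offline run. The function name offline_sync_commands is hypothetical.

```python
import shlex
from pathlib import Path


def offline_sync_commands(wandb_dir: str = "wandb") -> list[str]:
    """Return one 'wandb sync <run_dir>' command per offline run directory."""
    root = Path(wandb_dir)
    if not root.is_dir():
        return []
    # Assumption: offline runs live in folders named offline-run-*.
    run_dirs = sorted(p for p in root.glob("offline-run-*") if p.is_dir())
    return [shlex.join(["wandb", "sync", run.as_posix()]) for run in run_dirs]


# Example usage: print the commands, then run them from a shell.
#   for command in offline_sync_commands():
#       print(command)
```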

To download the run data from wandb again and store it as CSV files, run:

./scripts/get_wandb_action_run_data.sh `your_wandb_entity` `your_wandb_project`
./scripts/get_wandb_run_data.sh `your_wandb_entity` `your_wandb_project`

You can create the plots by running:

./scripts/plot_action_model.sh
./scripts/plot_algorithm_multi.sh
./scripts/plot_algorithm_single.sh
