RL Sensitivity Analysis

Overview

This repository contains the implementation, analysis, and report for our Reinforcement Learning (RL) Sensitivity Analysis project. The project focuses on solving the MountainCarContinuous-v0 environment from Gymnasium and analysing how sensitive training is to different hyperparameters. We implemented several RL techniques and enhancements to address the challenges of continuous control.

Assignment Details

Part 1: Environment Analysis
A detailed description of the MountainCarContinuous-v0 environment, including its continuous action space, observation structure (position and velocity), reward signals, and dynamics.
Part 2: DQN Implementation
Implementation of the Deep Q-Network (DQN) algorithm using PyTorch. Custom environment wrappers were created to adapt the continuous action space for DQN by discretizing actions and normalizing observations.
Part 3: Sensitivity Analysis
Investigation of key hyperparameters (learning rate, exploration decay, batch size, discretization bins, network architecture, etc.) and their effect on training stability and performance.
Part 4: Advanced DQN Improvements
Enhancement of the basic DQN with techniques such as Prioritized Experience Replay (PER), Double DQN, Dueling Architecture, and Noisy Networks to improve learning stability and efficiency.
Part 5: Neural Network Structure Exploration
Experimentation with different neural network architectures (LSTM, GRU, and Transformer) integrated within the DQN framework to assess their impact on learning in a sequential decision-making environment.
Part 6: PPO and A2C Comparison
Implementation and evaluation of policy gradient methods like Proximal Policy Optimization (PPO) and Synchronous Advantage Actor-Critic (A2C) for a comparative study against the value-based methods.
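
As an illustration of the adaptation described in Part 2, the sketch below shows one way a continuous action range can be mapped to a fixed set of discrete actions and observations can be rescaled. It is not the repository's wrapper code: the function names, the evenly spaced bins, and the choice of [-1, 1] as the normalized range are assumptions made for this example. The observation bounds are the ones documented for MountainCarContinuous-v0.

```python
# Minimal sketch (hypothetical helpers, not the project's actual wrappers).

# Observation bounds documented for MountainCarContinuous-v0.
POSITION_RANGE = (-1.2, 0.6)
VELOCITY_RANGE = (-0.07, 0.07)


def make_action_bins(n_bins, low=-1.0, high=1.0):
    """Return n_bins evenly spaced force values covering [low, high]."""
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    step = (high - low) / (n_bins - 1)
    return [low + i * step for i in range(n_bins)]


def normalize(value, bounds):
    """Linearly rescale value from its bounds to [-1, 1] (assumed target range)."""
    lo, hi = bounds
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def normalize_observation(position, velocity):
    """Rescale a (position, velocity) observation component by component."""
    return (
        normalize(position, POSITION_RANGE),
        normalize(velocity, VELOCITY_RANGE),
    )


# A DQN with 5 outputs would pick an index; the index selects a force value.
bins = make_action_bins(5)
force = bins[3]
```

The number of bins is one of the hyperparameters listed in Part 3: more bins give finer control at the cost of a larger output layer and a harder exploration problem.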

Environment: MountainCarContinuous-v0

Action Space:
Continuous scalar value in the range [-1.0, 1.0] representing the force applied to the car.
Observation Space:
A 2-dimensional vector representing:
- Position: Car's horizontal location.
- Velocity: Rate of change of position.
Reward Signal:
A penalty of 0.1 * action^2 on every step, plus a bonus of +100 when the car reaches the goal.
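
To make the reward signal concrete, here is a small sketch of the per-step reward as documented for MountainCarContinuous-v0: an action cost of 0.1 times the squared action, and +100 on the step where the goal is reached. The function name and its boolean argument are hypothetical and exist only for this example.

```python
def step_reward(action, reached_goal):
    """Per-step reward: +100 on reaching the goal, minus 0.1 * action**2."""
    reward = 100.0 if reached_goal else 0.0
    reward -= 0.1 * action ** 2
    return reward


# Full throttle without reaching the goal only costs reward.
penalty_only = step_reward(1.0, reached_goal=False)
# Reaching the goal with a gentle push yields almost the full bonus.
goal_step = step_reward(0.2, reached_goal=True)
```

Because every non-goal step has a reward at or below zero, an agent that never reaches the flag can learn to apply no force at all, which is why exploration settings matter so much in this environment.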

How to Run

Installation:
Ensure you have Python 3.7+ installed.
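
Before opening the notebooks, a quick check like the one below can confirm that the interpreter and the main libraries are available. This is only a sketch: the package list is an assumption inferred from the tools named in this README (Gymnasium and PyTorch), and the helper names are hypothetical. Adjust the list to whatever the notebooks actually import.

```python
import importlib.util
import sys

# Assumed top-level package names, inferred from this README.
ASSUMED_PACKAGES = ("gymnasium", "torch")


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


def missing_packages(names=ASSUMED_PACKAGES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if not python_ok():
    print("Python 3.7 or newer is required.")
for name in missing_packages():
    print(f"Package not found: {name}")
```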
Running the Notebooks:
Since the project uses Jupyter Notebooks for running experiments and analysis, follow these steps:

1. Open a terminal or command prompt.
2. Navigate to the project directory.
3. Launch Jupyter Notebook or Jupyter Lab.
4. Open and run the following notebooks:
   - Assignment RL 2024-2025 (DQN Analysis).ipynb
   - PPO_A2C_Analysis.ipynb
5. Follow the instructions within the notebooks to execute the cells and view the analysis.

Results:
Training logs, plots, and performance metrics are saved to the results/ folder and are also displayed inline in the Jupyter notebooks. The detailed analysis can be found in the report.

Acknowledgements

Authors:

Kostoulas Evangelos

Kotsopoulos Dimitris

Triantafyllou Thanasis

License

This project is licensed under the MIT License.
