🚦 Traffic Light Control using Reinforcement Learning

An intelligent traffic signal control system using Deep Reinforcement Learning to optimize traffic flow and reduce vehicle waiting times. This project implements multiple RL approaches from scratch, including Q-Learning, Deep Q-Network (DQN), and CNN-based DQN with visual state representation.

🎯 Project Highlights

  • Built from scratch: All algorithms implemented from the ground up without relying on RL frameworks
  • Custom SUMO network: Traffic network designed and created manually using SUMO netedit
  • Multiple RL approaches: Comparison between Fixed Timing, Q-Learning, and Deep Q-Learning
  • Visual state representation: CNN-based DQN using simulation screenshots as state input
  • Experience Replay & Target Network: Advanced DQN techniques for stable training
  • Comprehensive evaluation: Metrics include cumulative reward, average delay, and queue length

πŸ“Š Implemented Algorithms

| Algorithm | State Representation | Description |
| --- | --- | --- |
| Fixed Timing (Baseline) | - | Traditional fixed-cycle traffic control |
| Q-Learning | Discrete (queue lengths + phase) | Tabular RL with epsilon-greedy exploration |
| Deep Q-Network (DQN) | Visual (128x128x4 frames) | CNN-based function approximation |
| DQN with Target Network | Visual (128x128x4 frames) | Improved stability with a target network |

πŸ—οΈ Project Architecture

traffic-rl-control/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ agents/
β”‚   β”‚   β”œβ”€β”€ fixed_timing.py      # Fixed timing baseline
β”‚   β”‚   β”œβ”€β”€ q_learning.py        # Tabular Q-Learning agent
β”‚   β”‚   └── dqn.py               # Deep Q-Network agent (CNN-based)
β”‚   β”œβ”€β”€ environment/             # SUMO environment wrapper
β”‚   └── utils/
β”‚       └── visualization.py     # Plotting utilities
β”œβ”€β”€ config/
β”‚   └── sumo/
β”‚       β”œβ”€β”€ network.net.xml      # Traffic network definition
β”‚       β”œβ”€β”€ routes.rou.xml       # Vehicle routes
β”‚       β”œβ”€β”€ detectors.add.xml    # Lane area detectors (E2)
β”‚       └── simulation.sumocfg   # SUMO configuration
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ run_experiments.py       # Run all experiments
β”‚   └── evaluate_model.py        # Evaluate trained models
β”œβ”€β”€ docs/
β”‚   β”œβ”€β”€ ARCHITECTURE.md          # System architecture
β”‚   └── GETTING_STARTED.md       # Setup guide
β”œβ”€β”€ model/                       # Saved DQN model checkpoints (200 epochs)
β”œβ”€β”€ result/                      # Training logs and metrics
β”œβ”€β”€ saved_plots/                 # Generated visualizations
└── README.md

🧠 Technical Details

Deep Q-Network Architecture

Input: 128x128x4 (4 stacked grayscale frames)
    ↓
Conv2D(32, 8x8, stride=4) + ReLU
    ↓
Conv2D(64, 4x4, stride=2) + ReLU
    ↓
Flatten
    ↓
Dense(512) + ReLU
    ↓
Dense(512) + ReLU
    ↓
Dense(2) - Q-values for [Keep Phase, Switch Phase]
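
A minimal Keras sketch of this network (reconstructed from the diagram above, so the exact code in src/agents/dqn.py may differ; the Adam learning rate follows the hyperparameter table below):

import tensorflow as tf
from tensorflow.keras import layers, models

def build_dqn(input_shape=(128, 128, 4), num_actions=2):
    """CNN that maps 4 stacked grayscale frames to one Q-value per action."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=8, strides=4, activation="relu"),
        layers.Conv2D(64, kernel_size=4, strides=2, activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(num_actions),  # Q-values for [Keep Phase, Switch Phase]
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
                  loss="mse")
    return model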

State Representation

Visual State (DQN):

  • Screenshot of SUMO GUI captured at each step
  • Preprocessed: RGB → Grayscale → Resize to 128x128
  • Frame stacking: 4 consecutive frames for temporal information (sketched below)
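
A minimal sketch of this preprocessing, assuming Pillow for image handling (the function names here are illustrative, not the project's actual API):

import numpy as np
from collections import deque
from PIL import Image

def preprocess(screenshot_path, size=(128, 128)):
    """RGB screenshot -> grayscale, resized, scaled to [0, 1]."""
    img = Image.open(screenshot_path).convert("L").resize(size)
    return np.asarray(img, dtype=np.float32) / 255.0

frames = deque(maxlen=4)  # rolling window of the last 4 frames

def stacked_state(new_frame):
    """Stack the 4 most recent frames into a (128, 128, 4) state tensor."""
    frames.append(new_frame)
    while len(frames) < 4:       # pad with copies at episode start
        frames.append(new_frame)
    return np.stack(frames, axis=-1)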

Discrete State (Q-Learning):

  • Queue lengths from 6 lane area detectors
  • Current traffic light phase index (combined into a Q-table key, as sketched below)
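
One way to build this state over TraCI (the detector and traffic light IDs are placeholders, and the binning scheme is an assumption to keep the Q-table small):

import traci

DETECTOR_IDS = [f"e2_{i}" for i in range(6)]  # hypothetical E2 detector IDs
TLS_ID = "center"                             # hypothetical traffic light ID

def discrete_state(bin_size=5, max_bin=4):
    """Return a hashable Q-table key: 6 binned queue lengths + phase index."""
    queues = tuple(
        min(traci.lanearea.getLastStepHaltingNumber(d) // bin_size, max_bin)
        for d in DETECTOR_IDS
    )
    phase = traci.trafficlight.getPhase(TLS_ID)
    return queues + (phase,)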

Reward Function

reward = previous_avg_delay - current_avg_delay

The agent is rewarded for reducing the average cumulative waiting time of vehicles.
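A sketch of this reward computed over TraCI, taking each vehicle's accumulated waiting time as the delay measure (an assumption; the project may measure delay differently):

import traci

def avg_cumulative_delay():
    """Mean accumulated waiting time over vehicles currently in the network."""
    vehicles = traci.vehicle.getIDList()
    if not vehicles:
        return 0.0
    total = sum(traci.vehicle.getAccumulatedWaitingTime(v) for v in vehicles)
    return total / len(vehicles)

previous_avg_delay = 0.0

def step_reward():
    """Positive reward when the average delay decreased since the last step."""
    global previous_avg_delay
    current_avg_delay = avg_cumulative_delay()
    reward = previous_avg_delay - current_avg_delay
    previous_avg_delay = current_avg_delay
    return reward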

Hyperparameters

| Parameter | Value | Description |
| --- | --- | --- |
| Learning Rate (α) | 0.00001 | Adam optimizer learning rate |
| Discount Factor (γ) | 0.99 | Future reward discount |
| Exploration Rate (ε) | 0.1 → 0.01 | Epsilon decays over training |
| Replay Buffer Size | 2000 | Experience replay memory capacity |
| Batch Size | 32 | Mini-batch size for training |
| Target Update Freq | 10 | Steps between target network updates |
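
A minimal sketch of one training update using these hyperparameters, combining experience replay with a periodically synced target network (a generic DQN update, not necessarily the project's exact code; transitions are assumed to be stored as (state, action, reward, next_state, done) tuples with integer actions):

import random
import numpy as np
from collections import deque

GAMMA, BATCH_SIZE, TARGET_UPDATE_FREQ = 0.99, 32, 10
replay_buffer = deque(maxlen=2000)  # (state, action, reward, next_state, done)

def train_step(model, target_model, step):
    """Sample a mini-batch and regress Q(s, a) toward the bootstrap target."""
    if len(replay_buffer) < BATCH_SIZE:
        return
    batch = random.sample(replay_buffer, BATCH_SIZE)
    states, actions, rewards, next_states, dones = map(np.array, zip(*batch))

    # Target: r + gamma * max_a' Q_target(s', a'); no bootstrap at episode end.
    next_q = target_model.predict(next_states, verbose=0).max(axis=1)
    targets = model.predict(states, verbose=0)
    targets[np.arange(BATCH_SIZE), actions] = rewards + GAMMA * next_q * (1 - dones)
    model.fit(states, targets, verbose=0)

    # Keep the target network a slowly moving copy for stable targets.
    if step % TARGET_UPDATE_FREQ == 0:
        target_model.set_weights(model.get_weights())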

πŸš€ Getting Started

Prerequisites

  1. Python 3.8+
  2. SUMO (Simulation of Urban MObility) - Installation Guide
  3. Required Python packages:
pip install tensorflow numpy matplotlib pandas pillow traci

Environment Setup

# Set SUMO_HOME environment variable
# Windows
set SUMO_HOME=C:\Program Files (x86)\Eclipse\Sumo

# Linux/Mac
export SUMO_HOME=/usr/share/sumo
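
In Python, SUMO's bundled tools directory must be on sys.path before traci can drive the simulation. A standard startup pattern (the config path follows the project layout above):

import os
import sys

# Make SUMO's bundled Python tools (including traci) importable.
if "SUMO_HOME" in os.environ:
    sys.path.append(os.path.join(os.environ["SUMO_HOME"], "tools"))
else:
    sys.exit("Please set the SUMO_HOME environment variable.")

import traci

# Launch SUMO headless with the project's configuration, step, then clean up.
traci.start(["sumo", "-c", "config/sumo/simulation.sumocfg"])
while traci.simulation.getMinExpectedNumber() > 0:
    traci.simulationStep()
traci.close()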

Running the Experiments

# Run the Fixed Timing baseline
python src/agents/fixed_timing.py

# Run the Q-Learning agent
python src/agents/q_learning.py

# Run the Deep Q-Network agent
python src/agents/dqn.py

πŸ“ˆ Results

Performance Comparison

| Method | Avg. Delay Reduction | Queue Length | Training Time |
| --- | --- | --- | --- |
| Fixed Timing | Baseline | ~290 vehicles | - |
| Q-Learning | ~15% | ~250 vehicles | ~30 min |
| DQN | ~25% | ~220 vehicles | ~2 hours |

Training Curves

The training process shows:

  • Cumulative Reward: Increasing trend indicating learning progress
  • Average Delay: Decreasing trend showing traffic optimization
  • Queue Length: Reduction in vehicle accumulation at intersections

πŸ”§ Key Features

1. Custom Traffic Network

  • Designed intersection layout using SUMO netedit
  • Configured realistic traffic flow patterns
  • Added lane area detectors (E2) for queue measurement

2. Flexible Action Space

  • Action 0: Keep the current phase (maintain the green light direction)
  • Action 1: Switch to the next phase (change the traffic direction)
  • Yellow light transition handled automatically (4 seconds) - see the sketch below
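
A sketch of how the switch action might be applied through TraCI, assuming a phase program that alternates green and yellow phases (the traffic light ID, phase count, and durations are placeholders):

import traci

TLS_ID = "center"        # hypothetical traffic light ID
NUM_PHASES = 4           # hypothetical cycle: 2 greens + 2 yellows
YELLOW_DURATION = 4      # seconds of yellow before the next green
GREEN_DURATION = 10      # seconds to hold a phase per decision step

def apply_action(action):
    """Action 0 holds the current green; action 1 goes through yellow first."""
    if action == 1:
        # Advance to the yellow phase that follows the current green.
        current = traci.trafficlight.getPhase(TLS_ID)
        traci.trafficlight.setPhase(TLS_ID, (current + 1) % NUM_PHASES)
        for _ in range(YELLOW_DURATION):
            traci.simulationStep()
        # SUMO advances to the next green once the yellow duration expires.
    for _ in range(GREEN_DURATION):
        traci.simulationStep()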

3. Training Modes

  • Online Training: Learn while interacting with environment
  • Experience Replay: Sample from replay buffer for stable learning
  • Evaluation Mode: Test trained model without exploration

πŸ“ File Descriptions

Source Code (Refactored)

| File | Description |
| --- | --- |
| src/agents/fixed_timing.py | Fixed Timing baseline implementation |
| src/agents/q_learning.py | Q-Learning with tabular state representation |
| src/agents/dqn.py | Deep Q-Network with CNN architecture |
| src/utils/visualization.py | Visualization and result plotting |

Original Implementation Files

| File | Description |
| --- | --- |
| traci5.FT.py | Fixed Timing baseline (original) |
| traci6.QL.py | Q-Learning implementation (original) |
| traci7.DQL.py | Deep Q-Network implementation (original) |
| Phuoc_ne.py | DQN with Target Network |
| policy_modify_action.py | Policy-based action modification |
| baseline.py | Baseline comparison experiments |

πŸŽ“ Skills Demonstrated

  • Reinforcement Learning: Q-Learning, Deep Q-Learning, Experience Replay, Target Networks
  • Deep Learning: CNN architecture design, TensorFlow/Keras implementation
  • Traffic Simulation: SUMO configuration, TraCI API integration
  • Software Engineering: Modular code design, experiment tracking, visualization
  • Research: Algorithm comparison, hyperparameter tuning, performance analysis

πŸ“š References

  1. Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature.
  2. SUMO Documentation: https://sumo.dlr.de/docs/
  3. TraCI Python API: https://sumo.dlr.de/docs/TraCI/Interfacing_TraCI_from_Python.html

πŸ‘€ Author

[Your Name] - AI/ML Engineer

πŸŽ“ University of Information Technology (UIT) - Vietnam National University HCMC

πŸ“š Course: Artificial Intelligence (CS106)

πŸ“§ Email: [your.email@example.com]

πŸ”— LinkedIn: linkedin.com/in/yourprofile

πŸ’» GitHub: github.com/yourusername

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


⭐ If you find this project helpful, please give it a star!
