An intelligent traffic signal control system using Deep Reinforcement Learning to optimize traffic flow and reduce vehicle waiting times. This project implements multiple RL approaches from scratch, including Q-Learning, Deep Q-Network (DQN), and CNN-based DQN with visual state representation.
- Built from scratch: All algorithms implemented from the ground up without relying on RL frameworks
- Custom SUMO network: Traffic network designed and created manually using SUMO netedit
- Multiple RL approaches: Comparison between Fixed Timing, Q-Learning, and Deep Q-Learning
- Visual state representation: CNN-based DQN using simulation screenshots as state input
- Experience Replay & Target Network: Advanced DQN techniques for stable training
- Comprehensive evaluation: Metrics include cumulative reward, average delay, and queue length
| Algorithm | State Representation | Description |
|---|---|---|
| Fixed Timing (Baseline) | - | Traditional fixed-cycle traffic control |
| Q-Learning | Discrete (queue lengths + phase) | Tabular RL with epsilon-greedy exploration |
| Deep Q-Network (DQN) | Visual (128x128x4 frames) | CNN-based function approximation |
| DQN with Target Network | Visual (128x128x4 frames) | Improved stability with target network |
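For reference, a minimal sketch of the tabular Q-Learning update behind the second approach (the learning rate and dict-backed Q-table here are illustrative, not the project's exact settings):

```python
# Tabular Q-Learning update over discrete (queue lengths + phase) states.
ALPHA, GAMMA = 0.1, 0.99   # illustrative learning rate and discount factor
Q = {}                     # Q-table keyed by (state, action)

def q_update(state, action, reward, next_state, num_actions=2):
    """One Q-Learning step: move Q(s, a) toward the bootstrapped target."""
    q_sa = Q.get((state, action), 0.0)
    best_next = max(Q.get((next_state, a), 0.0) for a in range(num_actions))
    Q[(state, action)] = q_sa + ALPHA * (reward + GAMMA * best_next - q_sa)
```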
traffic-rl-control/
├── src/
│   ├── agents/
│   │   ├── fixed_timing.py      # Fixed timing baseline
│   │   ├── q_learning.py        # Tabular Q-Learning agent
│   │   └── dqn.py               # Deep Q-Network agent (CNN-based)
│   ├── environment/             # SUMO environment wrapper
│   └── utils/
│       └── visualization.py     # Plotting utilities
├── config/
│   └── sumo/
│       ├── network.net.xml      # Traffic network definition
│       ├── routes.rou.xml       # Vehicle routes
│       ├── detectors.add.xml    # Lane area detectors (E2)
│       └── simulation.sumocfg   # SUMO configuration
├── scripts/
│   ├── run_experiments.py       # Run all experiments
│   └── evaluate_model.py        # Evaluate trained models
├── docs/
│   ├── ARCHITECTURE.md          # System architecture
│   └── GETTING_STARTED.md       # Setup guide
├── model/                       # Saved DQN model checkpoints (200 epochs)
├── result/                      # Training logs and metrics
├── saved_plots/                 # Generated visualizations
└── README.md
Input: 128x128x4 (4 stacked grayscale frames)
        ↓
Conv2D(32, 8x8, stride=4) + ReLU
        ↓
Conv2D(64, 4x4, stride=2) + ReLU
        ↓
Flatten
        ↓
Dense(512) + ReLU
        ↓
Dense(512) + ReLU
        ↓
Dense(2) - Q-values for [Keep Phase, Switch Phase]
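A minimal Keras sketch of this network (layer shapes follow the diagram above; the optimizer and learning rate are taken from the hyperparameter table below):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_q_network(input_shape=(128, 128, 4), num_actions=2):
    """Q-network matching the architecture diagram above."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=8, strides=4, activation="relu"),
        layers.Conv2D(64, kernel_size=4, strides=2, activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(num_actions, activation="linear"),  # Q-values for [Keep, Switch]
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5), loss="mse")
    return model
```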
Visual State (DQN):
- Screenshot of SUMO GUI captured at each step
- Preprocessed: RGB → Grayscale → Resize to 128x128
- Frame stacking: 4 consecutive frames for temporal information
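A minimal sketch of this preprocessing pipeline (function names, normalization, and the episode-start padding are illustrative assumptions):

```python
from collections import deque

import numpy as np
from PIL import Image

FRAME_STACK = 4
frames = deque(maxlen=FRAME_STACK)

def preprocess(screenshot_path):
    """RGB screenshot -> grayscale 128x128 array in [0, 1]."""
    img = Image.open(screenshot_path).convert("L").resize((128, 128))
    return np.asarray(img, dtype=np.float32) / 255.0

def stacked_state(screenshot_path):
    """Append the newest frame and return a 128x128x4 state tensor."""
    frame = preprocess(screenshot_path)
    while len(frames) < FRAME_STACK:
        frames.append(frame)          # pad with the first frame at episode start
    frames.append(frame)
    return np.stack(frames, axis=-1)  # shape (128, 128, 4)
```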
Discrete State (Q-Learning):
- Queue lengths from 6 lane area detectors
- Current traffic light phase index
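A sketch of how this discrete state could be read through TraCI (detector IDs, traffic light ID, and binning thresholds are hypothetical placeholders):

```python
import traci

DETECTOR_IDS = [f"e2_{i}" for i in range(6)]   # hypothetical E2 detector IDs
TLS_ID = "TL0"                                  # hypothetical traffic light ID

def discrete_state(bins=(0, 3, 6, 10)):
    """Discretized queue length per detector plus the current phase index."""
    queues = [traci.lanearea.getLastStepVehicleNumber(d) for d in DETECTOR_IDS]
    discretized = tuple(sum(q > b for b in bins) for q in queues)
    phase = traci.trafficlight.getPhase(TLS_ID)
    return discretized + (phase,)
```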
reward = previous_avg_delay - current_avg_delay

The agent is rewarded for reducing the average cumulative waiting time of vehicles.
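A sketch of how this reward could be computed via TraCI, assuming accumulated waiting time is used as the delay measure:

```python
import traci

prev_avg_delay = 0.0

def compute_reward():
    """Reward = drop in average accumulated waiting time since the last step."""
    global prev_avg_delay
    vehicle_ids = traci.vehicle.getIDList()
    if vehicle_ids:
        avg_delay = sum(traci.vehicle.getAccumulatedWaitingTime(v)
                        for v in vehicle_ids) / len(vehicle_ids)
    else:
        avg_delay = 0.0
    reward = prev_avg_delay - avg_delay
    prev_avg_delay = avg_delay
    return reward
```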
| Parameter | Value | Description |
|---|---|---|
| Learning Rate (α) | 0.00001 | Adam optimizer learning rate |
| Discount Factor (γ) | 0.99 | Future reward discount |
| Exploration Rate (ε) | 0.1 → 0.01 | Epsilon decay for exploration |
| Replay Buffer Size | 2000 | Experience replay memory |
| Batch Size | 32 | Mini-batch for training |
| Target Update Freq | 10 | Steps between target network updates |
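A sketch of epsilon-greedy action selection with decay from 0.1 to 0.01 (the multiplicative decay factor is an assumption; the table only gives the start and end values):

```python
import random

import numpy as np

EPSILON_START, EPSILON_MIN, EPSILON_DECAY = 0.1, 0.01, 0.995
epsilon = EPSILON_START

def select_action(model, state):
    """Epsilon-greedy choice between the two signal actions."""
    global epsilon
    if random.random() < epsilon:
        action = random.randrange(2)                       # explore
    else:
        q_values = model.predict(state[np.newaxis], verbose=0)
        action = int(np.argmax(q_values[0]))               # exploit
    epsilon = max(EPSILON_MIN, epsilon * EPSILON_DECAY)    # decay toward 0.01
    return action
```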
- Python 3.8+
- SUMO (Simulation of Urban MObility) - Installation Guide
- Required Python packages:
pip install tensorflow numpy matplotlib pandas pillow traci

# Set SUMO_HOME environment variable
# Windows
set SUMO_HOME=C:\Program Files (x86)\Eclipse\Sumo
# Linux/Mac
export SUMO_HOME=/usr/share/sumo

# Run Fixed Timing Baseline
python src/agents/fixed_timing.py
# Run Q-Learning Agent
python src/agents/q_learning.py
# Run Deep Q-Network Agent
python src/agents/dqn.py

| Method | Avg. Delay Reduction | Queue Length | Training Time |
|---|---|---|---|
| Fixed Timing | Baseline | ~290 vehicles | - |
| Q-Learning | ~15% | ~250 vehicles | ~30 min |
| DQN | ~25% | ~220 vehicles | ~2 hours |
The training process shows:
- Cumulative Reward: Increasing trend indicating learning progress
- Average Delay: Decreasing trend showing traffic optimization
- Queue Length: Reduction in vehicle accumulation at intersections
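A minimal matplotlib sketch of how these three training curves could be plotted (the function name and output path are illustrative; the project's plotting utilities live in src/utils/visualization.py):

```python
import matplotlib.pyplot as plt

def plot_training_curves(rewards, delays, queue_lengths,
                         out_path="saved_plots/training.png"):
    """Plot the three training metrics described above on one figure."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))
    for ax, data, title in zip(axes,
                               (rewards, delays, queue_lengths),
                               ("Cumulative Reward", "Average Delay", "Queue Length")):
        ax.plot(data)
        ax.set_title(title)
        ax.set_xlabel("Episode")
    fig.tight_layout()
    fig.savefig(out_path)
```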
- Designed intersection layout using SUMO netedit
- Configured realistic traffic flow patterns
- Added lane area detectors (E2) for queue measurement
- Action 0: Keep current phase (maintain green light direction)
- Action 1: Switch to next phase (change traffic direction)
- Yellow light transition handled automatically (4 seconds)
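A sketch of how these actions might be applied through TraCI (the traffic light ID and phase indices are hypothetical placeholders):

```python
import traci

TLS_ID = "TL0"        # hypothetical traffic light ID
YELLOW_DURATION = 4   # seconds, as described above

def apply_action(action, green_phase, yellow_phase, next_green_phase):
    """Keep the current phase or switch to the next one via a 4 s yellow transition."""
    if action == 0:                        # Action 0: keep current phase
        traci.trafficlight.setPhase(TLS_ID, green_phase)
    else:                                  # Action 1: switch to next phase
        traci.trafficlight.setPhase(TLS_ID, yellow_phase)
        for _ in range(YELLOW_DURATION):
            traci.simulationStep()         # let the yellow phase play out
        traci.trafficlight.setPhase(TLS_ID, next_green_phase)
```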
- Online Training: Learn while interacting with environment
- Experience Replay: Sample from replay buffer for stable learning
- Evaluation Mode: Test trained model without exploration
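A minimal sketch of an experience-replay training step with a target network, using the hyperparameters from the table above (the plain-list buffer and function signature are simplifications):

```python
import random

import numpy as np

REPLAY_SIZE, BATCH_SIZE, GAMMA, TARGET_UPDATE = 2000, 32, 0.99, 10
replay_buffer = []   # list of (state, action, reward, next_state, done) tuples

def train_step(model, target_model, step):
    """Sample a mini-batch from the replay buffer and fit one DQN update."""
    if len(replay_buffer) < BATCH_SIZE:
        return
    batch = random.sample(replay_buffer, BATCH_SIZE)
    states = np.array([b[0] for b in batch])
    next_states = np.array([b[3] for b in batch])
    q_target = model.predict(states, verbose=0)
    q_next = target_model.predict(next_states, verbose=0)
    for i, (_, action, reward, _, done) in enumerate(batch):
        q_target[i, action] = reward if done else reward + GAMMA * np.max(q_next[i])
    model.fit(states, q_target, verbose=0)
    if step % TARGET_UPDATE == 0:
        target_model.set_weights(model.get_weights())  # sync target network
```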
| File | Description |
|---|---|
| `src/agents/fixed_timing.py` | Fixed Timing baseline implementation |
| `src/agents/q_learning.py` | Q-Learning with tabular state representation |
| `src/agents/dqn.py` | Deep Q-Network with CNN architecture |
| `src/utils/visualization.py` | Visualization and result plotting |
| File | Description |
|---|---|
| `traci5.FT.py` | Fixed Timing baseline (original) |
| `traci6.QL.py` | Q-Learning implementation (original) |
| `traci7.DQL.py` | Deep Q-Network implementation (original) |
| `Phuoc_ne.py` | DQN with Target Network |
| `policy_modify_action.py` | Policy-based action modification |
| `baseline.py` | Baseline comparison experiments |
- Reinforcement Learning: Q-Learning, Deep Q-Learning, Experience Replay, Target Networks
- Deep Learning: CNN architecture design, TensorFlow/Keras implementation
- Traffic Simulation: SUMO configuration, TraCI API integration
- Software Engineering: Modular code design, experiment tracking, visualization
- Research: Algorithm comparison, hyperparameter tuning, performance analysis
- Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature.
- SUMO Documentation: https://sumo.dlr.de/docs/
- TraCI Python API: https://sumo.dlr.de/docs/TraCI/Interfacing_TraCI_from_Python.html
[Your Name] - AI/ML Engineer
University of Information Technology (UIT) - Vietnam National University HCMC
Course: Artificial Intelligence (CS106)
Email: [your.email@example.com]
LinkedIn: linkedin.com/in/yourprofile
GitHub: github.com/yourusername
This project is licensed under the MIT License - see the LICENSE file for details.
⭐ If you find this project helpful, please give it a star!