Unity3D simulation environment for training autonomous driving agents using Proximal Policy Optimization (PPO). This project communicates with a Python backend via ZeroMQ to enable reinforcement learning training.
This Unity project provides the simulation environment where an autonomous driving agent learns to navigate. It sends sensor data to a Python training backend and receives steering commands in return.
Key capabilities:
- Physics-based 3D driving simulation
- 5-ray sensor system for obstacle detection
- Real-time ZeroMQ communication with training backend
- Reward collection and collision detection
- Episode management and automatic resets
Requirements:

- Unity 2022.3 LTS or higher
- Python training backend: PPO_RL_AutoDRV_Compute_Backend
1. Clone this repository:

   ```bash
   git clone https://github.com/Spectrewolf8/PPO_AutoDRW_Unity3d_GameWorld.git
   ```

2. Open the project in Unity Hub.

3. Install the Python backend (required):

   ```bash
   git clone https://github.com/Spectrewolf8/PPO_RL_AutoDRV_Compute_Backend.git
   cd PPO_RL_AutoDRV_Compute_Backend
   pip install -r requirements.txt
   ```

4. Start the Python backend server:

   ```bash
   cd PPO_RL_AutoDRV_Compute_Backend
   python app.py
   ```

5. Open the Unity project and press Play.

The simulation will connect to the backend at `127.0.0.1:65432`.
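Before pressing Play, you can check that the backend is actually listening with a quick TCP probe (a convenience sketch, not part of either repository):

```python
import socket

def backend_reachable(host="127.0.0.1", port=65432, timeout=1.0):
    """Return True if a TCP connection to the backend address succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns `False`, start `python app.py` in the backend repository first.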
Unity sends sensor data to the Python backend via ZeroMQ and receives steering commands:
```
Unity (Client)  <--ZeroMQ-->  Python Backend (Server)
      |                             |
      |-- Sends: Ray distances      |-- PPO Model
      |-- Sends: Speed              |-- Training/Inference
      |-- Sends: Collisions         |-- Checkpoint System
      |                             |
      |-- Receives: Steering (-1/0/1)
```
Communication: REQ/REP pattern over `tcp://127.0.0.1:65432`
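In the REQ/REP pattern, Unity (REQ) must send before it can receive, and the backend (REP) must receive before it can send, so every observation is answered by exactly one action. The server side of one such exchange can be sketched as follows (assumes the `pyzmq` package; the real server lives in PPO_RL_AutoDRV_Compute_Backend):

```python
import json
import zmq  # pyzmq, the Python ZeroMQ binding

def handle_one_step(rep_socket):
    """One REQ/REP exchange: receive an observation from Unity, reply with an action."""
    obs = json.loads(rep_socket.recv().decode("utf-8"))  # e.g. {"rays": [...], "speed": ...}
    action = {"type": "action", "steering": 0}           # placeholder policy: drive straight
    rep_socket.send(json.dumps(action).encode("utf-8"))
    return obs

def make_server(address="tcp://127.0.0.1:65432"):
    """Bind a REP socket at the address Unity connects to."""
    ctx = zmq.Context.instance()
    rep = ctx.socket(zmq.REP)
    rep.bind(address)
    return rep
```

The strict send/receive lockstep is what lets the backend treat each Unity frame as one environment step.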
```
Assets/
├── Scenes/       # Unity scenes
├── Scripts/      # C# scripts
│   ├── CarController.cs
│   ├── CarRaycastsController.cs
│   ├── CommunicationController.cs
│   ├── GameController.cs
│   ├── RewardController.cs
│   ├── CarRespawnController.cs
│   └── OverviewCameraController.cs
├── Prefabs/      # Prefabs
├── Materials/    # Materials
└── models/       # 3D models
```
- `CarController.cs` - Vehicle physics and steering
- `CarRaycastsController.cs` - 5-ray sensor system for obstacle detection
- `CommunicationController.cs` - ZeroMQ client for backend communication
- `GameController.cs` - Game state and episode management
- `RewardController.cs` - Collectible reward items
- `CarRespawnController.cs` - Episode reset logic
- Protocol: ZeroMQ REQ/REP
- Address: 127.0.0.1:65432
- Format: JSON messages
Sensor data sent by Unity:

```json
{
  "rays": [7.0, 4.5, 4.5, 3.5, 3.5],
  "ray_hits": [0, 1, 0, 1, 0],
  "speed": 1.25,
  "collision": false,
  "reward_collected": 0,
  "done": false,
  "episode": 1
}
```

Action returned by the backend:

```json
{
  "type": "action",
  "steering": 0
}
```

Steering values: -1 (left), 0 (straight), 1 (right)
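For reference, the round-trip for these payloads can be sketched in plain Python (field names are taken from the examples above; the helper functions themselves are illustrative, not part of the backend):

```python
import json

STEERING_VALUES = (-1, 0, 1)  # left, straight, right

def parse_observation(raw: bytes) -> dict:
    """Decode and sanity-check a sensor message from Unity."""
    obs = json.loads(raw.decode("utf-8"))
    assert len(obs["rays"]) == 5 and len(obs["ray_hits"]) == 5  # 5-ray sensor system
    assert isinstance(obs["collision"], bool)
    return obs

def encode_action(steering: int) -> bytes:
    """Encode an action reply for Unity; steering must be -1, 0, or 1."""
    if steering not in STEERING_VALUES:
        raise ValueError(f"invalid steering value: {steering}")
    return json.dumps({"type": "action", "steering": steering}).encode("utf-8")
```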
- Ray 0: Forward (max 7.0 units)
- Ray 1: Forward-Left (max 4.5 units)
- Ray 2: Forward-Right (max 4.5 units)
- Ray 3: Right (max 3.5 units)
- Ray 4: Left (max 3.5 units)
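Because each ray has a different maximum range, a typical preprocessing step on the backend side is to normalize every distance by its own maximum so all five features share a [0, 1] scale. This is a common-practice sketch, not something this repository confirms the backend does:

```python
# Per-ray maximum distances, in the order listed above:
# forward, forward-left, forward-right, right, left
RAY_MAX = [7.0, 4.5, 4.5, 3.5, 3.5]

def normalize_rays(rays):
    """Scale each ray distance by its own maximum range."""
    return [d / m for d, m in zip(rays, RAY_MAX)]
```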
See `CommunicationDesign.md` for the complete protocol specification.

Environment parameters are synchronized from the Python backend during connection. To modify settings, edit `config.json` in the backend repository.
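One way such synchronization can work is for the client to merge whatever the backend sends over a set of local defaults. The keys and defaults below are hypothetical; the authoritative values live in the backend's `config.json`:

```python
import json

# Hypothetical fallback values; the backend's config.json overrides them
DEFAULTS = {"max_episode_steps": 1000, "ray_count": 5}

def apply_backend_config(raw: bytes) -> dict:
    """Merge parameters received from the backend over local defaults."""
    received = json.loads(raw.decode("utf-8"))
    return {**DEFAULTS, **received}
```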
This project requires the Python training backend:
PPO_RL_AutoDRV_Compute_Backend
The backend provides:
- PPO reinforcement learning algorithm
- Training and inference modes
- Model checkpointing
- Gymnasium environment interface
- ZeroMQ communication server
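The Gymnasium interface means the backend wraps the socket exchange above in the standard reset/step contract. In outline only (class name, transport shape, and the reward combination are all assumptions; the real environment is defined in the backend repository):

```python
class UnityDrivingEnvSketch:
    """Shape of a Gymnasium-style env over the Unity link (illustrative, not the real class)."""

    def __init__(self, transport):
        # transport: any object with send_action(steering) -> observation dict
        self.transport = transport

    def reset(self):
        obs = self.transport.send_action(0)  # neutral action to fetch the first observation
        return obs, {}                       # (observation, info), per the Gymnasium API

    def step(self, steering):
        obs = self.transport.send_action(steering)
        # Illustrative reward: collected items minus a collision penalty
        reward = obs.get("reward_collected", 0) - (1 if obs.get("collision") else 0)
        terminated = obs.get("done", False)
        return obs, reward, terminated, False, {}  # obs, reward, terminated, truncated, info
```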
