Real-world autonomous vehicle datasets from challenging African and Asian terrains
DeepUbuntu is a data collection, cleaning, and labeling platform for autonomous vehicle (AV) development in challenging real-world conditions. Unlike traditional AV datasets that focus primarily on well-maintained roads in developed regions, DeepUbuntu captures the harsh realities of driving in Africa and Asia: unpaved roads, extreme weather, complex traffic patterns, and infrastructure variations that current autonomous vehicle systems struggle to handle.
DeepUbuntu is built on a microservices architecture designed for scalability, reliability, and real-time processing:
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Web Portal    │     │   Mobile Apps   │     │   Dashcam API   │
│    (Next.js)    │     │  (React Native) │     │  (REST/GraphQL) │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                        ┌─────────────────┐
                        │   API Gateway   │
                        │  (Kong/Nginx)   │
                        └─────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     Upload      │     │      User       │     │    Metadata     │
│     Service     │     │     Service     │     │     Service     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                        ┌─────────────────┐
                        │   Processing    │
                        │    Pipeline     │
                        └─────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Anonymization  │     │     Quality     │     │  Segmentation   │
│     Service     │     │    Analysis     │     │     Service     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                        ┌─────────────────┐
                        │     Storage     │
                        │      Layer      │
                        └─────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   PostgreSQL    │     │     MongoDB     │     │      Redis      │
│   (Metadata)    │     │   (Analytics)   │     │     (Cache)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
- Languages: Go, Python, Node.js
- Frameworks: Gin (Go), FastAPI (Python), Express (Node.js)
- Databases: PostgreSQL, MongoDB, Redis
- Message Queue: Apache Kafka
- Containerization: Docker, Kubernetes
- Upload Service: Handles video uploads and chunking
- User Service: Authentication, authorization, and user management
- Metadata Service: GPS, temporal, and environmental data extraction
- Anonymization Service: AI-powered face and license plate blurring
- Quality Analysis Service: Video quality assessment and stability metrics
- Segmentation Service: Video categorization and scene analysis
- Frontend: React/Next.js on port 3000
- Styling: Tailwind CSS with custom design system
- State Management: Zustand for client-side state
- Real-time: WebSocket connections for live updates
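As a rough illustration of the chunking the Upload Service is described as handling, here is a minimal Python sketch. It only plans byte ranges and works out which chunks still need sending after an interruption. The names `Chunk`, `plan_chunks`, and `remaining_chunks`, and the 8 MiB default chunk size, are assumptions made for this example and are not taken from the DeepUbuntu codebase.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    index: int   # position of the chunk within the file
    offset: int  # byte offset where the chunk starts
    size: int    # number of bytes in the chunk


def plan_chunks(total_size: int, chunk_size: int = 8 * 1024 * 1024) -> list[Chunk]:
    """Split a file of total_size bytes into consecutive chunks."""
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: list[Chunk] = []
    offset = 0
    while offset < total_size:
        # The last chunk may be shorter than chunk_size.
        size = min(chunk_size, total_size - offset)
        chunks.append(Chunk(index=len(chunks), offset=offset, size=size))
        offset += size
    return chunks


def remaining_chunks(plan: list[Chunk], acknowledged: set[int]) -> list[Chunk]:
    """Return the chunks the server has not yet acknowledged."""
    return [chunk for chunk in plan if chunk.index not in acknowledged]


MIB = 1024 * 1024
plan = plan_chunks(20 * MIB)                       # three chunks: 8, 8, and 4 MiB
todo = remaining_chunks(plan, acknowledged={0})    # chunks 1 and 2 are still pending
```

Keeping the plan separate from the transfer makes resuming simple: the client only needs the set of acknowledged chunk indexes to decide what to send next.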
deepubuntu/
├── app/ # Next.js app directory
│ ├── about/ # About page
│ ├── api/ # API routes
│ ├── blog/ # Blog pages
│ ├── careers/ # Careers page
│ ├── contact/ # Contact page
│ ├── enterprise/ # Enterprise solutions
│ ├── government/ # Government solutions
│ ├── legal/ # Legal pages
│ ├── products/ # Product pages
│ ├── research/ # Research pages
│ └── resources/ # Resource pages
├── components/ # React components
│ ├── canvas/ # 3D/WebGL components
│ ├── providers/ # Context providers
│ ├── sections/ # Page sections
│ └── ui/ # UI components
├── content/ # MDX content
│ ├── blog/ # Blog posts
│ └── products/ # Product documentation
├── docs/ # Documentation
├── lib/ # Utility libraries
│ ├── hooks/ # Custom React hooks
│ ├── stores/ # State management
│ └── utils/ # Utility functions
├── public/ # Static assets
│ └── models/ # 3D models
├── scripts/ # Build and deployment scripts
├── tests/ # Test files
└── README.md # This file
- Framework: Next.js 14 with App Router
- Language: TypeScript
- Styling: Tailwind CSS with custom design system
- 3D Graphics: Three.js for 3D models and animations
- State Management: Zustand for client-side state
- Real-time: WebSocket connections for live updates
- API: RESTful APIs with GraphQL support
- Authentication: OAuth 2.0 with JWT tokens
- Database: PostgreSQL for metadata, MongoDB for analytics
- Cache: Redis for session management and caching
- File Storage: S3-compatible object storage
- Message Queue: Apache Kafka for event processing
- Containerization: Docker with multi-stage builds
- Orchestration: Kubernetes for scaling and management
- Monitoring: Prometheus, Grafana, and ELK stack
- CI/CD: GitHub Actions with automated testing
- Deployment: Vercel for frontend, cloud-native backend
The DeepUbuntu platform processes each upload through the following pipeline stages:
- Upload: Users upload driving videos via web portal or dashcam API
- Privacy: Automatic face blurring and license plate anonymization
- Validation: Format verification and metadata extraction
- Quality Analysis: Video quality assessment and stability metrics
- Segmentation: Split videos into categorized segments
- Metadata: Extract GPS, temporal, and environmental data
- Storage: Organize data into commercial-ready datasets
- Analytics: Generate insights for contributors and customers
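The stages above can be pictured as an ordered chain of handlers. The Python sketch below is illustrative only: the stage identifiers mirror the list, while `run_pipeline`, `blur_faces`, and `validate_format` are hypothetical names, and the real platform presumably runs these steps as separate services rather than in-process functions.

```python
from typing import Callable, Dict, List

# Stage identifiers follow the order of the list above (names are illustrative).
STAGE_ORDER: List[str] = [
    "upload", "privacy", "validation", "quality_analysis",
    "segmentation", "metadata", "storage", "analytics",
]

Handler = Callable[[dict], dict]


def run_pipeline(video: dict, handlers: Dict[str, Handler]) -> dict:
    """Apply the registered handlers in STAGE_ORDER and record which ones ran."""
    unknown = set(handlers) - set(STAGE_ORDER)
    if unknown:
        raise KeyError(f"unknown stages: {sorted(unknown)}")
    state = dict(video)  # work on a copy so the caller's dict is untouched
    completed = []
    for name in STAGE_ORDER:
        handler = handlers.get(name)
        if handler is None:
            continue  # stage not implemented in this sketch
        state = handler(state)
        completed.append(name)
    state["completed_stages"] = completed
    return state


# Hypothetical handlers standing in for the real services.
def blur_faces(video: dict) -> dict:
    return {**video, "anonymized": True}


def validate_format(video: dict) -> dict:
    if not video["filename"].endswith(".mp4"):
        raise ValueError("unsupported format")
    return {**video, "valid": True}


result = run_pipeline(
    {"filename": "trip.mp4"},
    {"validation": validate_format, "privacy": blur_faces},
)
print(result["completed_stages"])  # ['privacy', 'validation']
```

Handlers run in the order of `STAGE_ORDER`, not in the order they are registered, so the stage sequence stays fixed in one place.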
- Drag & Drop Upload: Easy video upload with progress tracking
- Coverage Mapping: Interactive map showing your contributions
- Reward System: Points and achievements for quality contributions
- Mobile Support: Upload directly from dashcam apps
- Resumable Uploads: Continue interrupted uploads
- Advanced Search: Find specific scenarios, locations, weather conditions
- Quality Guarantees: 99.5% annotation accuracy, sub-meter GPS precision
- Real-time Access: Live data feeds and streaming APIs
- Fast Access: REST & GraphQL APIs with real-time data feeds
- Privacy Compliant: GDPR compliant with built-in anonymization
- Automatic Anonymization: AI-powered face and license plate blurring
- End-to-End Encryption: All data encrypted in transit and at rest
- Access Control: Role-based permissions and audit logging
- OAuth 2.0: Secure authentication with role-based access control
- Rate Limiting: Protection against abuse and DDoS attacks
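This README does not say which rate-limiting algorithm the platform uses. As one common option, a token bucket can be sketched as follows. The `TokenBucket` class and its parameters are assumptions made for illustration, not the platform's actual implementation.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock            # injectable so the behaviour can be tested
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, else False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Example with a fake clock: 1 token per second, bursts of up to 2 requests.
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2.0, clock=lambda: fake_now[0])
first_burst = [bucket.allow(), bucket.allow(), bucket.allow()]  # third call is rejected
fake_now[0] = 1.0                                               # one second later
after_refill = bucket.allow()                                   # one token has refilled
```

In a deployment behind an API gateway, a limiter like this would typically be keyed per client or per API token; that keying is left out here.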
- GDPR Compliant: Full compliance with the EU General Data Protection Regulation
- Data Residency: Configurable data storage locations
- Audit Trails: Complete data access and modification logs
- Privacy by Design: Built-in privacy controls and defaults
- Upload Speed: 100 MB/s sustained upload rates
- Processing Time: under 5 minutes for a 1 GB video file
- API Response: under 100 ms average response time
- Uptime: 99.9% availability SLA
- Scalability: Auto-scaling to handle 10,000+ concurrent users
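To put the targets above in concrete terms, the short Python sketch below works through the arithmetic. It assumes decimal units (1 GB = 1000 MB) and a 30-day month; the function names are illustrative.

```python
def min_throughput_mb_per_s(file_size_mb: float, max_seconds: float) -> float:
    """Lowest processing throughput that still meets a time budget."""
    return file_size_mb / max_seconds


def max_downtime_minutes(availability: float, period_days: int = 30) -> float:
    """Downtime allowed by an availability target over a period."""
    return (1.0 - availability) * period_days * 24 * 60


def transfer_seconds(file_size_mb: float, rate_mb_per_s: float) -> float:
    """Time to move a file at a sustained rate."""
    return file_size_mb / rate_mb_per_s


# 1 GB processed in under 5 minutes needs more than about 3.33 MB/s.
processing_rate = min_throughput_mb_per_s(1000, 5 * 60)

# 99.9% availability allows roughly 43.2 minutes of downtime per 30 days.
downtime = max_downtime_minutes(0.999)

# Uploading 1 GB at the stated 100 MB/s takes 10 seconds.
upload_time = transfer_seconds(1000, 100)
```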
We welcome contributions from the community! Here are the main areas where you can help:
- Developers: Core platform development
- Researchers: ML model improvements
- Designers: UI/UX improvements and new features
- Community: Documentation, testing, feedback
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Email: thabhelo@deepubuntu.com
- Discord: DeepUbuntu Community
- Issues: GitHub Issues
- Documentation: docs.deepubuntu.com