Uber End-to-End Data Engineering Project

Overview

This project is an end-to-end data engineering pipeline that transforms raw Uber trip data into actionable insights. Built as part of my journey as a Computer Science student at Alamein International University (AIU), it covers the entire data lifecycle, from cloud storage and orchestration to data modeling and visualization.

Project Architecture

The pipeline follows a modern data stack approach:

Data Source: Raw Uber CSV data stored in Google Cloud Storage (GCS).
Orchestration: Mage AI running on a GCP Compute Engine (VM) instance.
Processing: Data cleaning and transformation using Python (Pandas).
Data Modeling: Designing a Star Schema with Fact and Dimension tables.
Data Warehouse: Google BigQuery for high-performance analytics.
Final Layer: Custom SQL joins for the analytics table.
Visualization: Interactive Dashboard built with Google Looker Studio.
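
As a rough illustration of the Processing step, here is a minimal cleaning sketch in Pandas. The column names (tpep_pickup_datetime, tpep_dropoff_datetime, trip_distance, total_amount) and the clean_trips helper are assumptions modeled on NYC taxi-style trip data. They are not taken from this repository, and the real transformer block may differ.

```python
import pandas as pd


def clean_trips(df: pd.DataFrame) -> pd.DataFrame:
    """Parse timestamps, drop exact duplicate rows, and add a surrogate trip_id."""
    df = df.copy()
    # Assumed column names (NYC taxi style); adjust to the actual CSV header.
    df["tpep_pickup_datetime"] = pd.to_datetime(df["tpep_pickup_datetime"])
    df["tpep_dropoff_datetime"] = pd.to_datetime(df["tpep_dropoff_datetime"])
    # Remove exact duplicates, then renumber rows so the index is contiguous.
    df = df.drop_duplicates().reset_index(drop=True)
    df["trip_id"] = df.index
    return df


# Tiny made-up sample: the first two rows are identical on purpose.
raw = pd.DataFrame({
    "tpep_pickup_datetime": ["2016-03-01 00:00:00", "2016-03-01 00:00:00", "2016-03-01 08:15:00"],
    "tpep_dropoff_datetime": ["2016-03-01 00:07:55", "2016-03-01 00:07:55", "2016-03-01 08:40:10"],
    "trip_distance": [2.5, 2.5, 7.1],
    "total_amount": [12.35, 12.35, 30.80],
})
cleaned = clean_trips(raw)
```

Adding trip_id after deduplication gives the later fact table a stable key to hang foreign keys on.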

Tech Stack

Tool	Purpose
Python	Data Transformation & ETL Logic
Mage AI	Modern Data Pipeline Orchestration
GCP (Compute Engine)	Virtual Machine Hosting
GCP (GCS)	Raw Data Lake Storage
BigQuery	Cloud Data Warehousing
SQL	Analytics Table Construction
Looker Studio	BI & Dashboarding

Data Modeling (Star Schema)

To optimize query performance and maintain data integrity, the data was modeled into a Star Schema:

Fact Table: fact_table (Measures and FKs).
Dimension Tables:
- datetime_dim
- passenger_count_dim
- trip_distance_dim
- rate_code_dim
- pickup_location_dim
- dropoff_location_dim
- payment_type_dim
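
The split into dimensions and a fact table can be sketched in Pandas as follows. This is an illustrative sketch under assumptions, not the project's actual transformer: the build_dim helper, the sample values, and the surrogate key names (payment_type_id, passenger_count_id) are hypothetical, and only two of the seven dimensions are shown.

```python
import pandas as pd

# Made-up sample trips; the real data has many more columns.
trips = pd.DataFrame({
    "payment_type": [1, 2, 1, 1],
    "passenger_count": [1, 2, 1, 3],
    "fare_amount": [9.0, 11.0, 54.5, 31.5],
})


def build_dim(df: pd.DataFrame, column: str, id_name: str) -> pd.DataFrame:
    """Return one row per distinct value of `column`, with a surrogate key."""
    dim = df[[column]].drop_duplicates().reset_index(drop=True)
    dim[id_name] = dim.index
    return dim[[id_name, column]]


payment_type_dim = build_dim(trips, "payment_type", "payment_type_id")
passenger_count_dim = build_dim(trips, "passenger_count", "passenger_count_id")

# Fact table: the measure plus foreign keys, found by joining each trip
# back to its dimension row on the natural value.
fact_table = (
    trips
    .merge(payment_type_dim, on="payment_type", how="left")
    .merge(passenger_count_dim, on="passenger_count", how="left")
    [["payment_type_id", "passenger_count_id", "fare_amount"]]
)
```

The fact table keeps one row per trip, while each dimension holds the descriptive value once, so the analytics query can join them back together on the ID columns.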

Challenges Faced & Lessons Learned

Building this pipeline wasn't without its hurdles. Here’s how I tackled the technical challenges:

Environment Isolation (PEP 668):
- Issue: pip raised the externally-managed-environment error on Python 3.11 when installing the GCP libraries.
- Fix: Installed the packages with pip's --break-system-packages flag so the VM environment had the necessary BigQuery client libraries.
Mage Exporter Logic:
- Issue: The initial exporter block iterated over a DataFrame's columns as if they were tables, causing NameError and "Table not found" errors.
- Fix: Refactored the Python logic to correctly handle a dictionary of DataFrames, ensuring each dimension table was exported individually to BigQuery.
Geospatial Data Visualization:
- Issue: Looker Studio initially failed to recognize Latitude and Longitude fields.
- Fix: Reconfigured the Data Source field types to Geo coordinates to enable the Map visualizations.
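
The exporter fix can be illustrated with a short sketch of the dictionary-of-DataFrames pattern. Everything here is an assumption for illustration: export_tables, fake_writer, and the dataset name uber_dataset are hypothetical, and the writer is a local stand-in rather than a real BigQuery call.

```python
from typing import Callable, Dict, List

import pandas as pd


def export_tables(
    tables: Dict[str, pd.DataFrame],
    dataset: str,
    write_table: Callable[[pd.DataFrame, str], None],
) -> List[str]:
    """Write each DataFrame in `tables` to its own destination table."""
    written = []
    # Iterate over (name, DataFrame) pairs. Iterating over a single DataFrame
    # instead yields its column names, which is the bug described above.
    for name, df in tables.items():
        table_id = f"{dataset}.{name}"
        write_table(df, table_id)
        written.append(table_id)
    return written


# Stand-in writer for local runs: it only records row counts.
# A real exporter would pass a function that loads `df` into BigQuery.
captured = {}


def fake_writer(df: pd.DataFrame, table_id: str) -> None:
    captured[table_id] = len(df)


tables = {
    "datetime_dim": pd.DataFrame({"datetime_id": [0, 1]}),
    "payment_type_dim": pd.DataFrame({"payment_type_id": [0]}),
}
written = export_tables(tables, "uber_dataset", fake_writer)
```

Passing the writer in as a function keeps the loop testable without cloud credentials. The same loop works unchanged once a real BigQuery writer is plugged in.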

Final Insights

The final analytics layer summarizes the trip data along three lines:

Revenue Analysis: ~$1.6M total revenue processed.
Geospatial Mapping: Identification of high-density pickup zones in New York.
Operational Efficiency: Average trip distances and payment method preferences.
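
For readers who want to reproduce this kind of summary outside Looker Studio, here is a minimal Pandas sketch. The column names (total_amount, trip_distance, payment_type_name) and the sample values are assumptions and are not the project's real figures.

```python
import pandas as pd

# Made-up rows standing in for the joined analytics table.
analytics = pd.DataFrame({
    "total_amount": [12.0, 30.5, 8.0, 20.0],
    "trip_distance": [2.0, 7.0, 1.0, 4.0],
    "payment_type_name": ["Credit card", "Cash", "Credit card", "Credit card"],
})

# Revenue analysis: sum of all trip totals.
total_revenue = analytics["total_amount"].sum()

# Operational efficiency: mean trip distance.
avg_distance = analytics["trip_distance"].mean()

# Payment preferences: share of trips per payment method.
payment_share = analytics["payment_type_name"].value_counts(normalize=True)
```

Each dashboard metric maps to a single aggregation, so the same numbers can be cross-checked against BigQuery with an equivalent SUM, AVG, or COUNT query.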

Live Dashboard Link

Connect with Me

I'm Mohamed Amer, a first-year CS student at AIU, deeply interested in Cloud Infrastructure and Data Engineering.

GitHub: https://github.com/mohamedamerdev-coder
LinkedIn: www.linkedin.com/in/mohamed-amer-46733833b

mohamedamerdev-coder/Uber-Data-Engineering-GCP