A hands-on collection of data engineering projects. The module structure is inspired by the Data Engineering Zoomcamp from DataTalks.Club. All implementations are original work.
The modules below list the topics and tools covered:
- Python ingestion with Polars and pandas
- Rust data ingestion
- ELT ingestion with Airbyte
- data load tool (dlt)
- IaC with Terraform (Google Cloud Platform)
- Workflow orchestration with Airflow 3.x
- Workflow orchestration with Airflow 2.x
- Workflow orchestration with Prefect
- BigQuery and dbt
- Databricks and dbt
- Redshift and dbt
- ClickHouse and dbt
- PostgreSQL and dbt
- DuckDB and dbt
- Data visualization with Superset/Metabase
- PySpark
- Spark (Kotlin API)
- Spark (Scala)