Question:
- How do you choose a good-value Airbnb home when traveling to NYC?
Goal:
- Apply Apache Airflow directed acyclic graphs (DAGs) to build data pipelines on NYC Open Data (park, shooting, hot_spot, hotel, public housing) and Airbnb housing data, followed by data manipulation, analysis, and visualization.
Flowchart
-
Original sources: NYC Open Data, Airbnb dataset (from Inside Airbnb)
-
Send GET requests to download the sources
-
Preliminary data cleaning and manipulation
-
Import into a SQL database (MySQL/PostgreSQL)
-
Load the data from the database and use Jupyter Notebook for analysis and visualization (run both on localhost and on AWS: EC2, RDS, S3)
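
The steps in the flowchart can be sketched as plain Python functions. This is a minimal illustration under stated assumptions, not the project's actual code: the column names (`name`, `borough`, `price`), the `listings` table, and the inline sample rows are hypothetical, and the standard-library `sqlite3` module stands in for MySQL/PostgreSQL so the example runs without a database server. In an Airflow DAG, each function would typically become one task (for example via `PythonOperator`), chained in the same order.

```python
import csv
import io
import sqlite3
import urllib.request


def download(url: str) -> str:
    """Step 1: send a GET request and return the CSV body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def clean(raw_csv: str) -> list[tuple[str, str, float]]:
    """Step 2: keep rows with a name, a borough, and a positive numeric price."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_csv)):
        name = (rec.get("name") or "").strip()
        borough = (rec.get("borough") or "").strip()
        # Prices may look like "$1,200.00"; strip the symbols before parsing.
        price_text = (rec.get("price") or "").replace("$", "").replace(",", "").strip()
        try:
            price = float(price_text)
        except ValueError:
            continue  # skip rows whose price is missing or malformed
        if name and borough and price > 0:
            rows.append((name, borough, price))
    return rows


def load(conn: sqlite3.Connection, rows: list[tuple[str, str, float]]) -> None:
    """Step 3: import the cleaned rows into a SQL table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings (name TEXT, borough TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", rows)
    conn.commit()


def average_price_by_borough(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """Step 4: read back from the database for analysis."""
    cur = conn.execute(
        "SELECT borough, AVG(price) FROM listings GROUP BY borough ORDER BY borough"
    )
    return cur.fetchall()


# Hypothetical sample data standing in for a downloaded file.
SAMPLE_CSV = """name,borough,price
Cozy loft,Brooklyn,$120.00
Studio,Manhattan,"$1,200.00"
No price,Queens,
Sunny room,Brooklyn,$80.00
"""

conn = sqlite3.connect(":memory:")  # in-memory stand-in for MySQL/PostgreSQL
rows = clean(SAMPLE_CSV)
load(conn, rows)
result = average_price_by_borough(conn)
print(result)
```

The `download` function is defined but not called here, so the sketch runs offline; in a real run its return value would be passed to `clean` in place of `SAMPLE_CSV`.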

