This project focuses on cleaning, exploring, and analyzing US household income data using SQL. It aims to resolve data inconsistencies, uncover regional income trends, and provide insights into household income distributions across states and cities.
Data Cleaning:
- Resolve duplicate records.
- Standardize inconsistent entries (e.g., state names, place types).
- Address missing or incorrect data fields.
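The first two cleaning goals can be sketched in a few statements. This is a minimal illustration, not the contents of data_cleaning.sql: the table `demo_income_raw`, its columns (`row_id`, `id`, `state_name`, `city`, `place_type`), and the sample rows are all hypothetical stand-ins for the real schema.

```sql
-- Hypothetical demo table; names and rows are illustrative only.
CREATE TABLE demo_income_raw (
    row_id     INTEGER PRIMARY KEY,
    id         INTEGER,
    state_name VARCHAR(50),
    city       VARCHAR(50),
    place_type VARCHAR(20)
);

INSERT INTO demo_income_raw (row_id, id, state_name, city, place_type) VALUES
    (1, 101, 'Alabama', 'Town A', 'Borough'),
    (2, 101, 'Alabama', 'Town A', 'Borough'),   -- repeats id 101
    (3, 102, 'alabama', 'Town B', 'Boroughs'),  -- inconsistent state and type
    (4, 103, 'Georgia', 'Town C', 'City');

-- 1. List ids that appear more than once.
SELECT id, COUNT(*) AS copies
FROM demo_income_raw
GROUP BY id
HAVING COUNT(*) > 1;

-- 2. Keep the first row per id and delete the rest.
--    The extra derived table (ranked) lets MySQL accept a DELETE
--    that reads from the table it is modifying.
DELETE FROM demo_income_raw
WHERE row_id IN (
    SELECT row_id
    FROM (
        SELECT row_id,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY row_id) AS rn
        FROM demo_income_raw
    ) AS ranked
    WHERE rn > 1
);

-- 3. Standardize inconsistent spellings.
UPDATE demo_income_raw
SET state_name = 'Alabama'
WHERE state_name = 'alabama';

UPDATE demo_income_raw
SET place_type = 'Borough'
WHERE place_type = 'Boroughs';
```

The statements use window functions, so they assume MySQL 8+, PostgreSQL, or SQLite 3.25+. Running the duplicate check in step 1 before the delete in step 2 gives a quick way to confirm how many rows will be removed.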
Exploratory Data Analysis (EDA):
- Analyze land and water area distributions by state.
- Investigate household income patterns (mean and median) across regions.
- Highlight disparities in income based on location and demographic details.
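As a rough sketch of the land and water area analysis, the query below totals both measures per state and ranks the result. The table `demo_income_area`, the columns `aland` and `awater`, and every value are hypothetical placeholders rather than the real dataset.

```sql
-- Hypothetical demo table; area values are made up for illustration.
CREATE TABLE demo_income_area (
    id         INTEGER PRIMARY KEY,
    state_name VARCHAR(50),
    aland      BIGINT,   -- land area of the place (assumed column)
    awater     BIGINT    -- water area of the place (assumed column)
);

INSERT INTO demo_income_area (id, state_name, aland, awater) VALUES
    (1, 'State A', 5000, 200),
    (2, 'State A', 3000, 100),
    (3, 'State B', 1000, 900),
    (4, 'State C', 7000,  50);

-- Total land and water area per state, largest land area first.
SELECT state_name,
       SUM(aland)  AS total_land,
       SUM(awater) AS total_water
FROM demo_income_area
GROUP BY state_name
ORDER BY total_land DESC
LIMIT 10;
```

`LIMIT` works in MySQL, PostgreSQL, and SQLite; other engines may need `TOP` or `FETCH FIRST` instead. Changing the `ORDER BY` column to `total_water` ranks states by water area.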
Data Integration:
- Combine datasets to enrich analysis and derive deeper insights.
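Combining the two datasets might look like the join below, which attaches income statistics to each location and averages them per state. This is only a sketch: it assumes both files share an `id` key, and the table names, the columns `mean_income` and `median_income`, and all sample values are hypothetical.

```sql
-- Hypothetical demo tables standing in for the two CSV files.
CREATE TABLE demo_income (
    id         INTEGER PRIMARY KEY,
    state_name VARCHAR(50),
    city       VARCHAR(50)
);

CREATE TABLE demo_income_statistics (
    id            INTEGER PRIMARY KEY,  -- assumed to match demo_income.id
    mean_income   INTEGER,
    median_income INTEGER
);

INSERT INTO demo_income (id, state_name, city) VALUES
    (1, 'State A', 'Town A'),
    (2, 'State A', 'Town B'),
    (3, 'State B', 'Town C');

INSERT INTO demo_income_statistics (id, mean_income, median_income) VALUES
    (1, 60000, 55000),
    (2, 80000, 70000),
    (3, 40000, 35000);

-- Average of the per-place mean and median incomes for each state.
SELECT i.state_name,
       ROUND(AVG(s.mean_income), 1)   AS avg_mean_income,
       ROUND(AVG(s.median_income), 1) AS avg_median_income
FROM demo_income AS i
INNER JOIN demo_income_statistics AS s
        ON i.id = s.id
GROUP BY i.state_name
ORDER BY avg_mean_income DESC;
```

An `INNER JOIN` drops locations that have no matching statistics row; a `LEFT JOIN` would keep them with `NULL` incomes, which can be useful for spotting gaps before analysis.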
USHouseholdIncome.csv:
- Contains geographical and demographic household income data.
- Includes details like state, county, city, area type, and land/water area.
USHouseholdIncome_Statistics.csv:
- Provides statistical metrics like mean and median household incomes.
Data Cleaning:
- Duplicate Removal: Ensures unique records for accurate analysis.
- Inconsistency Fixes: Standardizes state names, place types, and other fields.
- Missing Data Handling: Updates empty fields with relevant information.
Exploratory Data Analysis (EDA):
- Land & Water Area: Identifies states with the largest areas.
- Income Trends: Explores average income distributions across states and cities.
- Regional Disparities: Highlights significant income gaps.
datasets/:
- USHouseholdIncome.csv
- USHouseholdIncome_Statistics.csv
scripts/:
- data_cleaning.sql: SQL scripts for data cleaning and preparation.
- eda_analysis.sql: SQL scripts for exploratory data analysis.
README.md: Project overview and instructions.
- Import Datasets: Load the CSV files into your SQL database.
- Execute Scripts:
  - Run data_cleaning.sql for cleaning and preparation.
  - Run eda_analysis.sql for exploratory analysis.
- Visualize Results: Use SQL query outputs to generate insights or visualizations.
- States with the largest land and water areas include Texas and Alaska.
- Regions like Puerto Rico exhibit significantly lower median incomes.
- Urban areas show higher mean incomes compared to rural and suburban regions.
For questions or suggestions, feel free to reach out!
LinkedIn: https://www.LinkedIn.com/in/matan-nafshi
Email: matannaf@gmail.com