A complete SQL data cleaning and analysis project that uses MySQL to analyze global company layoffs from 2020 to 2023.
This project takes messy, real-world layoffs data, cleans it with SQL, and then looks for patterns showing which companies, industries, and locations were most affected by layoffs.
Source: Layoffs Dataset from Kaggle
- Size: 2,300+ layoff records
- Time period: 2020 to 2023
- Coverage: Companies worldwide
- Industries: Tech, Finance, Retail, Healthcare, and more
Data cleaning steps:
- Found and removed duplicate records
- Used SQL window functions to identify the duplicates
- Cleaned company names (removed extra spaces)
- Fixed industry names (made "Crypto" consistent)
- Fixed country names (removed the trailing period from "United States.")
- Changed date format from text to proper dates
- Filled in missing industry info when possible
- Removed records that had no useful layoff numbers
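The cleaning steps above could look roughly like the following in MySQL 8. This is a minimal sketch, not the contents of data_cleaning.sql: the column names, the `layoffs_staging` working table, the `'Crypto%'` pattern, and the `%m/%d/%Y` date format are all assumptions and would need to match the actual import.

```sql
-- Workbench may reject UPDATE/DELETE statements that do not filter on a key
-- column while safe-update mode is on; this turns it off for the session.
SET SQL_SAFE_UPDATES = 0;

-- Assumed columns: company, location, industry, total_laid_off,
-- percentage_laid_off, `date`, stage, country, funds_raised_millions.
-- layoffs_staging is a hypothetical working copy, so the raw import stays untouched.
CREATE TABLE layoffs_staging AS
SELECT *,
       ROW_NUMBER() OVER (
           PARTITION BY company, location, industry, total_laid_off,
                        percentage_laid_off, `date`, stage, country,
                        funds_raised_millions
       ) AS row_num
FROM layoffs;

-- 1. Duplicates: a row numbered 2 or higher repeats an earlier, identical row.
DELETE FROM layoffs_staging
WHERE row_num > 1;

-- 2. Company names: strip leading and trailing spaces.
UPDATE layoffs_staging
SET company = TRIM(company);

-- 3. Industry names: collapse variants that start with 'Crypto' into one label.
UPDATE layoffs_staging
SET industry = 'Crypto'
WHERE industry LIKE 'Crypto%';

-- 4. Country names: drop the trailing period from 'United States.'.
UPDATE layoffs_staging
SET country = TRIM(TRAILING '.' FROM country)
WHERE country LIKE 'United States%';

-- 5. Dates: parse the text values, then change the column type.
--    Assumes every non-NULL value looks like 12/31/2022.
UPDATE layoffs_staging
SET `date` = STR_TO_DATE(`date`, '%m/%d/%Y');

ALTER TABLE layoffs_staging
MODIFY COLUMN `date` DATE;

-- 6. Missing industry: copy it from another row of the same company.
UPDATE layoffs_staging AS t1
JOIN layoffs_staging AS t2
  ON t1.company = t2.company
SET t1.industry = t2.industry
WHERE (t1.industry IS NULL OR t1.industry = '')
  AND t2.industry IS NOT NULL
  AND t2.industry <> '';

-- 7. Rows with neither layoff figure carry no usable information.
DELETE FROM layoffs_staging
WHERE total_laid_off IS NULL
  AND percentage_laid_off IS NULL;

-- The helper column is no longer needed once duplicates are gone.
ALTER TABLE layoffs_staging
DROP COLUMN row_num;
```

Working on a copy rather than on `layoffs` itself means a mistaken UPDATE or DELETE can be undone by rebuilding the copy from the raw import.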
SQL skills used:
- Data Cleaning: Removing duplicates, fixing inconsistent values
- Window Functions: ROW_NUMBER(), RANK(), SUM() OVER()
- Joins: Connecting tables to fill missing data
- Date Functions: Converting text to dates
- Aggregation: GROUP BY, SUM(), COUNT(), MAX()
- CTEs: Common Table Expressions for complex queries
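To show how the techniques in this list fit together, here is a short sketch of the kind of analysis queries they enable. It is an illustration rather than the contents of eda.sql, and it assumes the cleaned data is in a table called `layoffs` with a numeric `total_laid_off` column and a `date` column of type DATE; the aliases (`total_off`, `yr`, `rnk`, `ym`) are made up for the example.

```sql
-- Aggregation: how many records there are and the largest single layoff.
SELECT COUNT(*)            AS records,
       MAX(total_laid_off) AS largest_single_layoff
FROM layoffs;

-- GROUP BY + SUM(): the ten companies with the most layoffs overall.
SELECT company,
       SUM(total_laid_off) AS total_off
FROM layoffs
GROUP BY company
ORDER BY total_off DESC
LIMIT 10;

-- CTEs + RANK(): the top five companies in each year.
WITH company_year AS (
    SELECT company,
           YEAR(`date`)        AS yr,
           SUM(total_laid_off) AS total_off
    FROM layoffs
    WHERE `date` IS NOT NULL
    GROUP BY company, YEAR(`date`)
),
ranked AS (
    SELECT company, yr, total_off,
           RANK() OVER (PARTITION BY yr ORDER BY total_off DESC) AS rnk
    FROM company_year
)
SELECT company, yr, total_off, rnk
FROM ranked
WHERE rnk <= 5
ORDER BY yr, rnk;

-- SUM() OVER(): a month-by-month running total of layoffs.
WITH monthly AS (
    SELECT DATE_FORMAT(`date`, '%Y-%m') AS ym,
           SUM(total_laid_off)          AS total_off
    FROM layoffs
    WHERE `date` IS NOT NULL
    GROUP BY DATE_FORMAT(`date`, '%Y-%m')
)
SELECT ym,
       total_off,
       SUM(total_off) OVER (ORDER BY ym) AS rolling_total
FROM monthly
ORDER BY ym;
```

`RANK()` gives tied companies the same rank and skips the next position, so a year with ties may return more than five rows.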
📦 layoffs-sql-analysis
├── 📄 README.md (This file)
├── 📂 data/
│   └── 📄 layoffs.csv (Original dataset)
├── 📂 sql/
│   ├── 📄 data_cleaning.sql (Cleaning queries)
│   └── 📄 eda.sql (Analysis queries)
└── 📂 images/
    ├── 📷 top_companies.png (Results screenshots)
    ├── 📷 top_locations.png
    ├── 📷 top_industries.png
    ├── 📷 max_layoffs.png
    └── 📷 complete_shutdowns.png
Requirements:
- MySQL installed on your computer
- MySQL Workbench (makes importing the CSV and running the scripts easier)
Setup steps:
- Download the files from this repository
- Open MySQL Workbench
- Create a new database called `world_layoff`
- Import the `layoffs.csv` file as a table called `layoffs`
- Run the `data_cleaning.sql` file first
- Run the `eda.sql` file second to see the analysis
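As a rough guide, the database step and a quick check of the import can be done from a Workbench query tab. This is a sketch under the assumption that the CSV is loaded with Workbench's Table Data Import Wizard; the `imported_rows` alias is only illustrative.

```sql
-- Create the database named in the steps above and switch to it.
CREATE DATABASE IF NOT EXISTS world_layoff;
USE world_layoff;

-- After importing layoffs.csv as the table `layoffs`,
-- confirm that the rows arrived before running the cleaning script.
SELECT COUNT(*) AS imported_rows
FROM layoffs;

-- Look at a few rows to check that the columns were mapped as expected.
SELECT *
FROM layoffs
LIMIT 5;
```

If the row count is far below the 2,300+ records in the dataset, the import probably skipped rows and is worth repeating before cleaning. From the mysql command-line client, the two scripts could instead be run with `SOURCE sql/data_cleaning.sql;` followed by `SOURCE sql/eda.sql;`.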
What I learned:
- How to clean messy real-world data
- Advanced SQL techniques for data analysis
- Finding business insights from raw data
- Documenting and presenting data projects
This project shows I can:
- ✅ Take messy data and make it clean and usable
- ✅ Write complex SQL queries to find insights
- ✅ Present findings in a clear, understandable way
- ✅ Work with real business data to solve problems
This project is part of my data science portfolio, showing my SQL skills and ability to work with real-world data.




