GitHub - daemon966/Netflix_EDA: A full exploratory data analysis (EDA) of Netflix’s global content catalog. The project includes cleaning and transforming the dataset, and building an interactive Power BI dashboard to uncover key patterns such as genre distribution, release trends, and content ratings. Tech Stack: Power BI, SQL, Python, Pandas

📊 Netflix Data Analysis Project

This project explores and analyzes Netflix's dataset using Power BI and Python. It includes a complete data cleaning workflow, feature extraction, and sentiment analysis, followed by insightful visualizations.

🧹 Data Cleaning and Preprocessing Steps

Column Profiling: Inspected column data types and value distributions. Identified key fields such as type, title, director, cast, country, release_year, and rating.
Dealing with Missing Values: Detected nulls in fields such as director, cast, and country. Standardized blank entries using Pandas.
Encoding Nulls: Replaced empty strings and whitespace-only values with NaN for consistency.
Imputing Missing Values: Filled missing country and director values with 'Unknown'. Rows missing critical data were dropped or flagged.
Working with Dates: Converted date_added to a datetime type. Extracted year_added and month_added for time-based analysis.
Adding New Columns: Derived content_age = release_year - year_added. Added primary_country and primary_genre from comma-separated lists.
Splitting / Extracting Data: Split multi-value fields such as cast, genres, and country.
Extracting First Item: Used .str.split(',').str[0] to extract the first country, genre, or actor for analysis.
Text / Sentiment Analysis: Cleaned the description text using NLP techniques. Performed sentiment analysis using TextBlob/VADER.
Filtering Unnecessary Data: Removed redundant columns and low-quality rows. Focused on movies and TV shows with meaningful metadata.
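
The null-encoding, imputation, date, and first-item steps above can be sketched in Pandas roughly as follows. This is a minimal illustration, not the project's actual notebook: the sample rows are made up, the genre column name listed_in is assumed from the Kaggle dataset, and the date format string is an assumption about how date_added is written.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; the real project would load netflix_titles.csv instead.
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "director": ["Jane Doe", "  ", None],
    "country": ["United States, India", "", "Japan"],
    "listed_in": ["Dramas, Comedies", "Kids' TV", "Anime Series, Action"],  # assumed genre column
    "date_added": ["September 25, 2021", " August 1, 2019", None],
    "release_year": [2020, 2015, 2019],
})

# Encoding nulls: empty or whitespace-only strings become NaN.
df = df.replace(r"^\s*$", np.nan, regex=True)

# Imputing missing values: fill director and country with 'Unknown'.
df[["director", "country"]] = df[["director", "country"]].fillna("Unknown")

# Working with dates: parse date_added, then extract year and month.
# The format string is an assumption; unparseable values become NaT.
df["date_added"] = pd.to_datetime(
    df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)
df["year_added"] = df["date_added"].dt.year
df["month_added"] = df["date_added"].dt.month

# Adding new columns: content_age as defined in this README.
# With this operand order the value is zero or negative for titles added
# after their release; year_added - release_year would give a positive age.
df["content_age"] = df["release_year"] - df["year_added"]

# Extracting the first item from comma-separated lists.
df["primary_country"] = df["country"].str.split(",").str[0].str.strip()
df["primary_genre"] = df["listed_in"].str.split(",").str[0].str.strip()

# Splitting multi-value fields: one row per (title, genre) pair.
genres_long = df.assign(genre=df["listed_in"].str.split(",")).explode("genre")
genres_long["genre"] = genres_long["genre"].str.strip()

print(df[["title", "director", "primary_country", "primary_genre", "content_age"]])
print(genres_long[["title", "genre"]])
```

Keeping the exploded table (genres_long) separate from the main frame avoids duplicating titles in counts that are not genre-based.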

📊 Visual Analysis with Power BI

✅ Count of Shows by Type: Movies 1.43K (55.42%), TV Shows 1.15K (44.58%).
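
The same count-and-share breakdown can be cross-checked in Pandas before building the Power BI visual. The sketch below uses a small made-up type column, so its numbers are illustrative only and do not reproduce the dashboard figures.

```python
import pandas as pd

# Made-up sample of the `type` column; not the real dataset.
types = pd.Series(["Movie", "TV Show", "Movie", "Movie", "TV Show"], name="type")

# Absolute counts per type.
counts = types.value_counts()

# Share of each type as a percentage, rounded to two decimals.
percent = (types.value_counts(normalize=True) * 100).round(2)

summary = pd.DataFrame({"count": counts, "percent": percent})
print(summary)
```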

✅ Sum of release_year by Country: Visualized content production over time by geography. Countries such as the USA, India, the UK, and Japan dominate the content count.

✅ Rating Distribution by Type: Explored how ratings vary across Movies and TV Shows.
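
One way to tabulate ratings against content type outside Power BI is a Pandas cross-tabulation. This is a hedged sketch on made-up rows; the column names type and rating follow the fields listed in this README.

```python
import pandas as pd

# Made-up sample rows; not the real dataset.
df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "TV Show", "Movie"],
    "rating": ["PG-13", "TV-MA", "TV-MA", "TV-14", "TV-MA"],
})

# Count of titles for each rating (rows) and type (columns).
rating_by_type = pd.crosstab(df["rating"], df["type"])

# Same table as a share within each type, so Movies and TV Shows
# can be compared even when their totals differ.
rating_share = pd.crosstab(df["rating"], df["type"], normalize="columns").round(2)

print(rating_by_type)
print(rating_share)
```

Normalizing by column matters here because the two types have different totals, so raw counts alone would overstate the larger group.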

🔧 Tools Used

Power BI for data visualization
Jupyter Notebook / Colab for data analysis

📌 Dataset Source

Netflix Movies and TV Shows – Kaggle

Repository files: Netflix_project.pbix, Netflix_project.pdf, README.md, netflix_titles.csv
