Summary: Analyzed heart attack prediction data from India using MySQL for data querying and Python (pandas, Matplotlib, Seaborn) for data processing, analysis, and visualization. The project explores age distribution across states, gender-based health insights, lifestyle risk factors (smoking, stress), regional heart attack risk, and income-health correlations. The key findings below summarize cardiovascular risk patterns across Indian states.
Tools Used:
- Database Queries: MySQL (patient counts, risk factors, demographic analysis)
- Data Processing & Analysis: Python (pandas)
- Visualization: Matplotlib, Seaborn
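A per-state aggregation of the kind listed under Database Queries might look like the sketch below. The table name, column names, and rows are invented for illustration and are not the project's actual schema or data. Python's built-in sqlite3 module stands in for MySQL so the snippet runs without a server; the query uses plain GROUP BY syntax that MySQL also accepts.

```python
import sqlite3

# Hypothetical schema: the real dataset's table and column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients ("
    "patient_id INTEGER, state TEXT, gender TEXT, heart_attack_risk INTEGER)"
)

# Invented rows, used only to make the query runnable.
rows = [
    (1, "Punjab", "Male", 1),
    (2, "Punjab", "Female", 0),
    (3, "Mizoram", "Male", 1),
    (4, "Mizoram", "Female", 1),
    (5, "Kerala", "Male", 0),
]
conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?)", rows)

# Patient count and average risk per state, highest average risk first.
query = """
    SELECT state,
           COUNT(*) AS patient_count,
           AVG(heart_attack_risk) AS avg_risk
    FROM patients
    GROUP BY state
    ORDER BY avg_risk DESC, state ASC
"""
state_summary = conn.execute(query).fetchall()

for state, patient_count, avg_risk in state_summary:
    print(f"{state}: {patient_count} patients, average risk {avg_risk:.2f}")
```

Against a real MySQL server, the same query string could be passed to a MySQL client library instead of sqlite3; only the connection setup would change.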
Key Insights:
- Mizoram reported the highest average heart attack risk of all states.
- Punjab showed the highest average cholesterol levels.
- Patients with a history of heart attacks had an average annual income of ₹10.27 lakh, which may point to a link between economic class and healthcare access.
- The gender distribution in the dataset leaned slightly toward male patients (55%).
- Average stress levels were comparable across genders, with females having a slightly higher average stress score.
- Smoking status was associated with only a slight increase in average heart attack risk; combinations of factors (e.g., smoking + diabetes) warrant further study.
Dataset:
- Source: Heart Attack Prediction Dataset (India)
- Size: 10,000 patient records
- Features: Demographics, health conditions, lifestyle habits, medical history, and economic factors
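Group-level averages like those reported under Key Insights can be computed with pandas `groupby`. The sketch below is a minimal illustration under stated assumptions: the column names and the six sample rows are invented, so its output says nothing about the real 10,000-record dataset.

```python
import pandas as pd

# Invented sample rows; column names are assumptions, not the dataset's actual schema.
df = pd.DataFrame(
    {
        "state": ["Punjab", "Punjab", "Mizoram", "Mizoram", "Kerala", "Kerala"],
        "gender": ["Male", "Female", "Male", "Female", "Male", "Male"],
        "smoking": [1, 0, 1, 0, 0, 1],
        "stress_level": [6, 7, 5, 8, 4, 6],
        "heart_attack_risk": [1, 0, 1, 1, 0, 0],
    }
)

# Average heart attack risk per state, highest first.
risk_by_state = (
    df.groupby("state")["heart_attack_risk"].mean().sort_values(ascending=False)
)

# Share of each gender in the data, as a percentage.
gender_share = df["gender"].value_counts(normalize=True) * 100

# Average stress score per gender.
stress_by_gender = df.groupby("gender")["stress_level"].mean()

# Average risk for non-smokers (0) versus smokers (1).
risk_by_smoking = df.groupby("smoking")["heart_attack_risk"].mean()

print(risk_by_state)
print(gender_share.round(1))
print(stress_by_gender)
print(risk_by_smoking.round(2))
```

Studying combinations of factors, such as smoking together with diabetes, would follow the same pattern by passing a list of columns to `groupby`.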
Skills Demonstrated:
✅ Data Cleaning & Preprocessing
✅ SQL Query Writing (Aggregations, Filtering, Grouping)
✅ Data Analysis with pandas
✅ Data Visualization (Power BI, Matplotlib, Seaborn)
✅ Health Data Exploration and Risk Analysis