Pymaceuticals Drug Study Analysis is a Python-based project that analyzes the effects of various drug regimens on tumor growth in mice. Using data analysis and visualization, this project identifies trends, compares drug performance, and provides insights into the effectiveness of specific treatments.
The project uses Matplotlib for data visualization and pandas for data analysis, executed within a Jupyter Notebook.
The project includes the following files:
| File Name | Description |
|---|---|
| `Mouse_metadata.csv` | Contains metadata on individual mice used in the study, including unique IDs, sex, age, and weight. |
| `Study_results.csv` | Contains results of the drug study, including tumor size over time for each drug regimen. |
| `pymaceuticals.ipynb` | Jupyter Notebook performing data analysis and visualizations. |
- Data Cleaning:
  - Merges `Mouse_metadata.csv` and `Study_results.csv`.
  - Removes duplicate and corrupted records.
- Exploratory Data Analysis:
  - Summarizes tumor growth data for each drug regimen.
  - Analyzes the number of mice tested under each drug.
- Statistical Analysis:
  - Calculates summary statistics such as mean, median, and standard deviation for tumor sizes.
  - Compares drug performance using box plots, line charts, and scatter plots.
- Visualization:
  - Creates clear visualizations to illustrate tumor size over time for different drugs.
  - Highlights outliers and trends in the data.
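
The cleaning and summary steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Mouse ID`, `Drug Regimen`, `Timepoint`, `Tumor Volume (mm3)`, `Weight (g)`) are assumptions, the tiny inline tables stand in for the two CSV files, and treating a repeated mouse/timepoint pair as a corrupted record is one possible reading of "duplicate and corrupted".

```python
import pandas as pd

# Toy stand-ins for the two CSV files; the real project would presumably load
# them with pd.read_csv(...). All column names and values here are assumed
# for illustration and are not taken from the study data.
mouse_metadata = pd.DataFrame({
    "Mouse ID": ["a1", "b2", "c3"],
    "Drug Regimen": ["Capomulin", "Capomulin", "Placebo"],
    "Weight (g)": [17, 21, 27],
})
study_results = pd.DataFrame({
    "Mouse ID": ["a1", "a1", "b2", "b2", "c3", "c3", "c3"],
    "Timepoint": [0, 5, 0, 5, 0, 0, 5],
    "Tumor Volume (mm3)": [45.0, 43.0, 45.0, 41.0, 45.0, 46.0, 49.0],
})

# Attach per-mouse metadata to every measurement row.
merged = study_results.merge(mouse_metadata, on="Mouse ID", how="left")

# A mouse measured twice at the same timepoint is treated as corrupted here:
# drop all of its rows rather than guessing which measurement is right.
dup_mask = merged.duplicated(subset=["Mouse ID", "Timepoint"], keep=False)
bad_ids = merged.loc[dup_mask, "Mouse ID"].unique()
clean = merged[~merged["Mouse ID"].isin(bad_ids)]

# Number of distinct mice tested under each regimen.
mice_per_regimen = clean.groupby("Drug Regimen")["Mouse ID"].nunique()

# Mean, median, and standard deviation of tumor volume per regimen.
summary = clean.groupby("Drug Regimen")["Tumor Volume (mm3)"].agg(
    ["mean", "median", "std", "count"]
)
print(mice_per_regimen)
print(summary)
```

Dropping every row for a flagged mouse is the conservative choice; keeping the first record per timepoint with `drop_duplicates` would be an alternative if the duplicates are known to be exact copies.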
- Prerequisites:
  - Python 3.x
  - Required libraries (if any) listed in `requirements.txt`.
- Setup:
  - Clone this repository or download the project files.
  - Install dependencies (if needed): `pip install -r requirements.txt`
- Tumor Growth:
  - Drug regimens such as Capomulin and Ramicane are more effective at reducing tumor size than the other treatments.
- Mouse Weight vs. Tumor Size:
  - Tumor size correlates positively with mouse weight in certain regimens, as seen in scatter plot analyses.
- Drug Performance:
  - Box plots show that Capomulin and Ramicane have the lowest variance and smallest final tumor sizes.
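
For readers who want to see how findings like these are typically checked, here is a hedged sketch of two common calculations: the 1.5 × IQR rule that box plots use to flag outliers, and a Pearson correlation between weight and tumor volume. The regimen labels `A` and `B`, the column names, and all numbers are invented for illustration; they are not study results, and the notebook may compute these differently.

```python
import pandas as pd

# Invented final-timepoint measurements for two hypothetical regimens.
data = pd.DataFrame({
    "Drug Regimen": ["A"] * 5 + ["B"] * 5,
    "Weight (g)": [15, 17, 19, 21, 23, 25, 26, 27, 28, 29],
    "Tumor Volume (mm3)": [38.0, 39.0, 40.0, 41.0, 42.0,
                           60.0, 61.0, 62.0, 63.0, 36.0],
})


def iqr_outliers(values: pd.Series) -> pd.Series:
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return values[(values < lower) | (values > upper)]


# Outliers per regimen, using the same rule a box plot applies by default.
outliers = {
    name: iqr_outliers(group["Tumor Volume (mm3)"]).tolist()
    for name, group in data.groupby("Drug Regimen")
}

# Pearson correlation between mouse weight and tumor volume for one regimen.
regimen_a = data[data["Drug Regimen"] == "A"]
r = regimen_a["Weight (g)"].corr(regimen_a["Tumor Volume (mm3)"])

print(outliers)
print(f"Pearson r for regimen A: {r:.2f}")
```

The same table can be drawn as a box plot with `data.boxplot(column="Tumor Volume (mm3)", by="Drug Regimen")`, which requires Matplotlib.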