Student Details:
Name: Sejal Sontakke
Semester: 6, Section: C
PRN: 22070521101

Overview

This repository contains the implementation of various Data Warehousing and Mining (DWM) practicals completed as part of the Semester VI curriculum. The practicals focus on data preprocessing, transformation, OLAP operations, and key machine learning algorithms used in data mining.

Practicals Implemented:

1️⃣ Handling Missing Values in Python 🛠️
Implemented various techniques to handle missing values, such as mean/mode imputation, interpolation, and removal.

2️⃣ Data Flow Transformations in ETL ⚙️
Implemented common ETL (Extract, Transform, Load) transformations for data preprocessing. Included filtering, aggregation, and schema mapping techniques.

3️⃣ OLAP Operations on a Multi-Dimensional Data Cube 📊
Explored Online Analytical Processing (OLAP) operations such as roll-up, drill-down, slice, and dice. Worked with multi-dimensional data for analytical insights.

4️⃣ Apriori Algorithm for Frequent Itemset Mining 🛒
Implemented the Apriori algorithm to discover frequent itemsets in transactional data. Used support and confidence thresholds to generate association rules.

5️⃣ Naïve Bayes Algorithm 🤖
Implemented the Naïve Bayes classifier for text classification and probabilistic prediction tasks.

6️⃣ K-Nearest Neighbors (KNN) Algorithm 📌
Implemented the KNN algorithm for classification tasks. Evaluated model performance with different values of K.

7️⃣ K-Means Clustering Algorithm 🔍
Implemented K-Means clustering for unsupervised learning. Visualized cluster formation using data visualization techniques.

8️⃣ Decision Tree Algorithm 🌲
Implemented a decision tree model for classification tasks. Used information gain and entropy to build decision rules.

9️⃣ Linear Regression 📈
Implemented linear regression to analyze relationships between dependent and independent variables. Evaluated model performance using RMSE and R² scores.

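To illustrate the support and confidence thresholds mentioned for the Apriori practical, here is a minimal pure-Python sketch. It is not the code from `apriori.py`; the function names `frequent_itemsets` and `association_rules`, the toy transactions, and the threshold values are all assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count support (fraction of transactions containing the candidate).
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules above the threshold."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), confidence))
    return rules


# Hypothetical toy data, not taken from the repository.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
for antecedent, consequent, confidence in association_rules(itemsets, min_confidence=0.9):
    print(antecedent, "->", consequent, "confidence:", confidence)
```

With this toy data, only the rule from butter to bread clears the 0.9 confidence threshold, because every transaction containing butter also contains bread.
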
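The entropy and information gain measures named in the Decision Tree practical can be sketched as follows. This is an illustrative example under assumed names (`entropy`, `information_gain`) and a made-up four-row dataset; the actual `decision_tree.py` may compute the split criterion differently, for example through scikit-learn.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    # p * log2(1/p) summed over classes; a pure node gives 0.0.
    return sum((count / n) * log2(n / count) for count in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the child nodes after the split.
    remainder = sum(len(group) / n * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, not taken from the repository.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

for feature in ("outlook", "windy"):
    print(feature, information_gain(rows, labels, feature))
```

A tree builder would choose the feature with the highest gain at each node. In this toy data `outlook` separates the classes perfectly, while `windy` provides no information.
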
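For the Linear Regression practical, the RMSE and R² metrics can be computed by hand as in the sketch below. It fits a one-variable least-squares line in plain Python; the helper names (`fit_line`, `rmse`, `r_squared`) and the sample data are assumptions for illustration, and `linear_regression.py` itself may rely on scikit-learn instead.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy data, not taken from the repository.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print("slope:", slope, "intercept:", intercept)
print("RMSE:", rmse(ys, predictions))
print("R^2:", r_squared(ys, predictions))
```

A lower RMSE means predictions sit closer to the observed values, and an R² near 1 means the line explains most of the variance in the dependent variable.
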
How to Run the Programs

Clone the repository:
    git clone https://github.com/yourusername/dwm-practicals.git
    cd dwm-practicals

Install dependencies:
    pip install pandas numpy scikit-learn matplotlib

Run individual scripts using Python:
    python missing_values.py
    python etl_transformations.py
    python olap_operations.py
    python apriori.py
    python naive_bayes.py
    python knn.py
    python kmeans.py
    python decision_tree.py
    python linear_regression.py

Note: This repository is maintained as part of the Semester VI DWM Practical coursework.