This project builds a simple yet functional E-commerce data pipeline, from raw product data ingestion and API serving to a comprehensive analytics dashboard. It demonstrates how to integrate Python applications (Flask API, data processing scripts, Streamlit dashboard) with a MySQL database.
- Product Data Ingestion: Fetches product data from multiple JSON sources, harmonizes it, and loads it into a MySQL database.
- Order Data Ingestion: Processes order data from a CSV file and inserts it into the MySQL database, maintaining relationships with products.
- Product Data API: A Flask-based REST API that serves product data, mimicking an external data source.
- Interactive Dashboard: A Streamlit web application for visualizing and analyzing key e-commerce metrics, product performance, and sales trends.
- MySQL Database: Stores all product and order information, acting as the central data repository.
- Python 3.x: Core programming language.
- Flask: For building the REST API server.
- Streamlit: For creating the interactive web dashboard.
- Pandas: For data manipulation and cleaning.
- Requests: For making HTTP requests to the API.
mysql-connector-pythonorPyMySQL: For connecting Python to MySQL.- MySQL / MariaDB: The relational database (typically via XAMPP).
- XAMPP: Provides Apache, MySQL (MariaDB), and PHPMyAdmin for easy local server management.
Follow these steps to get the project up and running on your local machine.
- Python 3.8+: Download from python.org.
- XAMPP: Download from apachefriends.org. Install it to manage your MySQL database and phpMyAdmin.
-
Clone the Repository (or Download):
git clone <your-repository-url> cd Ecommerce # Or navigate to your project directory if downloaded
-
Create a Virtual Environment: It's highly recommended to use a virtual environment to manage project dependencies.
python -m venv .venv
-
Activate the Virtual Environment:
- Windows:
.\.venv\Scripts\activate
- macOS/Linux:
source ./.venv/bin/activate
(You should see
(.venv)at the beginning of your terminal prompt.) - Windows:
-
Install Python Dependencies:
pip install -r requirements.txt
(If you don't have
requirements.txt, you'll need to install them manually:pip install flask streamlit pandas requests mysql-connector-python)
-
Start XAMPP Control Panel:
- Start the Apache module.
- Start the MySQL module. (Ensure it says "Running" and the light is green. If you face port conflicts, you might need to change MySQL's port in
my.inito3307as discussed in previous troubleshooting. If you change it, remember this port for your PythonDB_CONFIG.)
-
Access phpMyAdmin:
- Open your web browser and go to
http://localhost/phpmyadmin/.
- Open your web browser and go to
-
Create the Database and Tables:
- In phpMyAdmin, go to the SQL tab.
- Open your
mysql_setup.sqlfile (located in your project root) in a text editor. - Copy the entire content of
mysql_setup.sql. - Paste it into the SQL query box in phpMyAdmin.
- Click the "Go" button.
- This will create the
ecommerce_datadatabase and theproducts,orders, andorder_itemstables within it. - Important: If you faced "Access denied" errors for
root@localhostearlier, ensure therootuser is unlocked (typically by setting an empty password or a known password viaALTER USERorUPDATE mysql.userfrom the command line after stopping MySQL and restarting with--skip-grant-tables).
All your Python scripts (api_server.py, data_ingestion.py, app.py) need to know how to connect to your MySQL database.
- Open
api_server.py,data_ingestion.py, andapp.pyin PyCharm. - Locate the
DB_CONFIGdictionary in EACH of these three files. - Update
DB_CONFIGwith your MySQL credentials:DB_CONFIG = { 'host': 'localhost', 'database': 'ecommerce_data', 'user': 'root', # Use 'root' as the username 'password': '', # If root has no password (common XAMPP default), leave this as an empty string '' # Otherwise, put your root password here. 'port': 3306 # IMPORTANT: Change to 3307 if you changed MySQL's port in XAMPP! }
- Save all three Python files after making changes.
Your project uses products_source_1.json, products_source_2.json, and orders.csv from the data/ folder.
- Ensure
products_source_1.jsonandproducts_source_2.jsoncollectively contain allproduct_ids that are referenced inorders.csv. Iforders.csvmentions a product ID (e.g.,P006) that is not defined in your JSON product sources, you will get a foreign key error during data ingestion.- If you encounter a foreign key error, manually add the missing product definitions to one of your
products_source_X.jsonfiles. For example, iforders.csvmentionsP006, ensureP006is defined in one of the JSON files.
- If you encounter a foreign key error, manually add the missing product definitions to one of your
You need to run these components in separate terminal tabs in PyCharm, in a specific order:
This server provides the product data that your data_ingestion.py script will fetch.
- Open PyCharm's Terminal (usually at the bottom).
- Ensure your virtual environment is active.
- Run the Flask API server:
(You should see output indicating it's running on
python api_server.py
http://127.0.0.1:5000/) - Leave this terminal tab open and running.
This script populates your MySQL database with products and orders.
- Open a NEW Terminal tab in PyCharm (do NOT close the
api_server.pytab). - Ensure your virtual environment is active.
- Run the data ingestion script:
python data_ingestion.py
- Wait for the script to complete (it will return to the prompt). You should see messages indicating successful insertions. If you face errors, check your
DB_CONFIGor if you've correctly resolved the foreign key constraint by adding missing product definitions.
This is your interactive "website"!
- Open a NEW Terminal tab in PyCharm (do NOT close the
api_server.pytab). - Ensure your virtual environment is active.
- Run the Streamlit application:
streamlit run app.py
- Your default web browser should automatically open to
http://localhost:8501(or a similar address displayed in the terminal).
Once the Streamlit app is running in your browser, you can:
- View aggregated product data.
- Analyze sales trends over time.
- Explore product categories and brands.
- Gain insights into your e-commerce operations.
To stop the project components:
- Go to the terminal tabs where
api_server.pyandapp.pyare running. - Press
Ctrl+Cin each terminal to stop the respective servers. - You can then close the XAMPP Control Panel.
Enjoy exploring your E-commerce Product Data Aggregator & Analyzer!

