DEMO VIDEO LINK - https://drive.google.com/file/d/1zsSJWoBGnxqsaaVBHJxpLhSxRpmXSI38/view?usp=sharing
This project is a web-based AI SQL assistant that allows users to interact with a database using natural language. Users can upload CSV files, which are converted into a searchable SQLite database. The Gemini 2.5 Flash LLM is used to convert natural language questions into SQL queries, validate them, execute them, and return human-readable answers.
The project is a full-stack application with a FastAPI backend and a Streamlit frontend.
- Backend (Render): https://sql-agent-mpka.onrender.com/
- Frontend (Streamlit): https://aisqlagent.streamlit.app/
The frontend reads the backend URL from `st.secrets["BACKEND_URL"]`.
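As a rough illustration, here is a minimal sketch of how the frontend can resolve the backend URL from Streamlit secrets and forward a question to it. The `/ask` payload shape (`{"question": ...}`) is an assumption for illustration, not necessarily the exact request body the backend expects.

```python
import requests
import streamlit as st

# Resolve the backend URL from Streamlit Cloud secrets.
backend_url = st.secrets["BACKEND_URL"]

question = st.text_input("Ask a question about your data")
if question:
    # Hypothetical request body; the real /ask schema may differ.
    response = requests.post(f"{backend_url}/ask", json={"question": question})
    st.write(response.json())
```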
- `app.py` – FastAPI backend. Handles CSV uploads, SQL query generation, validation, execution, and converting results to natural language.
- `frontend.py` – Streamlit frontend. User interface for uploading CSVs and asking questions.
- `requirements.txt` – List of Python dependencies.
- `.env` – Environment variables, including `GOOGLE_API_KEY`. Ignored by Git.
- `uploaded.db` – SQLite database storing uploaded CSV data.
- **CSV Upload** – Upload CSV files and store them as SQLite tables for querying (a minimal sketch follows this list).
- **Natural Language to SQL** – Uses the Gemini 2.5 Flash LLM to generate SQL queries that respect the database schema.
- **SQL Validation** – Queries are validated and corrected before execution to ensure correctness and security.
- **Human-Readable Answers** – Query results are converted into natural language summaries for easy understanding.
- **Modular Workflow** – The backend uses a state graph workflow:
  `query_gen → query_check → execute_query → nl_output`
  - `query_gen`: Generate SQL from natural language.
  - `query_check`: Validate and correct SQL.
  - `execute_query`: Run SQL on SQLite.
  - `nl_output`: Convert results into natural language.
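As a rough illustration of the CSV Upload feature, the snippet below shows one way a CSV could be persisted as a SQLite table with pandas. The function name `store_csv_as_table` and the `if_exists="replace"` policy are assumptions for this sketch, not necessarily what `app.py` does.

```python
import sqlite3
import pandas as pd

def store_csv_as_table(csv_path: str, table_name: str, db_path: str = "uploaded.db") -> None:
    """Read a CSV into a DataFrame and persist it as a SQLite table."""
    df = pd.read_csv(csv_path)
    with sqlite3.connect(db_path) as conn:
        # Replace any existing table of the same name so re-uploads stay idempotent.
        df.to_sql(table_name, conn, if_exists="replace", index=False)
```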
- Python 3.11+
- A valid Google Gemini API key
- Clone the repository

```bash
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
```

- Create and activate a virtual environment

```bash
python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
```

- Install dependencies

```bash
pip install -r requirements.txt
```

- Set up environment variables

Create a `.env` file:

```
GOOGLE_API_KEY="YOUR_API_KEY_HERE"
```
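For reference, this is one common way the backend could pick up that key at startup, assuming the `python-dotenv` package is used (an assumption here; check `requirements.txt` for the actual dependency):

```python
import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # read variables from .env into the process environment
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise RuntimeError("GOOGLE_API_KEY is not set")
```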
- Run backend and frontend locally (optional)

In separate terminals:

Backend:

```bash
uvicorn app:app --reload
```

Frontend:

```bash
streamlit run frontend.py
```

In deployed mode, the frontend reads the backend URL from `st.secrets["BACKEND_URL"]`.
| Endpoint | Method | Description |
|---|---|---|
| `/upload` | POST | Upload a CSV file and store it as a table. |
| `/ask` | POST | Ask a natural language question. Returns a human-readable answer. |
| `/schema` | GET | Return the current database schema (tables + columns). |
| `/` | GET | Basic welcome message. |
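To make the endpoints concrete, here is a hedged sketch of calling the deployed backend with `requests`. The multipart field name `file` and the JSON body `{"question": ...}` are assumptions about the request shapes, not documented contracts.

```python
import requests

BASE_URL = "https://sql-agent-mpka.onrender.com"

# Upload a CSV and register it as a table (field name "file" is assumed).
with open("customers.csv", "rb") as f:
    upload = requests.post(f"{BASE_URL}/upload", files={"file": f})
print(upload.json())

# Ask a natural language question (JSON body shape is assumed).
answer = requests.post(
    f"{BASE_URL}/ask",
    json={"question": "List the top 5 customers by total purchase amount."},
)
print(answer.json())

# Inspect the current schema.
print(requests.get(f"{BASE_URL}/schema").json())
```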
- Backend: Deploy on Render (or similar cloud platforms).
- Frontend: Deploy on Streamlit Community Cloud, linking your GitHub repository.
- Secrets: Add `BACKEND_URL` in Streamlit Cloud secrets, pointing to your Render backend.
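For example, the Streamlit Cloud secrets entry would look roughly like this (substitute your own Render URL):

```toml
# .streamlit/secrets.toml (or the Secrets editor in Streamlit Community Cloud)
BACKEND_URL = "https://sql-agent-mpka.onrender.com"
```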
The backend uses a state graph workflow to dynamically generate, validate, and execute SQL:
START → query_gen → query_check → execute_query → nl_output → END
- Dynamic transitions:
  - If the LLM returns an error → regenerate SQL.
  - If the SQL is valid → execute the query.
- Convert query results into natural language for users.
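The following is a minimal, self-contained sketch of that control flow in plain Python. The real nodes call Gemini (and the backend may wire them together with a graph library); here each node is a stand-in so the loop-back-on-error behavior is easy to see.

```python
import sqlite3

# Tiny in-memory database so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, total REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("Alice", 120.0), ("Bob", 75.5)])

def query_gen(question: str) -> str:
    # Real node: prompt the LLM with the schema and question to draft SQL.
    return "SELECT name, total FROM customers ORDER BY total DESC"

def query_check(sql: str) -> dict:
    # Real node: ask the LLM to validate/correct the SQL; here we only
    # check that SQLite can prepare the statement.
    try:
        conn.execute(f"EXPLAIN {sql}")
        return {"sql": sql, "error": None}
    except sqlite3.Error as exc:
        return {"sql": sql, "error": str(exc)}

def execute_query(sql: str) -> list:
    return conn.execute(sql).fetchall()

def nl_output(question: str, rows: list) -> str:
    # Real node: have the LLM summarize the rows; here we format them directly.
    return "Top customers: " + ", ".join(name for name, _ in rows)

def run_workflow(question: str, max_retries: int = 3) -> str:
    # Dynamic transition: on a validation error, loop back to query_gen.
    for _ in range(max_retries):
        checked = query_check(query_gen(question))
        if checked["error"]:
            continue
        return nl_output(question, execute_query(checked["sql"]))
    return "Could not produce a valid SQL query."

print(run_workflow("Who are the top customers by total purchase amount?"))
```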
- Upload a CSV file as a table.
- Ask questions like: *"List the top 5 customers by total purchase amount."*
- Receive SQL-generated results in natural language: *"The top 5 customers are Alice, Bob, Charlie, Dave, and Eve with respective totals of..."*