Upload spreadsheets. Ask in plain English. Watch 3 AI agents find the answer — live.
Built solo in 48 hours for the Vibe Coding Hackathon 2026. Powered by NVIDIA NIM.
Most analytics tools make you drag and drop, write formulas, or learn SQL. MergeAI doesn't. You upload your CSV files, type a question like "which department spends the most on training?", and three AI agents collaborate in real time to find the answer. No setup. No mapping. No SQL.
You upload two spreadsheets that have never seen each other. One has employee data, the other has training records. You type: "Compare training cost by department."
Here's what happens — and you watch it happen live:
Schema Map — see your table connections to make join queries:
┌───────────────┐       ┌───────────────┐       ┌─────────────────┐
│ Schema Agent  │ ────→ │   SQL Agent   │ ────→ │    Validator    │
│  (Nano 8B)    │       │ (253B Ultra)  │       │ (Deterministic) │
│               │ ←──── │               │ ←──── │                 │
│  Finds joins  │ retry │  Writes SQL   │ retry │ Checks results  │
└───────────────┘       └───────────────┘       └─────────────────┘
- Schema Agent reads both files, understands the columns, and spots that `EmpID` in one file matches `Employee ID` in the other
- SQL Agent writes a real PostgreSQL query (CTEs, JSONB extraction, proper JOINs) to merge the data across files
- Validator executes the query and checks the results. Zero rows? Case mismatch? Wrong column? It sends feedback and the agents retry. Up to 3 rounds of self-correction.
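The Validator's checks can be sketched as a pure function. This is a minimal illustration, assuming hypothetical `QueryOutcome` and `Verdict` shapes; the project's actual types and full list of checks are not shown in this README.

```typescript
// Hypothetical shapes for illustration only.
type QueryOutcome =
  | { kind: "error"; message: string }
  | { kind: "rows"; rows: Record<string, unknown>[] };

type Verdict = { ok: true } | { ok: false; feedback: string };

// Deterministic checks: no model call is involved at this step.
function validate(outcome: QueryOutcome): Verdict {
  if (outcome.kind === "error") {
    // Pass the database error back so the agents can correct the query.
    return { ok: false, feedback: `Query failed: ${outcome.message}` };
  }
  if (outcome.rows.length === 0) {
    return {
      ok: false,
      feedback: "Query returned 0 rows; check join keys and value casing.",
    };
  }
  return { ok: true };
}
```

Because the function has no side effects, the same feedback string can be logged, shown in the UI, and handed to the agents for the next round.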
Results appear in a clean table with a plain English summary:
"The data shows a distribution of training outcomes, with the most frequent being 'Incomplete' at 775 instances, followed by 'Completed' at 770, 'Passed' at 739, and the least frequent being 'Failed' at 716."
The whole thing takes about 8 seconds.
Pie Chart — Training Outcome Distribution:
| Tool | What You Need To Do |
|---|---|
| Tableau | Manually drag-and-drop join configuration |
| Power BI | Create composite data models |
| Looker | Write LookML definitions |
| ChatGPT | Hope that in-memory pandas doesn't crash |
| MergeAI | Type one sentence |
Your data lives in a real PostgreSQL database. The queries are real SQL. The joins are real joins. Click "View SQL" to see exactly what the AI wrote — full transparency.
Table Preview — click any file name to browse your data before querying:
Not a single LLM call that hopes for the best. Three specialized agents with a feedback loop:
Round 1: Schema Agent analyzes → SQL Agent generates → Validator checks
↓ (if 0 rows or errors)
Round 2: Schema Agent re-analyzes with feedback → SQL Agent regenerates → Validator re-checks
↓ (if still failing)
Round 3: Final attempt with accumulated context → Best-effort result
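The three-round loop above might look roughly like the sketch below. The `Pipeline` interface and `answer` function are hypothetical names introduced here, and the agents are kept synchronous for brevity; the real calls would be asynchronous network requests.

```typescript
// Illustrative only: the agents are injected as plain functions.
interface Pipeline {
  analyzeSchema(question: string, feedback: string[]): string; // returns join hints
  generateSql(question: string, hints: string, feedback: string[]): string;
  check(sql: string): { ok: boolean; feedback: string };
}

const MAX_ROUNDS = 3;

function answer(
  question: string,
  p: Pipeline,
): { sql: string; rounds: number; ok: boolean } {
  const feedback: string[] = [];
  let sql = "";
  for (let round = 1; round <= MAX_ROUNDS; round++) {
    const hints = p.analyzeSchema(question, feedback);
    sql = p.generateSql(question, hints, feedback);
    const verdict = p.check(sql);
    if (verdict.ok) return { sql, rounds: round, ok: true };
    feedback.push(verdict.feedback); // accumulated context for the next round
  }
  // Best effort: return the last query even though validation never passed.
  return { sql, rounds: MAX_ROUNDS, ok: false };
}
```

Passing the whole feedback array to both agents on every round is one way to get the "accumulated context" of the final attempt.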
| Agent | Model | Why |
|---|---|---|
| Schema Agent | Nemotron Nano 8B | Fast schema analysis, JSON output, ~200ms |
| SQL Agent | Nemotron Ultra 253B | Most accurate SQL generation, handles complex CTEs |
| Summary Agent | Nemotron Nano 8B | Quick NL summary of results |
Per NVIDIA docs, the "detailed thinking off" system prompt disables reasoning traces, which keeps the SQL output from the 253B model clean.
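As a rough sketch, a request body for the SQL Agent could be assembled like this. The model identifier, the `buildSqlRequest` helper, the user prompt wording, and the temperature are assumptions made for illustration; only the "detailed thinking off" system prompt comes from the text above.

```typescript
type ChatMessage = { role: "system" | "user"; content: string };

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature: number;
}

// Assumed model identifier; confirm the exact ID in the NVIDIA NIM catalog.
const SQL_MODEL = "nvidia/llama-3.1-nemotron-ultra-253b-v1";

// Hypothetical helper that builds an OpenAI-compatible chat request body.
function buildSqlRequest(question: string, schemaHints: string): ChatRequest {
  return {
    model: SQL_MODEL,
    messages: [
      // System prompt that switches off reasoning traces.
      { role: "system", content: "detailed thinking off" },
      {
        role: "user",
        content: `Schema hints:\n${schemaHints}\n\nQuestion: ${question}\nReturn one PostgreSQL query only.`,
      },
    ],
    temperature: 0, // assumption: deterministic decoding for SQL generation
  };
}
```

The returned object could then be passed to any OpenAI-compatible client pointed at the NIM endpoint.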
Every CSV file — any schema, any columns — gets stored the same way:
-- One table handles ALL CSV files
uploaded_rows (
file_id UUID, -- which file
row_data JSONB -- {"Name": "Alice", "Salary": "85000", "Dept": "Engineering"}
)
-- Agent-generated query (real example):
WITH employees AS (
SELECT row_data->>'EmpID' AS emp_id,
row_data->>'DepartmentType' AS dept
FROM uploaded_rows WHERE file_id = 'abc-123'
),
training AS (
SELECT row_data->>'Employee ID' AS emp_id,
(row_data->>'Training Cost')::NUMERIC AS cost
FROM uploaded_rows WHERE file_id = 'def-456'
)
SELECT dept, AVG(cost) AS avg_training_cost
FROM employees JOIN training ON LOWER(employees.emp_id) = LOWER(training.emp_id)
GROUP BY dept ORDER BY avg_training_cost DESC;

Five chart types are generated automatically based on your query: bar, pie, line, scatter, and heatmap.
Heatmap — Average Training Cost by Department and Training Type:
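Returning to the single-table JSONB storage described earlier: each parsed CSV record reduces to a `(file_id, row_data)` pair. The sketch below shows one way to shape rows before insertion, assuming a hypothetical `toUploadedRows` helper and input that has already been split into a header and string records; the project's actual upload code is not shown here.

```typescript
type UploadedRow = { file_id: string; row_data: Record<string, string> };

// Turn a header plus string records into JSONB-ready rows.
// Values stay strings, matching the {"Salary": "85000"} shape in the schema comment.
function toUploadedRows(
  fileId: string,
  header: string[],
  records: string[][],
): UploadedRow[] {
  return records.map((record) => {
    const row_data: Record<string, string> = {};
    header.forEach((column, i) => {
      row_data[column] = record[i] ?? ""; // short rows are padded with empty strings
    });
    return { file_id: fileId, row_data };
  });
}
```

Keeping every value as text is why the agent-generated query casts with `::NUMERIC` before aggregating.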
Server-Sent Events stream agent status to the browser. Framer Motion animates each agent card through states:
idle → active (pulsing blue) → done (green), or retry (orange) → back to active
AG-UI Protocol event naming: agent_start, agent_complete, round_retry, query_complete.
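The event names above map naturally onto SSE frames and card states. The following is a minimal sketch: `formatSse` follows the standard SSE wire format, while the event-to-state mapping in `nextState` is an assumption, since the README does not spell it out.

```typescript
type AgentEvent = "agent_start" | "agent_complete" | "round_retry" | "query_complete";
type CardState = "idle" | "active" | "done" | "retry";

// One SSE frame: an event name, a JSON data line, and a blank-line terminator.
function formatSse(event: AgentEvent, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Assumed mapping from stream events to agent card states.
function nextState(current: CardState, event: AgentEvent): CardState {
  switch (event) {
    case "agent_start":
      return "active";
    case "agent_complete":
      return "done";
    case "round_retry":
      return "retry";
    case "query_complete":
      return current; // the run is over; cards keep their final state
  }
}
```

On the server, each frame would be encoded and enqueued on the `ReadableStream`; in the browser, the reducer would drive the animation state of each card.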
Line Chart — Average Training Cost by Month:
| Layer | Tech | Why |
|---|---|---|
| Vibe Coding | AdaL CLI | AI-assisted development, hackathon workflow |
| Framework | Next.js 15 (App Router) | React 19, server components, API routes |
| AI Models | NVIDIA NIM API | Nemotron 253B + Nano 8B via OpenAI-compatible SDK |
| Database | Neon PostgreSQL | Serverless HTTP mode, zero connection overhead |
| ORM | Drizzle v0.45.1 | Typed JSONB, lightest ORM, SQL-first |
| Streaming | SSE (ReadableStream) | Native, zero deps, real-time agent updates |
| Animation | Framer Motion | Agent card state transitions |
| Auth | Clerk | Sign-up/sign-in in 10 minutes, free tier |
| CSV Parsing | Papa Parse | Client-side, fast, handles any format |
| Deploy | Vercel | Auto-deploy from GitHub |
Demo data is pre-loaded — 3,000 employees + 3,000 training records. Click any example query and watch the agents work.
Then upload your own CSV files. Any schema. Any columns. Any data. The agents figure it out.
git clone https://github.com/lubobali/mergeAI.git
cd mergeAI
npm install

Create .env.local:
DATABASE_URL=your_neon_connection_string
NVIDIA_API_KEY=your_nvidia_nim_key
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_key
CLERK_SECRET_KEY=your_clerk_secret
npx drizzle-kit push # create tables
npm run dev # start dev server

Built for Vibe Coding Hackathon 2026 with AdaL CLI





