A lightweight AI assistant service with a Gradio UI, LangGraph workflow orchestration, and RAG retrieval (Chroma via LlamaIndex).
Local development runs with Ollama, while production can use the OpenAI API.
- ✅ Current: Vector RAG (LlamaIndex Retriever → Chroma Vector DB)
- 🔜 Optional: KG-augmented RAG / GraphRAG (Graph Retriever → Neo4j Knowledge Graph)
User (Browser)
│
▼
┌───────────────────────────────┐
│ Gradio UI │
│ - Question input │
│ - Answer + Citations output │
└───────────────┬───────────────┘
│
▼
┌───────────────────────────────┐
│ LangGraph │
│ Workflow Orchestration │
│ - State management │
│ - (optional) retry/branching │
└───────────────┬───────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Retrieval Layer │
│ │
│ (Current) Vector RAG │
│ LlamaIndex Retriever ─────▶ Chroma (Vector DB) │
│ │
│ (Future, Optional) KG-augmented RAG │
│ Graph Retriever ─────────▶ Neo4j (Knowledge Graph) │
└───────────────┬─────────────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ LLM Layer │
│ Prompting + Invocation │
│ │
│ - Local Dev: Ollama │
│ - Production: OpenAI API │
└───────────────┬───────────────┘
│
▼
Answer + Citations
This project does not fine-tune the model.
Your documents are indexed into a vector database and later retrieved and injected into the prompt at question time.
- ✅ “Remembers” via retrieval (Chroma)
- ❌ Does not change model weights (no training / fine-tuning)
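A minimal sketch of what "injected into the prompt at question time" means; the template wording is an assumption, not the repo's actual prompt (that lives in `graph.py`):

```python
# Hypothetical prompt template; the real wording is defined in graph.py / node_generate.
PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}

End your answer with a single "Citations:" block listing the sources you used.
"""

def build_prompt(question: str, context: str) -> str:
    # Retrieved chunks are pasted into the prompt; model weights are never updated.
    return PROMPT_TEMPLATE.format(context=context, question=question)
```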
Typical files in this repo:
.
├─ app.py # Gradio UI entrypoint
├─ graph.py # LangGraph workflow (decide → retrieve → generate)
├─ rag_vector.py # Vector RAG retrieval (Chroma via LlamaIndex)
├─ rag_graph.py # (Optional) GraphRAG retrieval (Neo4j)
├─ llm_factory.py # LLM provider factory (Ollama vs OpenAI)
├─ ingest.py # One-time (or on-change) docs → Chroma indexing
└─ data/docs/ # Your knowledge base source documents (md, txt, etc.)
Indexes your knowledge base into Chroma.
- Reads files under `data/docs/`
- Chunks documents and computes embeddings
- Upserts embeddings into Chroma (persistent storage)
Run this before `app.py` (and again whenever docs change).
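A minimal ingestion sketch using LlamaIndex with a persistent Chroma collection. Import paths follow recent `llama-index` releases; the directory, collection name, and embedding model mirror the example `.env` below and are otherwise assumptions:

```python
import chromadb
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

# Persistent Chroma store on disk (matches CHROMA_DIR / COLLECTION_NAME).
client = chromadb.PersistentClient(path="./storage/chroma")
collection = client.get_or_create_collection("mechatbot")
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Read the knowledge base, chunk it, embed each chunk, and upsert into Chroma.
documents = SimpleDirectoryReader("data/docs").load_data()
embed_model = OllamaEmbedding(model_name="nomic-embed-text")
VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, embed_model=embed_model
)
```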
Query-time retrieval against Chroma:
- Connects to Chroma (`CHROMA_DIR`, `COLLECTION_NAME`)
- Retrieves top-k relevant chunks using the LlamaIndex retriever
- Returns: `context` (joined passages) and `sources` (source path + score per passage)
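A sketch of the query-time side, assuming the same Chroma collection and embedding model as the ingestion step; the `retrieve` helper name is illustrative:

```python
import chromadb
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

client = chromadb.PersistentClient(path="./storage/chroma")   # CHROMA_DIR
collection = client.get_or_create_collection("mechatbot")     # COLLECTION_NAME
vector_store = ChromaVectorStore(chroma_collection=collection)
index = VectorStoreIndex.from_vector_store(
    vector_store, embed_model=OllamaEmbedding(model_name="nomic-embed-text")
)
retriever = index.as_retriever(similarity_top_k=4)

def retrieve(question: str):
    """Return (context, sources) for the top-k chunks most similar to the question."""
    nodes = retriever.retrieve(question)
    context = "\n\n".join(n.node.get_content() for n in nodes)
    sources = [(n.node.metadata.get("file_path", "unknown"), n.score) for n in nodes]
    return context, sources
```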
GraphRAG retrieval for entity/relationship-centric queries:
- Uses Neo4j as a Knowledge Graph store
- Produces additional `context` + `sources` to merge with vector results
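Since this part is optional and not yet wired in, the following is only a rough sketch of entity-centric retrieval with the official `neo4j` driver; the connection details, graph schema, and Cypher query are all assumptions:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def graph_retrieve(entity: str):
    """Return (context, sources) built from relationships around one entity."""
    # Assumed schema: nodes carry a `name` property; any relationship type counts.
    cypher = (
        "MATCH (e {name: $name})-[r]-(n) "
        "RETURN e.name AS subject, type(r) AS rel, n.name AS object LIMIT 25"
    )
    with driver.session() as session:
        rows = session.run(cypher, name=entity).data()
    context = "\n".join(f"{r['subject']} {r['rel']} {r['object']}" for r in rows)
    sources = [("neo4j", None) for _ in rows]
    return context, sources
```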
LangGraph orchestration:
- `node_decide`: switches GraphRAG on/off via `USE_GRAPH_RAG`
- `node_vector_retrieve`: runs Vector RAG
- `node_graph_retrieve`: runs GraphRAG (optional)
- `node_generate`: prompts the LLM using retrieved context and forces exactly one `Citations:` block in the final output
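A hedged sketch of the workflow wiring. The state fields, routing logic, and the imports from the repo modules (`retrieve`, `graph_retrieve`, `make_llm`) are assumptions; the real `graph.py` may differ:

```python
import os
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

from rag_vector import retrieve          # assumed helper names from this repo
from rag_graph import graph_retrieve
from llm_factory import make_llm

llm = make_llm()

class RagState(TypedDict, total=False):
    question: str
    use_graph: bool
    context: str
    sources: list
    answer: str

def node_decide(state: RagState) -> RagState:
    # Flip GraphRAG on/off from the environment (USE_GRAPH_RAG).
    return {"use_graph": os.getenv("USE_GRAPH_RAG", "false").lower() == "true"}

def node_vector_retrieve(state: RagState) -> RagState:
    context, sources = retrieve(state["question"])
    return {"context": context, "sources": sources}

def node_graph_retrieve(state: RagState) -> RagState:
    extra, extra_sources = graph_retrieve(state["question"])
    return {"context": state["context"] + "\n" + extra,
            "sources": state["sources"] + extra_sources}

def node_generate(state: RagState) -> RagState:
    prompt = (
        "Answer using only this context:\n\n"
        f"{state['context']}\n\nQuestion: {state['question']}\n\n"
        "Finish with exactly one Citations: block."
    )
    return {"answer": llm.invoke(prompt).content}

builder = StateGraph(RagState)
builder.add_node("decide", node_decide)
builder.add_node("vector_retrieve", node_vector_retrieve)
builder.add_node("graph_retrieve", node_graph_retrieve)
builder.add_node("generate", node_generate)
builder.add_edge(START, "decide")
builder.add_edge("decide", "vector_retrieve")
builder.add_conditional_edges(
    "vector_retrieve",
    lambda s: "graph_retrieve" if s.get("use_graph") else "generate",
)
builder.add_edge("graph_retrieve", "generate")
builder.add_edge("generate", END)
app_graph = builder.compile()
```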
Provider switching:
- Local dev: `ChatOllama(...)` (requires Ollama running)
- Production: OpenAI chat model (requires `OPENAI_API_KEY`)
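A possible factory, assuming the LangChain chat wrappers (`langchain_ollama` / `langchain_openai`); the OpenAI model name is a placeholder:

```python
import os

def make_llm():
    """Return a chat model for the configured provider (LLM_PROVIDER)."""
    if os.getenv("LLM_PROVIDER", "ollama").lower() == "openai":
        from langchain_openai import ChatOpenAI
        # Reads OPENAI_API_KEY from the environment; the model name is an assumption.
        return ChatOpenAI(model="gpt-4o-mini", temperature=0)
    from langchain_ollama import ChatOllama
    return ChatOllama(
        model=os.getenv("OLLAMA_MODEL", "llama3.1:8b"),
        base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
        temperature=0,
    )
```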
Gradio UI:
- Chat input
- Chat output with citations
- CSS tuned to avoid “double scroll” (one scroll area inside the chat only)
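A minimal UI sketch with `gr.ChatInterface`; the CSS tweak is omitted, and `app_graph` refers to the compiled workflow from the LangGraph sketch above (an assumed import, not necessarily how `app.py` is structured):

```python
import gradio as gr

from graph import app_graph  # assumed: the compiled LangGraph workflow

def chat_fn(message, history):
    # Run the workflow and return the answer, which already ends with a Citations: block.
    result = app_graph.invoke({"question": message})
    return result["answer"]

demo = gr.ChatInterface(fn=chat_fn, title="RAG Assistant")

if __name__ == "__main__":
    demo.launch()
```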
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

ollama serve
# Pull your chat model (example)
ollama pull llama3.1:8b
# Pull embedding model used by ingest (example)
ollama pull nomic-embed-text

python ingest.py

python app.py

Minimal local dev:
LLM_PROVIDER=ollama
USE_GRAPH_RAG=false
CHROMA_DIR=./storage/chroma
COLLECTION_NAME=mechatbot
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_EMBED_MODEL=nomic-embed-text
MAX_CITATIONS=2

Production (OpenAI):
LLM_PROVIDER=openai
OPENAI_API_KEY=YOUR_KEY
⚠️ Never commit `.env` to Git.
Ollama is not running:
ollama serve

Pull the model name you configured:
ollama pull llama3.1:8b

- Lower `top_k` in `graph.py` / `rag_vector.py`
- Use `MAX_CITATIONS=2` (or 1)
- Add a similarity threshold in `rag_vector.py` (optional; see the sketch after this list)
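One way to add the optional threshold in `rag_vector.py`: drop retrieved nodes whose similarity score falls below a cutoff. The value 0.35 is only an example and should be tuned against your own data:

```python
MIN_SIMILARITY = 0.35  # example cutoff; tune for your documents and embedding model

def filter_by_score(nodes, min_score=MIN_SIMILARITY):
    """Keep only retrieved nodes with a similarity score at or above the cutoff."""
    return [n for n in nodes if n.score is not None and n.score >= min_score]

# Usage inside retrieve(): nodes = filter_by_score(retriever.retrieve(question))
```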
- RAG indexing stores your docs locally (Chroma).
- If you use OpenAI in production, retrieved text may be sent to the API at inference time.
- With Ollama (local), inference stays on your machine.
Add your preferred license here.
To reset the local Chroma index:

rm -rf ./storage/chroma