Klippy 📎

Paperclips Have Organized Your Documents Since 1867.
Now They Understand Them.

A text-centric multimodal RAG system with knowledge graph reasoning for sub-4B LLMs



🎯 What is Klippy?

Klippy is a local-first AI assistant that remembers everything. It combines:

  • πŸ” Multimodal RAG ; Search across text, images, and audio with one query
  • 🧠 Knowledge Graph Reasoning ; Neo4j-powered ontological reasoning for complex queries
  • ⚑ Prompt Chaining ; Google ADK agents orchestrate multi-step reasoning pipelines
  • πŸ’Ύ Personal Memory ; Your data stays local, your assistant gets smarter

Architecture Philosophy: "Graph-as-Brain, LLM-as-Mouth". All reasoning is pre-computed via deterministic graph logic; the LLM only narrates the answer. This allows sub-4B models like Qwen2.5:3b to perform like much larger models.


✨ Features

🎨 Multimodal Ingestion

  • πŸ“ Text ; TXT, MD, PDF, code files
  • πŸ–ΌοΈ Images ; BLIP captioning + Tesseract OCR
  • 🎡 Audio ; Whisper transcription
  • πŸ”„ Differential Updates ; Skip unchanged files via content hashing
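The content-hash skip can be sketched roughly like this (a minimal illustration; `content_hash` and `needs_reingest` are made-up names, not Klippy's actual internals):

```python
import hashlib
from pathlib import Path

def content_hash(path: Path) -> str:
    """SHA-256 of the file's bytes; identical bytes yield an identical digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):  # stream large files
            h.update(chunk)
    return h.hexdigest()

def needs_reingest(path: Path, seen: dict[str, str]) -> bool:
    """Skip files whose digest matches what a previous run recorded."""
    digest = content_hash(path)
    if seen.get(str(path)) == digest:
        return False              # unchanged: skip re-embedding
    seen[str(path)] = digest      # new or modified: record and re-ingest
    return True
```

A real ingest loop would persist the `seen` mapping between runs so restarts stay cheap.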

πŸ” Unified Search

  • Single vector space for ALL content (384-dim)
  • One query searches text, images, and audio
  • Filter by content type, source, or entity
  • HNSW indexing for sub-millisecond search

🧠 Knowledge Graph Reasoning

  • Entity extraction with spaCy NER
  • Relationship mining (Subject → Predicate → Object)
  • Multi-hop graph traversal
  • Causal chain analysis
  • Pre-computed reasoning chains for small LLMs
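The "Graph-as-Brain" idea can be sketched with a toy triple store: the multi-hop walk is deterministic, and only the resulting hop-by-hop chain is handed to the LLM to narrate. The data and function names below are illustrative, not Klippy's Neo4j code:

```python
from collections import deque

# Toy (subject, predicate, object) triples standing in for the Neo4j graph.
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "acquired", "Initech"),
    ("Initech", "based_in", "Austin"),
]

def neighbors(node: str) -> list[tuple[str, str]]:
    return [(p, o) for s, p, o in TRIPLES if s == node]

def reasoning_chain(start: str, goal: str, max_hops: int = 3):
    """Breadth-first multi-hop traversal; returns the hop list or None."""
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue
        for pred, obj in neighbors(node):
            queue.append((obj, path + [(node, pred, obj)]))
    return None

# reasoning_chain("Alice", "Austin") walks three hops:
# Alice -works_at-> Acme -acquired-> Initech -based_in-> Austin
```

Because the chain is computed before the LLM is invoked, the model only has to verbalize it, which is what lets a 3B model stay accurate.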

🤖 Agent Architecture (Google ADK)

  • Orchestrator Agent: routes and coordinates
  • Qdrant Agent: semantic search specialist
  • Neo4j Agent: knowledge graph queries
  • Prompt Chain Agent: multi-step reasoning pipeline

🖥️ Desktop App (Electron)

  • Native Windows/macOS/Linux app
  • Real-time chat interface
  • Source file previews with images
  • Confidence scores and entity badges

βš™οΈ Context-Aware Responses

  • Ontological enrichment before RAG
  • Smart context truncation for small LLMs
  • Confidence scoring per reasoning step
  • Source attribution with file paths
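Smart context truncation for a small model might rank retrieved chunks by confidence and keep only what fits the window. A rough sketch (the function name and character budget are illustrative; a real implementation would count tokens with the model's tokenizer):

```python
def truncate_context(chunks: list[tuple[str, float]], budget: int = 1500) -> str:
    """Keep the highest-confidence chunks that fit a small LLM's context budget."""
    ranked = sorted(chunks, key=lambda c: c[1], reverse=True)  # best first
    kept, used = [], 0
    for text, _score in ranked:
        if used + len(text) > budget:
            continue                 # too big for the remaining budget
        kept.append(text)
        used += len(text)
    return "\n\n".join(kept)
```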

πŸ—οΈ Architecture

Klippy Architecture

Data Sources


πŸ› οΈ Tech Stack

Category Technologies Purpose
Backend FastAPI Python Pydantic REST API, validation, async I/O
Frontend Electron React TypeScript Vite Desktop app, chat UI
Vector DB Qdrant Semantic search, HNSW indexing
Graph DB Neo4j Knowledge graph, Cypher queries
LLM Ollama Qwen Local inference
Agents Google ADK LiteLLM Agent orchestration
Embeddings MiniLM Text embeddings (384-dim)
Vision BLIP Tesseract Image captioning, OCR
Speech Whisper Speech-to-text
NLP spaCy NER, relationship extraction
DevOps Docker uv Containers, fast packaging

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • Docker Desktop (for Qdrant & Neo4j)
  • Ollama (for local LLM)
  • Node.js 18+ (for frontend)
  • uv package manager (recommended)

1️⃣ Clone & Install

git clone https://github.com/Rayen-Hamza/Klippy.git
cd Klippy

# Install Python dependencies
uv sync

# Download spaCy model
uv run python -m spacy download en_core_web_sm

2️⃣ Start Databases

docker compose up -d

This starts:

  • Qdrant on localhost:6333 (Vector DB)
  • Neo4j on localhost:7474 (Graph DB, password: changeme)

3️⃣ Start Ollama

# In a separate terminal (or it may already be running)
ollama serve

# Pull the model (first time only)
ollama pull qwen2.5:3b

4️⃣ Start the API

uv run uvicorn app.main:app --host 0.0.0.0 --port 8000

📚 API docs: http://localhost:8000/docs

5️⃣ Start the Desktop App (Optional)

cd frontend
npm install
npm start

📡 API Reference

Ingestion

| Endpoint          | Method | Description               |
|-------------------|--------|---------------------------|
| /ingest/text      | POST   | Upload text/PDF/markdown  |
| /ingest/image     | POST   | Upload image → caption + OCR |
| /ingest/audio     | POST   | Upload audio → transcribe |
| /ingest/directory | POST   | Batch ingest from path    |
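As an illustration, a dependency-free client could build the multipart upload for /ingest/text like this. The form field name `file` is an assumption about the schema; the interactive /docs page is authoritative:

```python
import mimetypes
import urllib.request
from pathlib import Path

API = "http://localhost:8000"  # the API started in the Quick Start

def ingest_text_request(path: Path) -> urllib.request.Request:
    """Build a multipart/form-data POST for /ingest/text."""
    boundary = "klippy-boundary"
    ctype = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {ctype}\r\n\r\n"
    ).encode() + path.read_bytes() + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{API}/ingest/text",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )

# urllib.request.urlopen(ingest_text_request(Path("notes.md")))  # send when the API is up
```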

Search

| Endpoint                  | Method | Description                        |
|---------------------------|--------|------------------------------------|
| /search                   | POST   | Unified search across all modalities |
| /search/by-type/{type}    | POST   | Filter by text/image/audio         |
| /search/filters/by-entity | GET    | Find content by entity             |

Agents (Google ADK)

| Endpoint        | Method | Description                  |
|-----------------|--------|------------------------------|
| /agent/chat     | POST   | Chat with orchestrator agent |
| /agent/sessions | GET    | List active sessions         |
| /agent/agents   | GET    | List available agents        |

Reasoning

| Endpoint          | Method | Description                  |
|-------------------|--------|------------------------------|
| /reasoning/query  | POST   | Graph reasoning → LLM prompt |
| /reasoning/ingest | POST   | Ingest document to graph     |

🧪 Testing

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=app

# Run specific test
uv run pytest tests/test_embeddings.py -v

πŸ“ Project Structure

Klippy/
├── app/                          # FastAPI backend
│   ├── agents/                   # Google ADK agents
│   │   ├── orchestrator.py       # Root agent (router)
│   │   ├── qdrant_agent.py       # Vector search specialist
│   │   ├── neo4j_agent.py        # Knowledge graph specialist
│   │   └── prompt_chain.py       # Prompt chaining pipeline
│   ├── models/                   # Pydantic models
│   ├── routes/                   # API endpoints
│   ├── services/                 # Business logic
│   │   ├── embeddings/           # Embedding strategies
│   │   ├── processing/           # Text/Image/Audio processors
│   │   └── storage/              # Qdrant manager
│   └── config.py                 # Settings
├── frontend/                     # Electron desktop app
│   └── src/
│       ├── main/                 # Electron main process
│       └── renderer/             # React UI
├── tests/                        # pytest tests
├── docker-compose.yml            # Qdrant + Neo4j
└── pyproject.toml                # Python dependencies

🔧 Configuration

Create a .env file:

# Qdrant
QDRANT_HOST=localhost
QDRANT_PORT=6333

# Neo4j
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=changeme

# Ollama (Local LLM)
LLM_PROVIDER=ollama
LLM_MODEL=qwen2.5:3b
LLM_BASE_URL=http://localhost:11434/v1

# Processing
TEXT_CHUNK_SIZE=512
TEXT_CHUNK_OVERLAP=50
LOG_LEVEL=INFO
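TEXT_CHUNK_SIZE and TEXT_CHUNK_OVERLAP drive sliding-window chunking. A simplified character-level version (the real processor may split on tokens or sentence boundaries instead):

```python
def chunk_text(text: str, size: int = 512, overlap: int = 50) -> list[str]:
    """Sliding windows of `size` chars, each overlapping the previous by `overlap`."""
    step = size - overlap                     # advance this far per chunk
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# A 1000-char document yields 3 chunks of up to 512 chars with 50-char overlaps.
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side.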

πŸ† Hackathon Team

CrèmeTartinéDangereuse
Built with πŸ’” and β˜•

📜 License

MIT License; see LICENSE for details.


⭐ Star this repo if you find it useful!
