This project implements a RAG (Retrieval-Augmented Generation) AI assistant that searches your organisation's trusted content and answers questions from it. It delivers more accurate, relevant responses grounded in your data than a general-purpose LLM app like ChatGPT.
RAG has become the most common application of AI in enterprise environments. This project focuses on two core features: data privacy and observability.
- Option to run fully on-prem (locally), with no calls leaving the intranet. For decent performance, this requires a dedicated server specced for large open-source models.
- Frontier cloud models can also be used. Privacy is then preserved by anonymizing the data in every request sent to the cloud and re-inserting the redacted values into the responses, as sketched below.
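A minimal sketch of that round trip, using a single regex as a stand-in for real PII detection (the helper names are illustrative, not this project's API):

```python
import re

# Illustrative redact -> cloud query -> restore round trip.
# A real deployment would use a proper PII detection library, not one regex.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w.-]+\.\w+")

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a placeholder and remember the mapping."""
    mapping: dict[str, str] = {}
    def _sub(match: re.Match) -> str:
        placeholder = f"<PII_{len(mapping)}>"
        mapping[placeholder] = match.group(0)
        return placeholder
    return EMAIL_RE.sub(_sub, text), mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Re-insert the original values into the model's response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

prompt, mapping = redact("Summarise the last email from alice@example.com.")
# Only the redacted prompt ever leaves the intranet.
response = f"The last email from {list(mapping)[0]} asks about invoices."  # stand-in for the cloud call
print(restore(response, mapping))  # placeholders swapped back to the real values
```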
Observability covers both answer-quality metrics (accuracy, completeness, groundedness / hallucination rate, relevance) and operational metrics such as cost, latency, and speed. The project provides dashboards that help an admin determine the best combination of LLM models and settings for their data and organisational constraints. This is an ever-improving area of RAG, including in this project.
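As a hypothetical illustration (field names are not this project's schema), the per-query record behind such a dashboard could look like:

```python
from dataclasses import dataclass

@dataclass
class QueryEval:
    """One evaluated query; hypothetical fields, for illustration only."""
    model: str            # LLM that produced the answer
    groundedness: float   # share of answer claims supported by retrieved chunks
    relevance: float      # how well retrieved chunks match the question
    latency_ms: float     # end-to-end response time
    cost_usd: float       # per-query spend (zero for local models)

runs = [
    QueryEval("gemma3:4b", groundedness=0.84, relevance=0.79, latency_ms=4200, cost_usd=0.0),
    QueryEval("cloud-frontier-model", groundedness=0.93, relevance=0.88, latency_ms=1600, cost_usd=0.004),
]
# A dashboard aggregates many such records to compare model/setting combinations.
best = max(runs, key=lambda r: r.groundedness)
print(best.model)
```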
- Backend: Python, FastAPI, Redis, Celery (async processing)
- RAG Pipeline: Docling, LlamaIndex
- Vector DB: ChromaDB
- Search: Hybrid (BM25 + Vector + RRF; see the sketch after this list)
- LLM: Ollama (local) or cloud providers (OpenAI, Anthropic, etc.)
- Infrastructure: Docker compose
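As referenced in the Search item above, here is a minimal sketch of Reciprocal Rank Fusion (RRF), which merges the BM25 and vector rankings; the function and document ids are illustrative, not this project's code:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant suggested in the original RRF paper.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]       # keyword (BM25) ranking
vector_hits = ["doc1", "doc9", "doc3"]     # embedding-similarity ranking
print(rrf_fuse([bm25_hits, vector_hits]))  # ['doc1', 'doc3', 'doc9', 'doc7']
```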
- Docker - Docker Desktop, OrbStack, or Podman
- Ollama - For local AI models (optional if using cloud only)
- 4GB RAM - Minimum for local development (expect slow inference)
- 2GB disk - For models and development data
This is a development/research project, not production-ready software. It lacks authentication, enterprise security, monitoring, and high-availability features, to name some main ones.
This project is developed using Claude Code (Anthropic) as the primary coding assistant. OpenAI GPT and Google Gemini models are also used to explore alternative implementations.
All code is reviewed, tested (TDD), and validated for correctness and security.
macOS:

```bash
# Install Ollama
brew install ollama

# Download AI models
ollama pull gemma3:4b
ollama pull nomic-embed-text

# Install Docker
brew install orbstack  # or Docker Desktop if you prefer
```

Linux:
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Download AI models
ollama pull gemma3:4b
ollama pull nomic-embed-text
```

```bash
# Clone the repository
git clone https://github.com/gittycat/ragbench.git
cd ragbench
```

Review and edit config.yml for your preferences.
Also edit secrets/.env with your API keys if using cloud models.

```bash
cp secrets/.env.example secrets/.env
```

```bash
# Start Ollama (if not already running)
ollama serve &

# Start RAG Lab
docker compose up -d
```

Open http://localhost:8000 in your browser.
To stop the services:

```bash
docker compose down
```

- Go to Documents page
- Click Upload and select files
- Wait for processing (progress bar shows status)
Supported formats: PDF, DOCX, PPTX, XLSX, TXT, Markdown, HTML, AsciiDoc
- Go to Chat page
- Type your question
- Get AI-powered answers with source citations
- View all uploaded documents in Documents page
- Delete documents you no longer need
- Start new chat sessions anytime
To reset all data (removes the Docker volumes):

```bash
docker compose down -v
docker compose up -d
```

For development setup, testing, and technical documentation, see DEVELOPMENT.md.
Built on the shoulders of a multitude of great open source projects. MIT License.