A powerful PDF-based Retrieval Augmented Generation (RAG) API built with FastAPI, LangChain, and Google Gemini
Features • Quick Start • API Documentation • Development • Deployment
- Overview
- Features
- Architecture
- Prerequisites
- Installation
- Quick Start
- API Documentation
- Configuration
- Project Structure
- Usage Examples
- Development
- Testing
- Deployment
- Contributing
- Troubleshooting
- License
- Acknowledgments
Querio is a production-ready RESTful API that enables intelligent question-answering over PDF documents using Retrieval Augmented Generation (RAG). Built on modern AI technologies, it provides a robust backend for building knowledge-based applications with natural language interfaces.
- **Fast & Efficient**: Optimized vector search with ChromaDB
- **Production Ready**: Complete with error handling, validation, and comprehensive API docs
- **Frontend Ready**: CORS-enabled REST API for web and mobile frontends
- **Multi-Document**: Process and query across multiple PDF documents
- **Conversational**: Session-based chat with context retention
- **Semantic Search**: Advanced vector similarity search
- **Scalable**: Built with async FastAPI for high performance
**Document Management**
- Upload single or multiple PDF files
- Automatic text extraction and processing
- Document metadata tracking
- Delete and manage documents
**Intelligent Querying**
- RAG-based question answering
- Context-aware responses using Google Gemini
- Configurable retrieval parameters
- Source attribution
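Conceptually, answering a query means retrieving the k most relevant chunks and passing them to Gemini as context. Below is a toy sketch of that flow, with word overlap standing in for real embedding similarity; the function names are illustrative, not Querio's actual API:

```python
def top_k_chunks(query, chunks, k=3):
    """Rank chunks by naive word overlap with the query.

    Toy stand-in for the embedding-based similarity search the real
    pipeline performs against ChromaDB."""
    query_words = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(query_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, chunks, k=3):
    """Assemble the context-stuffed prompt handed to the LLM."""
    context = "\n\n".join(top_k_chunks(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Gemini is a large language model from Google.",
    "ChromaDB stores embedding vectors locally.",
    "Paris is the capital of France.",
]
print(top_k_chunks("What is Gemini", docs, k=1)[0])
# → "Gemini is a large language model from Google."
```

Source attribution then falls out naturally: the retrieved chunks returned alongside the answer are the sources.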
**Conversational AI**
- Session-based chat conversations
- Message history tracking
- Multi-session support
- Context retention across messages
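Context retention of this kind can be modeled as a minimal in-memory session store. The sketch below uses illustrative names; Querio's actual implementation may differ:

```python
from collections import defaultdict
from uuid import uuid4

class SessionStore:
    """Minimal in-memory chat session store (illustrative sketch)."""

    def __init__(self):
        self._sessions = defaultdict(list)

    def create(self):
        """Create a new session and return its id."""
        sid = str(uuid4())
        self._sessions[sid]  # touch the key so the session exists
        return sid

    def add(self, sid, role, text):
        """Append one message to a session's history."""
        self._sessions[sid].append({"role": role, "text": text})

    def history(self, sid):
        """Return a copy of the session's message history."""
        return list(self._sessions[sid])

store = SessionStore()
sid = store.create()
store.add(sid, "user", "Hello")
store.add(sid, "assistant", "Hi!")
print(len(store.history(sid)))  # → 2
```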
**Semantic Search**
- Vector similarity search
- Find similar content
- Retrieve relevant document chunks
- No LLM overhead for pure search
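Pure semantic search skips the LLM call entirely: embed the query, score it against the stored chunk vectors, and return the best matches. A self-contained sketch of the cosine-similarity scoring, with toy 3-dimensional vectors standing in for sentence-transformer embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for sentence-transformer output
query_vec = [0.9, 0.1, 0.0]
chunk_vecs = {
    "chunk_a": [0.8, 0.2, 0.1],
    "chunk_b": [0.0, 0.1, 0.9],
}

# Pick the stored chunk most similar to the query
best = max(chunk_vecs, key=lambda k: cosine(query_vec, chunk_vecs[k]))
print(best)  # → chunk_a
```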
- **Async API**: Built with FastAPI for high concurrency
- **Auto Documentation**: Interactive Swagger UI and ReDoc
- **Type Safety**: Pydantic models for request/response validation
- **Error Handling**: Comprehensive exception handling and logging
- **CORS Support**: Ready for browser-based frontends
- **Health Monitoring**: Health check and statistics endpoints
- **Hot Reload**: Development server with auto-reload
```
┌─────────────┐      ┌──────────────┐      ┌─────────────┐
│   Client    │ ───▶ │   FastAPI    │ ───▶ │   Google    │
│ Application │      │   REST API   │      │  Gemini AI  │
└─────────────┘      └──────────────┘      └─────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │   Document   │
                     │   Service    │
                     └──────────────┘
                            │
                            ▼
                     ┌──────────────┐      ┌─────────────┐
                     │    Vector    │ ───▶ │  ChromaDB   │
                     │    Store     │      │   (Local)   │
                     └──────────────┘      └─────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │ HuggingFace  │
                     │  Embeddings  │
                     └──────────────┘
```
- Framework: FastAPI 0.115.6
- LLM: Google Gemini Pro
- Embeddings: HuggingFace (sentence-transformers)
- Vector DB: ChromaDB 0.5.23
- PDF Processing: PyMuPDF (fitz)
- Text Splitting: LangChain Text Splitters
- Orchestration: LangChain 0.3.13
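The text-splitting step in the stack above breaks each extracted document into fixed-size, overlapping chunks before embedding. A simplified sketch of that idea; real LangChain splitters additionally try to break on sentence and paragraph boundaries:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size chunks.

    Simplified sketch; LangChain's splitters also respect separators
    such as newlines and sentence ends."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

parts = chunk_text("abcdefghij" * 50)  # 500 characters
print(len(parts))  # → 3
```

The overlap keeps a sentence that straddles a chunk boundary fully present in at least one chunk, which improves retrieval quality.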
- Python: 3.10 or higher (3.11+ recommended)
- RAM: Minimum 4GB (8GB+ recommended for large documents)
- Storage: ~2GB for dependencies + space for documents and vector DB
- Google AI API Key: Get from Google AI Studio
```bash
git clone https://github.com/paradocx96/querio.git
cd querio
```

```bash
# Using venv
python -m venv querio-venv

# Activate on Windows
querio-venv\Scripts\activate

# Activate on macOS/Linux
source querio-venv/bin/activate
```

```bash
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```bash
# .env
GENAI_API_KEY=your_google_api_key_here
```

Verify the installation:

```bash
python --version          # Should be 3.10+
pip list | grep fastapi   # Verify FastAPI is installed
```

Start the server:

```bash
cd src
python app.py
```

The server will start on http://localhost:8000
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Health Check: http://localhost:8000/api/health
1. **Upload a PDF Document**

   ```bash
   curl -X POST "http://localhost:8000/api/documents/upload" \
     -F "file=@your_document.pdf"
   ```

2. **Process Documents**

   ```bash
   curl -X POST "http://localhost:8000/api/documents/process"
   ```

3. **Ask a Question**

   ```bash
   curl -X POST "http://localhost:8000/api/query" \
     -H "Content-Type: application/json" \
     -d '{"query": "What is the main topic?", "k": 3}'
   ```
Base URL: `http://localhost:8000`
| Category | Endpoint | Method | Description |
|---|---|---|---|
| System | `/api/health` | GET | Health check |
| System | `/api/stats` | GET | System statistics |
| Documents | `/api/documents/upload` | POST | Upload PDF |
| Documents | `/api/documents` | GET | List all documents |
| Documents | `/api/documents/{id}` | GET | Get document details |
| Documents | `/api/documents/{id}` | DELETE | Delete document |
| Documents | `/api/documents/process` | POST | Process documents |
| Query | `/api/query` | POST | Ask a question |
| Chat | `/api/chat` | POST | Send chat message |
| Chat | `/api/chat/sessions` | GET | List chat sessions |
| Chat | `/api/chat/sessions` | POST | Create session |
| Chat | `/api/chat/sessions/{id}` | GET | Get session |
| Chat | `/api/chat/sessions/{id}` | DELETE | Delete session |
| Chat | `/api/chat/sessions/{id}/history` | GET | Get chat history |
| Search | `/api/search` | POST | Semantic search |
| Search | `/api/search/similar` | POST | Find similar content |
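The search endpoints can be called like the query endpoint. A standard-library sketch; the `query` and `k` field names are assumptions mirrored from `/api/query`, so check the Swagger schema for the actual request body:

```python
import json
import urllib.request

def semantic_search(query, k=3, base_url="http://localhost:8000"):
    """POST a semantic search request to /api/search.

    Field names mirror the /api/query payload (an assumption);
    adjust them to match the actual schema in the Swagger UI."""
    payload = json.dumps({"query": query, "k": k}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# semantic_search("vector databases", k=5)  # requires a running server
```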
Visit http://localhost:8000/docs to:
- Try out endpoints directly
- See request/response schemas
- Test authentication
- View example payloads
For detailed API documentation, see API_DOCUMENTATION.md
| Variable | Required | Default | Description |
|---|---|---|---|
| `GENAI_API_KEY` | Yes | - | Google AI API key |
| `PDF_FOLDER` | No | `data` | PDF storage directory |
| `CHROMA_DB` | No | `chroma_db` | Vector DB directory |
Edit src/config.py to customize:
```python
class Settings:
    GENAI_API_KEY = os.getenv("GENAI_API_KEY")
    PDF_FOLDER = "data"
    CHROMA_DB = "chroma_db"
```

Modify `src/app.py` for:
- CORS origins
- Server host/port
- API metadata
- Middleware
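If you want the optional settings to honor the environment variables from the table above, `config.py` can read them with fallback defaults. This is a sketch, not the shipped implementation:

```python
import os

class Settings:
    """Sketch: environment variables override the built-in defaults."""
    GENAI_API_KEY = os.getenv("GENAI_API_KEY")
    PDF_FOLDER = os.getenv("PDF_FOLDER", "data")
    CHROMA_DB = os.getenv("CHROMA_DB", "chroma_db")

print(Settings.PDF_FOLDER)  # "data" unless PDF_FOLDER is set in the environment
```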
```
querio/
├── src/
│   ├── api/
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   ├── query.py          # Query endpoints
│   │   │   ├── chat.py           # Chat endpoints
│   │   │   ├── documents.py      # Document management
│   │   │   ├── search.py         # Search endpoints
│   │   │   └── system.py         # Health & stats
│   │   ├── __init__.py
│   │   ├── models.py             # Pydantic models
│   │   ├── services.py           # Business logic
│   │   └── dependencies.py       # Dependency injection
│   ├── app.py                    # FastAPI application
│   ├── config.py                 # Configuration
│   ├── pdf_handler.py            # PDF processing
│   ├── text_splitter.py          # Text chunking
│   ├── vector_store.py           # Vector DB operations
│   ├── rag_pipeline.py           # RAG logic
│   └── main.py                   # CLI interface (legacy)
├── data/                         # PDF storage
├── chroma_db/                    # Vector database
├── requirements.txt              # Dependencies
├── .env                          # Environment variables
└── README.md                     # This file
```
Python:

```python
import requests

BASE_URL = "http://localhost:8000"

# Upload document
with open("document.pdf", "rb") as f:
    files = {"file": f}
    response = requests.post(f"{BASE_URL}/api/documents/upload", files=files)
print(f"Uploaded: {response.json()['filename']}")

# Process documents
response = requests.post(f"{BASE_URL}/api/documents/process")
print(f"Processed {response.json()['documents_processed']} documents")

# Query
data = {"query": "What is machine learning?", "k": 3}
response = requests.post(f"{BASE_URL}/api/query", json=data)
print(f"Answer: {response.json()['answer']}")

# Start a chat
data = {"message": "Hello, tell me about the documents"}
response = requests.post(f"{BASE_URL}/api/chat", json=data)
session_id = response.json()["session_id"]
print(f"Response: {response.json()['answer']}")

# Continue conversation
data = {"message": "Tell me more", "session_id": session_id}
response = requests.post(f"{BASE_URL}/api/chat", json=data)
print(f"Response: {response.json()['answer']}")
```

JavaScript:

```javascript
// Upload document
const formData = new FormData();
formData.append('file', fileInput.files[0]);

const uploadResponse = await fetch('http://localhost:8000/api/documents/upload', {
  method: 'POST',
  body: formData
});
const uploadData = await uploadResponse.json();
console.log('Uploaded:', uploadData.filename);

// Process documents
await fetch('http://localhost:8000/api/documents/process', {
  method: 'POST'
});

// Query
const queryResponse = await fetch('http://localhost:8000/api/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'What is AI?',
    k: 3
  })
});
const queryData = await queryResponse.json();
console.log('Answer:', queryData.answer);
```

cURL:

```bash
# Upload document
curl -X POST "http://localhost:8000/api/documents/upload" \
  -F "file=@document.pdf"

# Process documents
curl -X POST "http://localhost:8000/api/documents/process"

# Query
curl -X POST "http://localhost:8000/api/query" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the main topic?", "k": 3}'

# Chat
curl -X POST "http://localhost:8000/api/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me about the documents"}'
```

Run the development server:

```bash
# Install in development mode
pip install -r requirements.txt

# Run with auto-reload
cd src
python app.py

# Or use uvicorn directly
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

Code quality tools:

```bash
# Format code with black
black src/

# Sort imports
isort src/

# Lint with flake8
flake8 src/
```

To add a new endpoint:

1. Define Pydantic models in `src/api/models.py`
2. Add business logic to `src/api/services.py`
3. Create a route file in `src/api/routes/`
4. Include the router in `src/app.py`
Example:
```python
# src/api/routes/new_feature.py
from fastapi import APIRouter

router = APIRouter(prefix="/api/feature", tags=["Feature"])

@router.get("/")
async def get_feature():
    return {"message": "New feature"}
```

```python
# src/app.py
from api.routes import new_feature

app.include_router(new_feature.router)
```

Use the interactive API documentation: http://localhost:8000/docs
```bash
# Test health endpoint
curl http://localhost:8000/api/health

# Test stats
curl http://localhost:8000/api/stats

# Upload test document
curl -X POST http://localhost:8000/api/documents/upload \
  -F "file=@test.pdf"
```

Load testing:

```bash
# Install Apache Bench
apt-get install apache2-utils

# Test with 100 requests, 10 concurrent
ab -n 100 -c 10 http://localhost:8000/api/health
```

Create a `Dockerfile`:
```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY src/ ./src/
COPY .env .

# Create data directories
RUN mkdir -p data chroma_db

EXPOSE 8000

# Run application
CMD ["uvicorn", "src.app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Build and run:

```bash
docker build -t querio-api .
docker run -p 8000:8000 -v $(pwd)/data:/app/data querio-api
```

Create `docker-compose.yml`:
```yaml
version: '3.8'

services:
  api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/data
      - ./chroma_db:/app/chroma_db
    env_file:
      - .env
    restart: unless-stopped
```

Run with:

```bash
docker-compose up -d
```

For production, run behind Gunicorn with Uvicorn workers:

```bash
pip install gunicorn

gunicorn src.app:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 120
```

Create `/etc/systemd/system/querio.service`:
```ini
[Unit]
Description=Querio API Service
After=network.target

[Service]
Type=simple
User=www-data
WorkingDirectory=/opt/querio
Environment="PATH=/opt/querio/venv/bin"
ExecStart=/opt/querio/venv/bin/uvicorn src.app:app --host 0.0.0.0 --port 8000
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl enable querio
sudo systemctl start querio
sudo systemctl status querio
```

Nginx reverse proxy:

```nginx
server {
    listen 80;
    server_name api.yourdomain.com;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Contributions are welcome! Please follow these guidelines:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
- Follow PEP 8 style guide
- Add docstrings to all functions
- Update documentation for new features
- Write meaningful commit messages
- Test your changes before submitting
- Be respectful and inclusive
- Provide constructive feedback
- Focus on what is best for the community
- Show empathy towards other contributors
**Error: Vector store not initialized**

Solution: Upload documents, then call `/api/documents/process`.

**ModuleNotFoundError: No module named 'fastapi'**

Solution: Ensure the virtual environment is activated and dependencies are installed:

```bash
pip install -r requirements.txt
```

**Error: Failed to configure Google LLM**

Solution: Check that `GENAI_API_KEY` is set in the `.env` file.

**Error: Address already in use**

Solution: Change the port in `app.py`, or kill the process using port 8000:

```bash
# Windows
netstat -ano | findstr :8000
taskkill /PID <PID> /F

# Linux/Mac
lsof -i :8000
kill -9 <PID>
```

**Failed to send telemetry event**

Solution: These ChromaDB warnings are harmless. Update to the latest version:

```bash
pip install -U langchain-chroma
```

- Check the API Documentation
- Open an Issue
- Contact: navindadev@gmail.com
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2025 Navinda Chandrasiri
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- FastAPI - Modern web framework
- LangChain - LLM orchestration
- Google Gemini - Large language model
- ChromaDB - Vector database
- HuggingFace - Embeddings models
- PyMuPDF - PDF processing
This project was inspired by the need for a production-ready RAG API that could be easily integrated into any application requiring intelligent document querying.
- The FastAPI community for excellent documentation
- LangChain team for RAG abstractions
- Google for providing Gemini API
- All contributors and users of this project
Navinda Chandrasiri
- GitHub: @paradocx96
- Email: navindadev@gmail.com
- LinkedIn: Profile
- Report bugs via GitHub Issues
- Request features via GitHub Discussions
- General inquiries: navindadev@gmail.com

Built with ❤️ by paradocx96