thebinij/RAG-project-learning
RAG Project Learning - FastAPI AI Assistant with Vector Search

A powerful Retrieval-Augmented Generation (RAG) system built with FastAPI and open-source technologies, featuring an interactive chat interface that can answer questions based on your document knowledge base.

🚀 Features

  • FastAPI Backend: Modern, fast, async-first web framework with automatic API documentation
  • Interactive Chat Interface: Web-based chat UI with streaming responses
  • Vector Database: ChromaDB for efficient semantic search
  • Document Processing: Automatic chunking and embedding of documents
  • Semantic Search: Sentence transformers for intelligent document retrieval
  • API Documentation: Interactive Swagger UI and ReDoc with configurable endpoints
  • Production Ready: Structured logging, health checks, Docker deployment
  • Open Source: Built entirely with open-source technologies to avoid vendor lock-in
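Under the hood, semantic search ranks stored document embeddings by their similarity to the query embedding. A minimal sketch of that ranking step, using toy vectors in place of real sentence-transformers output (vector values and function names here are illustrative, not the project's actual API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, top_k=2):
    """Return indices of the top_k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:top_k]

# Toy 3-dimensional "embeddings" standing in for sentence-transformers output.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.05, 0.0]
print(retrieve(query, docs))  # → [0, 2]: the two documents closest to the query
```

In the real system, ChromaDB performs this ranking over persisted embeddings; the retrieved chunks are then passed to the LLM as context.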

๐Ÿ—๏ธ Project Structure

rag-project-learning/
├── app/                          # Main FastAPI package
│   ├── __init__.py               # Package initialization
│   ├── app.py                    # FastAPI application definition
│   ├── api/                      # API routers
│   │   ├── v1/                   # API version 1
│   │   │   └── chat.py           # Chat endpoints
│   │   └── visualizer/           # Visualizer endpoints
│   │       └── routes.py
│   ├── core/                     # Core configuration and engines
│   │   ├── config.py             # Settings management
│   │   ├── logging.py            # Logging configuration
│   │   └── engines/              # Core engine modules
│   │       ├── __init__.py
│   │       ├── vector_engine.py  # Vector database operations
│   │       ├── chat_engine.py    # RAG chat engine
│   │       └── document_processor.py # Document processing
│   ├── schemas/                  # Pydantic models
│   │   ├── chat.py               # Chat schemas
│   │   └── visualizer.py         # Visualizer schemas
│   ├── services/                 # Business logic layer
│   │   ├── vector_service.py     # Vector database operations
│   │   ├── chat_service.py       # Chat/RAG operations
│   │   └── document_service.py   # Document processing
│   └── scripts/                  # Utility scripts
│       ├── ingest_documents.py   # Document ingestion script
│       └── init_vectordb.py      # Vector database initialization
├── static/                       # Frontend assets (CSS, JavaScript)
│   ├── css/
│   └── js/
├── templates/                    # HTML templates
├── data/                         # Data storage
│   ├── knowledge-docs/           # Document storage
│   ├── vector_db/                # Vector database
│   └── costs/                    # Cost tracking database
├── chroma_db/                    # Vector database storage (auto-generated, gitignored)
├── main.py                       # Entry point (python main.py)
├── start_production.py           # Production entry point
├── Pipfile                       # Python dependencies
├── Pipfile.lock                  # Locked dependency versions
├── Dockerfile                    # Production Docker image
├── docker-compose.yml            # Docker Compose for deployment
├── .dockerignore                 # Docker build exclusions
└── README.md                     # Project documentation

๐Ÿ› ๏ธ Prerequisites

  • Python 3.12 (as specified in Pipfile)
  • pipenv for dependency management
  • Docker and Docker Compose for deployment (optional)
  • Git for version control

📦 Installation & Setup

1. Clone the Repository

git clone <your-repo-url>
cd rag-project-learning

2. Install pipenv (if not already installed)

pip install pipenv

3. Install Dependencies

pipenv install

This will install all required packages:

  • fastapi - Modern, fast web framework
  • uvicorn[standard] - ASGI server with production features
  • chromadb - Vector database for embeddings
  • sentence-transformers - Text embedding models
  • openai - OpenAI API integration
  • structlog - Structured logging
  • rich - Rich console output
  • pydantic - Data validation and settings management

4. Activate Virtual Environment

pipenv shell

🚀 Running the Application

Development Mode

# Run with auto-reload
python main.py

# Or run as a module
python -m app

# Or use uvicorn directly
uvicorn app.app:app --reload --host 0.0.0.0 --port 5252

Production Mode

# Use production startup script
python start_production.py

# Or run production module directly
python -m app.production

# Or use uvicorn with production settings
uvicorn app.app:app --host 0.0.0.0 --port 5252 --workers 1 --log-level info

Docker Deployment

# Build and run with Docker Compose
docker-compose up --build

# Or build and run manually
docker build -t legendarycorp-ai-assistant .
docker run -p 5252:5252 -e OPENAI_API_KEY=your_key legendarycorp-ai-assistant

🧹 Project Organization

The project follows a clean, modular FastAPI structure:

  • app/ - Main application package (Python code only)
    • app/app.py - FastAPI application definition
    • app/api/ - API endpoints and routers
    • app/core/ - Configuration, logging, and core engines
    • app/services/ - Business logic layer
    • app/schemas/ - Pydantic data models
    • app/scripts/ - Utility scripts for document processing
  • static/ - Frontend assets (CSS, JavaScript)
  • templates/ - HTML templates

Running Scripts

# Document ingestion
cd app/scripts
python ingest_documents.py

# Vector database initialization
python init_vectordb.py
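ingest_documents.py's exact logic isn't reproduced here, but the chunking step named in the Features section can be sketched as a simple overlapping splitter (function name and chunk sizes are illustrative, not the project's actual parameters):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character chunks.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "x" * 500
print(len(chunk_text(sample)))  # → 3 chunks of up to 200 chars, 50-char overlap
```

Each chunk would then be embedded and stored in ChromaDB so retrieval can return passage-sized context rather than whole documents.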

🔧 Configuration

Environment Variables

Create a .env file in the project root by copying the example file:

cp .env.example .env

Then edit the .env file and update the values according to your environment. The most important variables to set are:

  • OPENAI_API_KEY - Your OpenAI API key (required for chat functionality)
  • DEBUG - Set to false in production
  • LOG_LEVEL - Set to INFO or WARNING in production

Note: Never commit .env files to version control. See .env.example for the complete list of available environment variables.
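Settings are managed in app/core/config.py; its exact contents aren't shown here, but the pattern of reading these variables can be sketched with the standard library (field names mirror the .env variables above; the class itself is illustrative):

```python
import os
from dataclasses import dataclass, field

@dataclass
class Settings:
    """Minimal environment-driven settings reader (a sketch, not the
    project's actual config class)."""
    openai_api_key: str = field(
        default_factory=lambda: os.environ.get("OPENAI_API_KEY", ""))
    debug: bool = field(
        default_factory=lambda: os.environ.get("DEBUG", "false").lower() == "true")
    log_level: str = field(
        default_factory=lambda: os.environ.get("LOG_LEVEL", "INFO"))

os.environ["DEBUG"] = "false"  # simulate a production .env value
settings = Settings()
print(settings.debug, settings.log_level)
```

Reading every value through one settings object keeps environment handling in a single place, which is what makes the production/development switch below possible.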

Production Settings

The application automatically detects production vs development environments:

  • Development: Auto-reload, debug logging, single worker
  • Production: No reload, structured logging, multiple workers, health checks

๐ŸŒ Application URLs

Once running, access the application at:

  • Chat Interface: http://localhost:5252/ - Main AI chat interface
  • API Documentation: http://localhost:5252/redoc - Interactive API docs (ReDoc) ⚙️
  • Health Check: http://localhost:5252/health - Application health status
  • ChromaDB Visualizer: http://localhost:5252/visualizer - Database visualization dashboard

โš™๏ธ Configurable: ReDoc is enabled by default and can be disabled via ENABLE_REDOC environment variable

๐Ÿณ Docker Deployment

Quick Start

# Start with Docker Compose
docker-compose up --build

# View logs
docker-compose logs -f ai-assistant

# Stop services
docker-compose down

Production Deployment

# Build production image
docker build -t legendarycorp-ai-assistant:latest .

# Run with production settings
docker run -d \
  --name ai-assistant \
  -p 5252:5252 \
  -e LOG_LEVEL=INFO \
  -e CHROMA_DB_VISUALIZER=true \
  -e OPENAI_API_KEY=your_key \
  -v $(pwd)/chroma_db:/app/chroma_db \
  -v $(pwd)/data/knowledge-docs:/app/data/knowledge-docs \
  legendarycorp-ai-assistant:latest

Docker Features

  • Multi-stage build for optimized image size
  • Non-root user for security
  • Health checks for monitoring
  • Volume mounting for persistent data
  • Environment variable configuration
  • Production-ready uvicorn settings

📊 Monitoring & Logging

Structured Logging

The application uses structlog for production-grade logging:

import structlog

logger = structlog.get_logger()
logger.info("Application started", port=5252, environment="production")

Health Checks

# Check application health
curl http://localhost:5252/health

# Response
{
  "status": "healthy",
  "timestamp": "2024-01-01T12:00:00Z"
}
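A handler producing the response body above can be sketched as follows (the actual /health implementation may differ; the function name is illustrative):

```python
from datetime import datetime, timezone

def health_payload():
    """Build the health-check response body shown above, with a
    UTC timestamp in ISO-8601 'Z' form."""
    return {
        "status": "healthy",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

print(health_payload()["status"])  # → healthy
```

Docker's HEALTHCHECK and external monitors both poll this endpoint, so it should stay fast and dependency-free.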

Metrics

  • Request/response times
  • Error rates
  • Memory usage
  • ChromaDB statistics
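A minimal in-process way to track the request counts, error rates, and latencies listed above (a sketch only; the project's actual metrics collection is not shown in this README):

```python
import time
from collections import defaultdict

class Metrics:
    """Tiny per-endpoint metrics store: request counts, error counts,
    and cumulative latency."""

    def __init__(self):
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)
        self.latency = defaultdict(float)

    def record(self, endpoint, seconds, ok=True):
        self.requests[endpoint] += 1
        self.latency[endpoint] += seconds
        if not ok:
            self.errors[endpoint] += 1

    def avg_latency(self, endpoint):
        n = self.requests[endpoint]
        return self.latency[endpoint] / n if n else 0.0

metrics = Metrics()
metrics.record("/api/v1/chat", 0.1)
metrics.record("/api/v1/chat", 0.3)
print(metrics.avg_latency("/api/v1/chat"))
```

In production you would typically export such counters to a scraper (e.g. the Prometheus integration listed under Future Enhancements) instead of keeping them in process memory.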

🧪 Testing

Run Tests

# Install dev dependencies
pipenv install --dev

# Run all tests
pytest

# Run with coverage
pytest --cov=app --cov-report=html

API Testing

# Test chat endpoint
curl -X POST "http://localhost:5252/api/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the company policy on remote work?"}'

# Test streaming endpoint
curl -X POST "http://localhost:5252/api/v1/chat/stream" \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me about employee benefits"}'

🚀 Performance & Scaling

Async by Default

FastAPI provides excellent performance with async/await:

  • Concurrent requests handling
  • Non-blocking I/O operations
  • Efficient streaming responses
  • WebSocket support for real-time features
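The streaming responses mentioned above rest on async generators: the server yields tokens as they arrive instead of buffering the whole answer. A self-contained asyncio sketch of that pattern (no FastAPI required; names and delay are illustrative):

```python
import asyncio

async def stream_tokens(answer, delay=0.0):
    """Yield an answer token by token, as a streaming chat endpoint might."""
    for token in answer.split():
        await asyncio.sleep(delay)  # stand-in for per-token model latency
        yield token

async def collect(answer):
    """Gather the streamed tokens into a list (a client's view of the stream)."""
    return [tok async for tok in stream_tokens(answer)]

print(asyncio.run(collect("streaming keeps the event loop free")))
```

Because each `await` yields control back to the event loop, one worker can interleave many in-flight chat streams, which is where FastAPI's concurrency advantage comes from.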

Production Optimizations

  • Multiple workers with uvicorn
  • Connection pooling for databases
  • Caching strategies for embeddings
  • Load balancing ready

Scaling Options

# Scale with multiple workers
uvicorn app.app:app --host 0.0.0.0 --port 5252 --workers 4

# Use Gunicorn for more control
gunicorn app.app:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:5252

🔒 Security Features

  • CORS middleware configuration
  • Input validation with Pydantic
  • Rate limiting ready
  • Authentication ready (can be added)
  • HTTPS support

📈 Production Checklist

  • Set DEBUG=false
  • Configure LOG_LEVEL=INFO or higher
  • Set proper CORS_ORIGINS
  • Use production database
  • Docker: Use production Dockerfile
  • Monitoring: Enable health checks
  • Logging: Configure structured logging
  • Security: Review CORS and authentication

๐Ÿ› Troubleshooting

Common Issues

  1. Port conflicts: Change port in .env or Docker configuration
  2. Memory issues: Reduce worker count or increase container memory
  3. ChromaDB errors: Check volume permissions and database initialization
  4. Logging issues: Verify LOG_LEVEL environment variable

Debug Mode

# Enable debug logging
export LOG_LEVEL=DEBUG
python main.py

Docker Debugging

# View container logs
docker-compose logs -f ai-assistant

# Access container shell
docker-compose exec ai-assistant bash

# Check container health
docker inspect legendarycorp-ai-assistant

🔮 Future Enhancements

  • Authentication & Authorization
  • Rate Limiting
  • Redis Caching
  • Database Migrations
  • Kubernetes Deployment
  • Prometheus Metrics
  • Grafana Dashboards
  • CI/CD Pipeline

📄 License

This project is open source. Please check the license file for specific terms.

๐Ÿค Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📞 Support

For issues and questions:

  1. Check the troubleshooting section
  2. Review existing issues
  3. Create a new issue with detailed information

Built with ❤️ using FastAPI and open-source technologies
