A powerful Retrieval-Augmented Generation (RAG) system built with FastAPI and open-source technologies, featuring an interactive chat interface that can answer questions based on your document knowledge base.
- FastAPI Backend: Modern, fast, async-first web framework with automatic API documentation
- Interactive Chat Interface: Web-based chat UI with streaming responses
- Vector Database: ChromaDB for efficient semantic search
- Document Processing: Automatic chunking and embedding of documents
- Semantic Search: Sentence transformers for intelligent document retrieval
- API Documentation: Interactive Swagger UI and ReDoc with configurable endpoints
- Production Ready: Structured logging, health checks, Docker deployment
- Open Source: Built entirely with open-source technologies to avoid vendor lock-in
```
rag-project-learning/
├── app/                          # Main FastAPI package
│   ├── __init__.py               # Package initialization
│   ├── app.py                    # FastAPI application definition
│   ├── api/                      # API routers
│   │   ├── v1/                   # API version 1
│   │   │   └── chat.py           # Chat endpoints
│   │   └── visualizer/           # Visualizer endpoints
│   │       └── routes.py
│   ├── core/                     # Core configuration and engines
│   │   ├── config.py             # Settings management
│   │   ├── logging.py            # Logging configuration
│   │   └── engines/              # Core engine modules
│   │       ├── __init__.py
│   │       ├── vector_engine.py      # Vector database operations
│   │       ├── chat_engine.py        # RAG chat engine
│   │       └── document_processor.py # Document processing
│   ├── schemas/                  # Pydantic models
│   │   ├── chat.py               # Chat schemas
│   │   └── visualizer.py         # Visualizer schemas
│   ├── services/                 # Business logic layer
│   │   ├── vector_service.py     # Vector database operations
│   │   ├── chat_service.py       # Chat/RAG operations
│   │   └── document_service.py   # Document processing
│   └── scripts/                  # Utility scripts
│       ├── ingest_documents.py   # Document ingestion script
│       └── init_vectordb.py      # Vector database initialization
├── static/                       # Frontend assets (CSS, JavaScript)
│   ├── css/
│   └── js/
├── templates/                    # HTML templates
├── data/                         # Data storage
│   ├── knowledge-docs/           # Document storage
│   ├── vector_db/                # Vector database
│   └── costs/                    # Cost tracking database
├── chroma_db/                    # Vector database storage (auto-generated, gitignored)
├── main.py                       # Entry point (python main.py)
├── start_production.py           # Production entry point
├── Pipfile                       # Python dependencies
├── Pipfile.lock                  # Locked dependency versions
├── Dockerfile                    # Production Docker image
├── docker-compose.yml            # Docker Compose for deployment
├── .dockerignore                 # Docker build exclusions
└── README.md                     # Project documentation
```
- Python 3.12 (as specified in Pipfile)
- pipenv for dependency management
- Docker and Docker Compose for deployment (optional)
- Git for version control
```bash
git clone <your-repo-url>
cd rag-project-learning
```

```bash
# On macOS/Linux
pip install pipenv

# On Windows
pip install pipenv
```

```bash
pipenv install
```

This will install all required packages:

- `fastapi` - Modern, fast web framework
- `uvicorn[standard]` - ASGI server with production features
- `chromadb` - Vector database for embeddings
- `sentence-transformers` - Text embedding models
- `openai` - OpenAI API integration
- `structlog` - Structured logging
- `rich` - Rich console output
- `pydantic` - Data validation and settings management
```bash
pipenv shell
```

```bash
# Run with auto-reload
python main.py

# Or run as a module
python -m app

# Or use uvicorn directly
uvicorn app.app:app --reload --host 0.0.0.0 --port 5252
```

```bash
# Use the production startup script
python start_production.py

# Or run the production module directly
python -m app.production

# Or use uvicorn with production settings
uvicorn app.app:app --host 0.0.0.0 --port 5252 --workers 1 --log-level info
```

```bash
# Build and run with Docker Compose
docker-compose up --build

# Or build and run manually
docker build -t legendarycorp-ai-assistant .
docker run -p 5252:5252 -e OPENAI_API_KEY=your_key legendarycorp-ai-assistant
```

The project follows a clean, modular FastAPI structure:

- `app/` - Main application package (Python code only)
- `app/app.py` - FastAPI application definition
- `app/api/` - API endpoints and routers
- `app/core/` - Configuration, logging, and core engines
- `app/services/` - Business logic layer
- `app/schemas/` - Pydantic data models
- `app/scripts/` - Utility scripts for document processing
- `static/` - Frontend assets (CSS, JavaScript)
- `templates/` - HTML templates
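As a rough sketch of how these pieces might fit together in `app/app.py` (the router attribute names and prefixes are assumptions inferred from the tree above, not the project's actual code):

```python
# Hypothetical sketch of app/app.py wiring; import paths follow the project tree.
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from app.api.v1 import chat
from app.api.visualizer import routes as visualizer

app = FastAPI(title="LegendaryCorp AI Assistant")

# Versioned API routers
app.include_router(chat.router, prefix="/api/v1")
app.include_router(visualizer.router, prefix="/visualizer")

# Frontend assets served alongside the API
app.mount("/static", StaticFiles(directory="static"), name="static")
```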
```bash
# Document ingestion
cd app/scripts
python ingest_documents.py

# Vector database initialization
python init_vectordb.py
```
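The scripts' internals aren't shown in this README; conceptually, ingestion chunks each document, embeds the chunks with sentence-transformers, and stores them in ChromaDB. A minimal sketch under those assumptions (the model name, chunking parameters, paths, and collection name are all illustrative):

```python
# Hypothetical sketch of document ingestion; names and parameters are illustrative.
from pathlib import Path

import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
client = chromadb.PersistentClient(path="data/vector_db")
collection = client.get_or_create_collection("knowledge_docs")

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks."""
    step = size - overlap
    return [text[i : i + size] for i in range(0, len(text), step)]

for doc_path in Path("data/knowledge-docs").glob("*.txt"):
    chunks = chunk(doc_path.read_text())
    embeddings = model.encode(chunks).tolist()
    collection.add(
        ids=[f"{doc_path.stem}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embeddings,
        metadatas=[{"source": doc_path.name}] * len(chunks),
    )
```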
Create a `.env` file in the project root by copying the example file:

```bash
cp .env.example .env
```

Then edit the `.env` file and update the values for your environment. The most important variables to set are:

- `OPENAI_API_KEY` - Your OpenAI API key (required for chat functionality)
- `DEBUG` - Set to `false` in production
- `LOG_LEVEL` - Set to `INFO` or `WARNING` in production

Note: Never commit `.env` files to version control. See `.env.example` for the complete list of available environment variables.
The application automatically detects production vs development environments:
- Development: Auto-reload, debug logging, single worker
- Production: No reload, structured logging, multiple workers, health checks
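This behavior is typically driven by the settings module (`app/core/config.py` in the tree above). A minimal sketch, assuming pydantic's `BaseSettings` and illustrative field names:

```python
# Hypothetical sketch of app/core/config.py; field names are illustrative.
# pydantic v1 style shown; in pydantic v2, BaseSettings moved to the
# separate pydantic-settings package.
from pydantic import BaseSettings

class Settings(BaseSettings):
    debug: bool = False
    log_level: str = "INFO"
    openai_api_key: str = ""
    enable_redoc: bool = True

    class Config:
        env_file = ".env"  # read values from .env and the process environment

settings = Settings()

# Environment-specific behavior can then key off the DEBUG flag:
reload_enabled = settings.debug            # auto-reload only in development
worker_count = 1 if settings.debug else 4  # multiple workers in production
```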
Once running, access the application at:

- Chat Interface: `http://localhost:5252/` - Main AI chat interface
- API Documentation: `http://localhost:5252/redoc` - Interactive API docs (ReDoc) ⚙️
- Health Check: `http://localhost:5252/health` - Application health status
- ChromaDB Visualizer: `http://localhost:5252/visualizer` - Database visualization dashboard

⚙️ Configurable: ReDoc is enabled by default and can be disabled via the `ENABLE_REDOC` environment variable.
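How the toggle is wired isn't shown here; one plausible approach (a sketch, assuming the `enable_redoc` setting from the configuration sketch above) is to pass `redoc_url=None` when ReDoc is disabled:

```python
# Hypothetical sketch: disabling ReDoc via a settings flag.
from fastapi import FastAPI

from app.core.config import settings  # assumed settings object

app = FastAPI(
    title="LegendaryCorp AI Assistant",
    redoc_url="/redoc" if settings.enable_redoc else None,  # None disables ReDoc
)
```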
```bash
# Start with Docker Compose
docker-compose up --build

# View logs
docker-compose logs -f ai-assistant

# Stop services
docker-compose down
```

```bash
# Build the production image
docker build -t legendarycorp-ai-assistant:latest .

# Run with production settings
docker run -d \
  --name ai-assistant \
  -p 5252:5252 \
  -e LOG_LEVEL=INFO \
  -e CHROMA_DB_VISUALIZER=true \
  -e OPENAI_API_KEY=your_key \
  -v $(pwd)/chroma_db:/app/chroma_db \
  -v $(pwd)/data/knowledge-docs:/app/data/knowledge-docs \
  legendarycorp-ai-assistant:latest
```

The production Dockerfile provides:

- Multi-stage build for optimized image size
- Non-root user for security
- Health checks for monitoring
- Volume mounting for persistent data
- Environment variable configuration
- Production-ready uvicorn settings
The application uses structlog for production-grade logging:

```python
import structlog

# Loggers are created per module; configuration lives in app/core/logging.py
logger = structlog.get_logger()
logger.info("Application started", port=5252, environment="production")
```
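The repository's `app/core/logging.py` isn't reproduced here; a minimal structlog setup for JSON output in production might look like this (a sketch, not the project's actual configuration):

```python
# Hypothetical sketch of a production structlog configuration.
import structlog

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),  # ISO-8601 timestamps
        structlog.processors.add_log_level,           # include the level in each event
        structlog.processors.JSONRenderer(),          # one JSON object per line
    ]
)
```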
```bash
# Check application health
curl http://localhost:5252/health

# Response
{
  "status": "healthy",
  "timestamp": "2024-01-01T12:00:00Z"
}
```

Useful metrics to monitor:

- Request/response times
- Error rates
- Memory usage
- ChromaDB statistics
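The `/health` endpoint itself is simple; a sketch of what it might look like in FastAPI, mirroring the response shape shown above:

```python
# Hypothetical sketch of the /health endpoint.
from datetime import datetime, timezone

from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health() -> dict:
    """Report liveness for container orchestrators and load balancers."""
    return {
        "status": "healthy",
        "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
```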
```bash
# Install dev dependencies
pipenv install --dev

# Run all tests
pytest

# Run with coverage
pytest --cov=core --cov-report=html
```
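Tests can exercise the API in-process with FastAPI's `TestClient`; a sketch (the import path and route prefix are assumptions based on the project structure):

```python
# Hypothetical sketch of an in-process API test with FastAPI's TestClient.
from fastapi.testclient import TestClient

from app.app import app  # import path follows the project structure above

client = TestClient(app)

def test_chat_endpoint_answers():
    response = client.post(
        "/api/v1/chat",
        json={"message": "What is the company policy on remote work?"},
    )
    # The exact response schema is project-specific; the status code is the minimal check.
    assert response.status_code == 200
```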
To test the running API manually:

```bash
# Test the chat endpoint
curl -X POST "http://localhost:5252/api/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the company policy on remote work?"}'

# Test the streaming endpoint
curl -X POST "http://localhost:5252/api/v1/chat/stream" \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me about employee benefits"}'
```

FastAPI provides excellent performance with async/await:
- Concurrent requests handling
- Non-blocking I/O operations
- Efficient streaming responses
- WebSocket support for real-time features
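As one example, a streaming chat endpoint can be built on FastAPI's `StreamingResponse`; the generator and route below are illustrative, not the project's actual code:

```python
# Hypothetical sketch of a streaming chat endpoint.
from collections.abc import AsyncIterator

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

async def generate_answer(message: str) -> AsyncIterator[str]:
    """Yield answer tokens as they are produced (stub for the RAG engine)."""
    for token in ["Employee", " benefits", " include", " ..."]:
        yield token

@app.post("/api/v1/chat/stream")
async def chat_stream(request: ChatRequest) -> StreamingResponse:
    return StreamingResponse(generate_answer(request.message), media_type="text/plain")
```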
- Multiple workers with uvicorn
- Connection pooling for databases
- Caching strategies for embeddings (see the sketch after this list)
- Load balancing ready
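Embedding caching can be as simple as memoizing the encode call for repeated queries; a sketch of one in-process approach (not necessarily the project's actual strategy, and the model name is illustrative):

```python
# Hypothetical sketch of an in-process embedding cache for repeated queries.
from functools import lru_cache

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

@lru_cache(maxsize=1024)
def embed_query(text: str) -> tuple[float, ...]:
    """Embed a query string, caching results for identical inputs."""
    return tuple(model.encode(text).tolist())  # tuples are hashable and cacheable
```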
```bash
# Scale with multiple workers
uvicorn app.app:app --host 0.0.0.0 --port 5252 --workers 4

# Use Gunicorn for more control
gunicorn app.app:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:5252
```

- CORS middleware configuration (sketched below)
- Input validation with Pydantic
- Rate limiting ready
- Authentication ready (can be added)
- HTTPS support
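CORS configuration in FastAPI uses the standard Starlette middleware; a sketch (in production the origins would come from the `CORS_ORIGINS` setting):

```python
# Hypothetical sketch of CORS middleware configuration.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://example.com"],  # set from CORS_ORIGINS in production
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)
```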
- Set `DEBUG=false`
- Configure `LOG_LEVEL=INFO` or higher
- Set proper `CORS_ORIGINS`
- Use a production database
- Docker: Use the production Dockerfile
- Monitoring: Enable health checks
- Logging: Configure structured logging
- Security: Review CORS and authentication
- Port conflicts: Change the port in `.env` or the Docker configuration
- Memory issues: Reduce the worker count or increase container memory
- ChromaDB errors: Check volume permissions and database initialization
- Logging issues: Verify the `LOG_LEVEL` environment variable
```bash
# Enable debug logging
export LOG_LEVEL=DEBUG
python main.py
```

```bash
# View container logs
docker-compose logs -f ai-assistant

# Access a container shell
docker-compose exec ai-assistant bash

# Check container health
docker inspect legendarycorp-ai-assistant
```

- Authentication & Authorization
- Rate Limiting
- Redis Caching
- Database Migrations
- Kubernetes Deployment
- Prometheus Metrics
- Grafana Dashboards
- CI/CD Pipeline
This project is open source. Please check the license file for specific terms.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
For issues and questions:
- Check the troubleshooting section
- Review existing issues
- Create a new issue with detailed information
Built with ❤️ using FastAPI and open-source technologies