This repository documents my journey of learning how to build a deployment-level project from scratch. It's an educational endeavor to understand the components of a Retrieval-Augmented Generation (RAG) system while honing essential software development skills.
This project is inspired by the incredible YouTube playlist by Abu Bakr Soliman:
Following this playlist has been a fantastic learning experience, covering not just the technical implementation of RAG, but also crucial software engineering practices like:
- Design Patterns: MVC (Model-View-Controller), Factory Provider
- Code Organization: Structuring a project for scalability and maintainability
- Clean Coding Practices: Writing readable and efficient code
Disclaimer: This project is unfinished and is intended for learning purposes only. It is not production-ready.
The project follows a modular architecture, separating concerns to make the codebase clean and scalable.
```
src/
├── main.py               # FastAPI application entry point
│
├── routes/               # API route definitions (the "View" in MVC)
│   ├── base.py
│   ├── data.py
│   └── nlp.py
│
├── controllers/          # Business logic and request handlers (the "Controller" in MVC)
│   ├── BaseController.py
│   ├── DataController.py
│   ├── NLPController.py
│   ├── ProcessController.py
│   └── ProjectController.py
│
├── models/               # Data models and database schemas (the "Model" in MVC)
│   ├── AssetModel.py
│   ├── ChunkModel.py
│   └── ProjectModel.py
│
├── stores/               # External service integrations (Factory Providers)
│   ├── llm/              # LLM provider implementations (OpenAI, Cohere)
│   └── vectordb/         # Vector database provider implementations (Qdrant)
│
└── helpers/              # Utility functions and configuration
    └── config.py
```
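The factory-provider idea behind `stores/` can be sketched roughly as follows. This is a minimal illustration of the pattern, not the repository's actual code: class names like `LLMProviderFactory`, `OpenAIProvider`, and the `embed_text` method are hypothetical stand-ins, and the stub methods return dummy vectors instead of calling the real APIs.

```python
from abc import ABC, abstractmethod
from typing import List


class LLMProvider(ABC):
    """Common interface every LLM backend must implement."""

    @abstractmethod
    def embed_text(self, text: str) -> List[float]:
        ...


class OpenAIProvider(LLMProvider):
    def embed_text(self, text: str) -> List[float]:
        # Real code would call the OpenAI embeddings API here.
        return [0.0] * 1536


class CohereProvider(LLMProvider):
    def embed_text(self, text: str) -> List[float]:
        # Real code would call the Cohere embeddings API here.
        return [0.0] * 1024


class LLMProviderFactory:
    """Creates the configured provider; callers never name a concrete class."""

    _providers = {"openai": OpenAIProvider, "cohere": CohereProvider}

    @classmethod
    def create(cls, name: str) -> LLMProvider:
        try:
            return cls._providers[name.lower()]()
        except KeyError:
            raise ValueError(f"Unknown LLM provider: {name}") from None


provider = LLMProviderFactory.create("openai")
print(len(provider.embed_text("hello")))  # prints 1536 (stub vector length)
```

The payoff is that swapping OpenAI for Cohere becomes a one-line config change rather than an edit to every call site.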
- Document upload and processing (text and PDF files)
- Text chunking with configurable size and overlap
- Integration with OpenAI and Cohere for embeddings using a factory provider pattern
- MongoDB for document and chunk storage; Qdrant for vector storage
- RESTful API built with FastAPI
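Chunking with a configurable size and overlap works roughly like this. It is a simplified character-based sketch, not the project's actual splitter (which may split by tokens or sentences); the function name `chunk_text` is illustrative.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 100, overlap_size: int = 20) -> List[str]:
    """Split text into fixed-size character windows where each window
    starts overlap_size characters before the previous one ends."""
    if overlap_size >= chunk_size:
        raise ValueError("overlap_size must be smaller than chunk_size")
    step = chunk_size - overlap_size  # how far each new chunk advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


chunks = chunk_text("a" * 250)
print([len(c) for c in chunks])  # [100, 100, 90, 10]
```

The overlap means a sentence cut at a chunk boundary still appears whole in the neighboring chunk, which improves retrieval recall at the cost of some duplicated storage.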
- Python 3.8 or later
- Docker and Docker Compose
- API keys for OpenAI and/or Cohere
- Clone the repository:

  ```bash
  git clone https://github.com/AbdoAlshoki2/mini-RAG-study.git
  cd mini-RAG-study
  ```
- Set up a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: .\venv\Scripts\activate
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Configure environment variables. Copy the example `.env` file:

  ```bash
  cp .env.example .env
  ```

  Then edit the `.env` file with your API keys and settings.
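The exact keys are defined in `.env.example`; the variable names below are illustrative guesses at what a setup like this typically needs, not the file's actual contents:

```ini
# Hypothetical .env fragment -- check .env.example for the real key names
OPENAI_API_KEY=sk-...
COHERE_API_KEY=...
MONGODB_URL=mongodb://localhost:27017
```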
- Start MongoDB with Docker:

  ```bash
  cd docker
  docker-compose up -d
  ```
- Run the FastAPI server:

  ```bash
  cd src
  uvicorn main:app --reload --host 0.0.0.0 --port 5000
  ```
- Access the API documentation:
- Swagger UI: http://localhost:5000/docs
- ReDoc: http://localhost:5000/redoc
- `GET /api/v1/`: Welcome endpoint that returns the application name and version
- `POST /api/v1/data/upload/{project_id}`: Upload a text or PDF document for processing
  - Parameters:
    - `project_id`: Project identifier
    - `file` (form-data): The file to upload
  - Returns: File ID for reference
- `POST /api/v1/data/process/{project_id}`: Process uploaded documents into chunks
  - Parameters:
    - `project_id`: Project identifier
    - `chunk_size` (query): Size of each text chunk (default: 100)
    - `overlap_size` (query): Overlap between chunks (default: 20)
    - `do_reset` (query): Whether to reset existing chunks (default: false)
    - `file_id` (query, optional): Specific file ID to process (processes all files if not provided)
- `POST /api/v1/nlp/index/push/{project_id}`: Index project documents into the vector database
  - Parameters:
    - `project_id`: Project identifier
    - `do_reset` (body): Whether to reset the existing index (default: false)
- `GET /api/v1/nlp/index/info/{project_id}`: Get information about the project's vector index
  - Parameters:
    - `project_id`: Project identifier
- `POST /api/v1/nlp/search/{project_id}`: Search the vector database for relevant chunks
  - Parameters:
    - `project_id`: Project identifier
    - `query` (body): Search query
    - `top_k` (body, optional): Number of results to return (default: 5)
- `POST /api/v1/nlp/answer/{project_id}`: Get an answer from the RAG system
  - Parameters:
    - `project_id`: Project identifier
    - `query` (body): Question to answer
    - `top_k` (body, optional): Number of context chunks to use (default: 5)