This project combines a Retrieval-Augmented Generation (RAG) Conversational AI system with a robust Evaluation Framework. The RAG system integrates Cohere's LLM, MongoDB for storage, and LangChain for intelligent query processing, while the Evaluation Framework enables benchmarking with synthetic test sets, performance metrics, and detailed visualizations.
- Document Ingestion: Processes JSONL files, generates embeddings using Cohere, and stores them in MongoDB.
- Dynamic Query Routing: Automatically decides whether to use RAG or a pure chat model based on query relevance.
- Context Management: Maintains conversation history and handles token-aware context truncation.
- Error Handling: Implements retry mechanisms for MongoDB connections and logs errors comprehensively.
- Synthetic Testset Generation: Creates test queries with ground truths using GPT-based models.
- Performance Metrics: Evaluates precision, recall, faithfulness, relevancy, and response times using RAGAS.
- Visualization: Generates radar charts, histograms, confusion matrices, and execution time distributions.
- PDF Reporting: Produces detailed performance reports with improvement suggestions.
- Python 3.8 or higher
- MongoDB Atlas account
- Cohere API key
```bash
git clone <your_github_repo_url>
cd RAG-Conversational-AI
pip install -r requirements.txt
```
Create a .env file in the root directory of your project, following the setup in .env.example.
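For reference, a minimal .env might look like the following; these variable names are placeholders, so check .env.example for the exact keys:

```env
# Placeholder keys; the authoritative names live in .env.example
COHERE_API_KEY=your_cohere_api_key
MONGODB_URI=mongodb+srv://<user>:<password>@<cluster>.mongodb.net/
```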
```bash
# Ingest documents into MongoDB
python ingest_docs.py

# Start the conversational AI
python main.py
```
```bash
python synthetic_testset.py
```
Uses the synthetic_testset.py script to create test queries with ground truths.
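As a rough sketch of what such a script can do with RAGAS (the API below is from ragas 0.1.x and may differ in other versions; the sample document and distribution values are illustrative, not the project's actual configuration):

```python
# Hedged sketch of synthetic testset generation with RAGAS (0.1.x API).
from langchain_core.documents import Document
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

# RAGAS groups source documents by a "filename" metadata key.
documents = [Document(page_content="Paris is the capital of France.",
                      metadata={"filename": "geo.txt"})]

generator = TestsetGenerator.with_openai()  # GPT-based generator and critic
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas())
```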
```bash
python synthetic_eval_script.py
```
Evaluates the system's performance using metrics such as precision, recall, faithfulness, relevancy, cost, and response time.
```bash
python evaluation_report.py
```
Creates a detailed performance report with visualizations including radar charts, histograms, confusion matrices, and improvement suggestions.
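To give a feel for the PDF side, here is a minimal reportlab sketch; the file name, text, and placeholder score are illustrative, not the actual layout produced by evaluation_report.py:

```python
# Minimal reportlab sketch; the real report adds charts, tables, and suggestions.
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

c = canvas.Canvas("evaluation_report.pdf", pagesize=letter)
c.setFont("Helvetica-Bold", 16)
c.drawString(72, 720, "RAG Evaluation Report")
c.setFont("Helvetica", 11)
c.drawString(72, 690, "Median faithfulness: 0.90 (placeholder value)")
c.showPage()
c.save()
```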
- Ask Questions: Enter your query when prompted. The system will:
  - Reshape the question if needed.
  - Decide whether to use RAG (retrieval-based) or Chat (LLM-only) mode.
  - Generate a response based on the selected route.
- Run Evaluations: Select the evaluation option from the menu to test performance metrics.
- Exit: Choose the exit option when you're done.
- Reads JSONL documents.
- Generates embeddings using Cohere's API.
- Stores content and embeddings in MongoDB.
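A minimal sketch of this ingestion pipeline, assuming a docs.jsonl file with a text field per record (the field, model, database, and collection names here are illustrative, not necessarily what ingest_docs.py uses):

```python
import json
import os

import cohere
from pymongo import MongoClient

co = cohere.Client(os.environ["COHERE_API_KEY"])
collection = MongoClient(os.environ["MONGODB_URI"])["rag_db"]["documents"]

with open("docs.jsonl") as f:
    texts = [json.loads(line)["text"] for line in f]

# input_type="search_document" marks corpus text for Cohere's embed-v3 models;
# large corpora should be embedded in batches to respect API limits.
response = co.embed(texts=texts, model="embed-english-v3.0",
                    input_type="search_document")

collection.insert_many(
    [{"content": t, "embedding": e} for t, e in zip(texts, response.embeddings)]
)
```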
- Question Reshaping Decision: Determines if the query needs additional context from conversation history.
- Standalone Question Generation: Reformulates queries into standalone questions if necessary.
- Inner Router Decision: Uses cosine similarity to decide between RAG or Chat mode.
- Response Generation:
  - RAG Mode: Retrieves relevant documents and generates responses based on them.
  - Chat Mode: Generates responses directly using Cohere's LLM.
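A hedged sketch of the inner router: embed the standalone question, compare it against stored document embeddings, and fall back to pure chat when the best match is weak. The function names and the 0.3 threshold are illustrative assumptions:

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def route(query_embedding, doc_embeddings, threshold=0.3):
    """Return "rag" when a stored document is similar enough, else "chat"."""
    if not doc_embeddings:
        return "chat"
    best = max(cosine_similarity(query_embedding, d) for d in doc_embeddings)
    return "rag" if best >= threshold else "chat"
```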
The evaluation framework uses the following metrics:
- Context Precision: Measures how much of the retrieved context is relevant.
- Context Recall: Measures how much relevant information is retrieved.
- Faithfulness: Measures how well the answer aligns with the provided context.
- Context Relevancy: Measures how relevant the retrieved context is to the question.
- Answer Relevancy: Measures how relevant the generated answer is to the question.
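A sketch of how these metrics are typically computed with RAGAS (ragas 0.1.x API; the relevancy metrics available vary by version, and the single-row dataset here is only for illustration):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    context_precision,
    context_recall,
    faithfulness,
)

data = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["The capital of France is Paris."],
    "contexts": [["Paris is the capital and largest city of France."]],
    "ground_truth": ["Paris"],
})

# evaluate() calls an LLM judge under the hood (OpenAI by default), so the
# corresponding API key must be available in the environment.
result = evaluate(data, metrics=[context_precision, context_recall,
                                 faithfulness, answer_relevancy])
print(result)
```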
The evaluation framework provides:
- Radar charts for median scores across key metrics (see the sketch after this list).
- Histograms for metric distributions (e.g., precision, recall).
- Confusion matrices showing retrieval and answer accuracy rates.
- Execution time distribution histograms for performance analysis.
- PDF reports summarizing all results with actionable insights.
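As an illustration of the radar chart, median scores can be plotted with matplotlib roughly like this (the scores are placeholders, not real results):

```python
import numpy as np
import matplotlib.pyplot as plt

metrics = ["precision", "recall", "faithfulness", "ctx. relevancy", "ans. relevancy"]
scores = [0.82, 0.75, 0.90, 0.70, 0.85]  # placeholder medians

angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
ax = plt.subplot(polar=True)
ax.plot(angles + angles[:1], scores + scores[:1])   # close the polygon
ax.fill(angles + angles[:1], scores + scores[:1], alpha=0.25)
ax.set_xticks(angles)
ax.set_xticklabels(metrics)
ax.set_ylim(0, 1)
plt.savefig("radar.png", bbox_inches="tight")
```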
- Stores both documents and chat logs in separate collections.
- Implements retry mechanisms with exponential backoff for connection stability.
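A sketch of the retry idea, assuming pymongo; the attempt count, base delay, and timeout are illustrative:

```python
import time

from pymongo import MongoClient
from pymongo.errors import PyMongoError

def connect_with_retry(uri, attempts=5, base_delay=1.0):
    for attempt in range(attempts):
        try:
            client = MongoClient(uri, serverSelectionTimeoutMS=5000)
            client.admin.command("ping")  # force a round trip to verify the link
            return client
        except PyMongoError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...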
- Token-aware context truncation ensures queries stay within token limits (see the sketch after this list).
- Conversation history is maintained for up to five previous interactions.
- Comprehensive logging system tracks all operations.
- Graceful degradation ensures smooth operation even during failures.
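A minimal sketch of token-aware truncation, keeping the most recent turns that fit a budget; the tokenizer choice and budget are assumptions, not necessarily what main.py uses:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def truncate_history(turns, max_tokens=1500):
    """Keep the newest conversation turns whose total token count fits the budget."""
    kept, total = [], 0
    for turn in reversed(turns):  # walk newest-first
        n = len(tokenizer.encode(turn))
        if total + n > max_tokens:
            break
        kept.append(turn)
        total += n
    return list(reversed(kept))  # restore chronological order
```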
The following Python libraries are required:
- cohere: For embeddings and language model interaction.
- pymongo: For MongoDB integration.
- transformers: For tokenization and context management.
- ragas: For evaluation metrics and testset generation.
- reportlab: For generating PDF reports.
- matplotlib and pandas: For data visualization and analysis.
Install them via pip install -r requirements.txt.
- User enters a query: "What is the capital of France?"
- The system checks if reshaping is needed (e.g., based on prior context).
- If relevant documents are found in MongoDB, it uses RAG mode to generate an answer like "The capital of France is Paris."
- If no relevant documents are found, it switches to Chat mode and generates an answer using Cohere's LLM.
This project is licensed under the MIT License - see the LICENSE file for details.