ishanb18/RAG-PDF-QNA

📘 RAG-PDF-QnA: Retrieval-Augmented Generation for Intelligent PDF Question Answering

This project implements a Retrieval-Augmented Generation (RAG) pipeline that lets users upload PDF documents and ask context-aware questions about them interactively.
The system retrieves the relevant sections of the document using vector embeddings, then passes them to a Large Language Model (LLM) to generate an answer grounded in the retrieved text.


🚀 Features

  • 📄 Upload any PDF and ask natural-language questions.
  • 🧠 Combines retrieval-based search with generative AI for precise answers.
  • 🔍 Uses a FAISS vector index for semantic search.
  • 🧩 Built with LangChain, OpenAI embeddings, and FastAPI for modular, real-time interaction.
  • 💬 Supports both CLI and web API usage.

๐Ÿ—๏ธ Tech Stack & Architecture

| Layer | Technology | Purpose |
|---|---|---|
| Frontend / Interface | Streamlit or FastAPI UI | File upload, question input |
| Backend Framework | FastAPI | Manages routes, queries, and response handling |
| Document Processing | PyPDF2 / LangChain | Extracts and chunks text from PDFs |
| Vector Storage | FAISS | Stores document embeddings for semantic retrieval |
| Embeddings Model | OpenAI text-embedding-ada-002 | Converts document chunks into dense vectors |
| LLM for Generation | GPT-3.5 / GPT-4 (via OpenAI API) | Generates context-based answers |
| Pipeline Control | LangChain | Orchestrates retrieval + generation workflow |
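
The chunking step in the Document Processing layer can be illustrated with a minimal, dependency-free sketch. This is not the project's actual code: the function name `chunk_text`, the character-based windows, and the default `chunk_size` and `overlap` values are assumptions made for illustration. In the real pipeline, a LangChain text splitter would most likely play this role.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the overlap keeps a sentence that straddles a
    boundary visible in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Larger chunks give the LLM more context per retrieved passage but make the embeddings less specific, so the sizes above are placeholders to tune, not recommendations.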

🔄 Workflow Overview

📁 PDF Upload
   ↓
📄 Text Extraction → Chunking → Embedding Generation
   ↓
🧮 Embeddings stored in FAISS Vector DB
   ↓
❓ User Question
   ↓
🔍 Retrieve top relevant chunks (semantic search)
   ↓
💬 Combine retrieved context + user query
   ↓
🤖 Generate final answer using OpenAI LLM
   ↓
📤 Return answer to user
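
The retrieval and prompt-assembly steps of the workflow can be sketched in plain Python. This is a simplified stand-in rather than the project's implementation: a toy word-count vector replaces the OpenAI embeddings, a brute-force cosine ranking replaces the FAISS index, and the final LLM call is omitted. The names `embed`, `cosine`, `retrieve`, and `build_prompt`, as well as the prompt wording, are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a dense embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute force, no index)."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved context and the user query into a single LLM prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    chunks = [
        "FAISS stores document embeddings.",
        "The cat sat on the mat.",
        "Dense vectors capture meaning.",
    ]
    question = "Where are document embeddings stored?"
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

The string returned by `build_prompt` is what would be sent to the LLM in the generation step. Swapping `embed` for a real embedding model and `retrieve` for a vector-index lookup leaves the overall flow unchanged.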
