QueryPDF is a local-first, terminal-based RAG tool that lets you interact with PDF documents in natural language, entirely offline. It extracts text from the pages you specify, chunks it intelligently, embeds the content, and stores it in a local vector database. Then, using an LLM served by Ollama, it provides accurate, grounded responses to your queries, all without an internet connection.
- Privacy: All processing happens on your machine; no data leaves your system.
- Local Caching: Caches both the tokenizer and the embedding model locally for offline use and faster future runs.
- Interactive Page Selection: Specify which pages to analyze at runtime.
- Graceful Exit Handling: Automatically cleans up ChromaDB collections on exit (see the sketch after this list).
- Visual Progress Feedback: Shows embedding-generation progress.
- Context-Aware Responses: Uses semantic search to find the chunks most relevant to your questions.
- Fully Customizable: Configure the embedding model, Ollama LLM, and chunking parameters via a simple JSON file.
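As a minimal sketch of how the exit cleanup can be wired up (the collection name `pdf_chunks` is illustrative, not necessarily the one app.py uses):

```python
import atexit

import chromadb

# Hypothetical collection name; QueryPDF's actual name may differ.
client = chromadb.Client()
collection = client.create_collection(name="pdf_chunks")

def cleanup() -> None:
    # Drop the collection on exit so stale embeddings don't linger
    # between runs.
    client.delete_collection(name="pdf_chunks")

atexit.register(cleanup)
```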
| Technology | Purpose |
|---|---|
| pypdf | Extract text from PDF files |
| transformers | Tokenization using pretrained models |
| langchain | Text chunking with recursive character splitter |
| sentence-transformers | Generate dense vector embeddings |
| ChromaDB | Store and query embeddings locally |
| Ollama | Run open-source LLMs locally |
- Python 3.10+
- Ollama
- Install Ollama and run:

  ```bash
  ollama run gemma3:1b
  ```
Create a virtual environment:

```bash
python -m venv <environment-name>
```

Activate the virtual environment (Windows):

```bash
<environment-name>\Scripts\activate
```

or (Linux/macOS):

```bash
source <environment-name>/bin/activate
```

Install the dependencies:

```bash
pip install pypdf transformers sentence-transformers langchain chromadb ollama colorama yaspin
```

Or using requirements.txt:

```bash
pip install -r requirements.txt
```

```
project/
├── config.json          # Configuration file
├── app.py               # Main script
├── sample.pdf           # Example PDF for testing
├── requirements.txt     # Package dependencies
└── README.md            # Documentation
```
The application uses a config.json file with the following parameters:
```json
{
  "embedding_model": "intfloat/e5-small-v2",
  "ollama_model": "gemma3:1b",
  "chunk_size": 450,
  "chunk_overlap": 100
}
```

You can modify these settings to adjust:
- The embedding model used for semantic search (you can find embedding models on Hugging Face).
- The Ollama LLM used for response generation, chosen to suit your requirements and resources (you can find models on the Ollama GitHub page).
- Make sure to install the chosen LLM locally by running `ollama run <model-name>` in the terminal first.
- The chunk size and overlap used for text splitting, as appropriate for the model.
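As a rough illustration, a script like app.py could read these settings as follows; this is a sketch with assumed defaults, not the tool's exact loading code:

```python
import json

# Assumed defaults mirroring the documented config.json values.
DEFAULTS = {
    "embedding_model": "intfloat/e5-small-v2",
    "ollama_model": "gemma3:1b",
    "chunk_size": 450,
    "chunk_overlap": 100,
}

# Merge the user's config.json over the defaults so missing keys
# fall back gracefully.
with open("config.json", encoding="utf-8") as f:
    config = {**DEFAULTS, **json.load(f)}

print(config["embedding_model"], config["chunk_size"])
```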
- Run the script:

  ```bash
  python app.py
  ```

- When prompted, paste the full path to your PDF file:

  ```
  Paste the PDF path: /path/to/your/document.pdf
  ```

- Enter the page range you want to analyze (see the sketch after these steps):

  ```
  Enter the page range (e.g., 15-25): 10-20
  ```

- The application will process the PDF, extracting text, generating embeddings, and storing them in ChromaDB.

- Enter your questions when prompted:

  ```
  Chat with PDF: What is the main topic discussed in the document?
  ```

- To exit the application, type:

  ```
  Chat with PDF: /exit
  ```
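The path and page-range prompts above map onto pypdf roughly like this; the parsing details are an assumption for illustration, not app.py's exact code:

```python
from pypdf import PdfReader

# Illustrative prompt handling, mirroring the steps above.
path = input("Paste the PDF path: ").strip()
start, end = (int(n) for n in input("Enter the page range (e.g., 15-25): ").split("-"))

reader = PdfReader(path)
# pypdf pages are 0-indexed; the prompt uses 1-indexed page numbers.
text = "\n".join(
    reader.pages[i].extract_text() or "" for i in range(start - 1, end)
)
print(f"Extracted {len(text)} characters from pages {start}-{end}.")
```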
Under the hood, QueryPDF runs the following pipeline (a condensed code sketch follows the list):

- PDF Extraction: Extracts text from specific pages based on user input (you specify the page range at runtime).
- Tokenization & Chunking: Uses the specified tokenizer to split the text into chunks of configurable size (default: 450 tokens) with customizable overlap (default: 100 tokens).
- Embeddings: Creates embeddings for each chunk using SentenceTransformers (default: intfloat/e5-small-v2).
- Vector Storage: Stores the embeddings and original text chunks in a local ChromaDB collection.
- Chat Interface: Accepts user queries, retrieves the top 3 most relevant chunks, and feeds them into an Ollama-served LLM (default: gemma3:1b) for response generation.
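To make the flow concrete, here is a condensed, hypothetical sketch of the chunk → embed → store → retrieve → generate pipeline using the libraries from the tech stack. The collection name and prompt format are illustrative assumptions, not app.py's exact code:

```python
import chromadb
import ollama
# Depending on your langchain version, this class may live in
# the langchain_text_splitters package instead.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer

text = "...extracted PDF text..."  # from the extraction step above

# 1. Token-aware chunking (defaults from config.json).
tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-small-v2")
splitter = RecursiveCharacterTextSplitter.from_huggingface_tokenizer(
    tokenizer, chunk_size=450, chunk_overlap=100
)
chunks = splitter.split_text(text)

# 2. Dense embeddings for every chunk.
embedder = SentenceTransformer("intfloat/e5-small-v2")
embeddings = embedder.encode(chunks, show_progress_bar=True)

# 3. Store chunks and embeddings in a local ChromaDB collection.
client = chromadb.Client()
collection = client.create_collection(name="pdf_chunks")
collection.add(
    ids=[str(i) for i in range(len(chunks))],
    embeddings=embeddings.tolist(),
    documents=chunks,
)

# 4. Retrieve the top 3 chunks for a query and ask the LLM.
query = "What is the main topic discussed in the document?"
results = collection.query(
    query_embeddings=embedder.encode([query]).tolist(), n_results=3
)
context = "\n\n".join(results["documents"][0])
reply = ollama.chat(
    model="gemma3:1b",
    messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}],
)
print(reply["message"]["content"])
```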
To ensure full offline capability and improve performance, QueryPDF automatically saves the required models locally (a code sketch of this pattern follows the list):

- 🧠 Tokenizer: On first run, the Hugging Face tokenizer (transformers.AutoTokenizer) is downloaded and saved to a local_tokenizer/ directory. Subsequent runs use this local tokenizer, even without an internet connection.
- 🔤 Embedding Model: The embedding model (e.g., intfloat/e5-small-v2) is saved locally under local_e5_small_v2/ on the first run. If the model is already cached, it is reused without redownloading.
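As a rough illustration of this download-once, reuse-later pattern (the directory names match those above, but the logic is a sketch, not app.py's exact implementation):

```python
import os

from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer

TOKENIZER_DIR = "local_tokenizer"
EMBEDDER_DIR = "local_e5_small_v2"
MODEL_NAME = "intfloat/e5-small-v2"

# Tokenizer: load from the local directory if present; otherwise
# download it from Hugging Face once and save it for offline runs.
if os.path.isdir(TOKENIZER_DIR):
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_DIR)
else:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.save_pretrained(TOKENIZER_DIR)

# Embedding model: same download-once pattern.
if os.path.isdir(EMBEDDER_DIR):
    embedder = SentenceTransformer(EMBEDDER_DIR)
else:
    embedder = SentenceTransformer(MODEL_NAME)
    embedder.save(EMBEDDER_DIR)
```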