Article Retrieval System (ARS)

Introduction

The Article Retrieval System (ARS) uses NLP techniques and similarity search to find and recommend articles based on textual queries. By combining text embeddings, PCA for dimensionality reduction, and FAISS for fast nearest-neighbor search, ARS provides an efficient and scalable way to navigate large datasets of articles.

Features

Text Embedding: Converts article titles and texts into dense vector representations.
Dimensionality Reduction: Uses PCA to reduce the dimensionality of the embeddings, enhancing search efficiency.
Efficient Search: Utilizes FAISS, an efficient library for similarity search, to quickly find the most relevant articles.
Customizable Querying: Allows querying with an adjustable number of results (the k nearest articles).

Prerequisites

Python 3.9+
Libraries: transformers, safetensors, langchain, chromadb, faiss-cpu, nltk, bitsandbytes, pandas, scikit-learn (imported as sklearn), tiktoken, sentence-transformers, torch, accelerate

Warning: After installing the accelerate library, you may need to restart your kernel to ensure that all changes are properly applied. This is necessary to load the newly installed packages into the working environment.
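
After installing (and restarting the kernel if needed), a quick check can confirm that the libraries are visible to the interpreter. The following is a minimal sketch rather than part of the notebook; the helper name find_missing is hypothetical, and the list uses import names, which differ from the install names for faiss-cpu, scikit-learn, and sentence-transformers.

```python
from importlib.util import find_spec

# Import names for the libraries listed above. Some differ from the
# install names: faiss-cpu -> faiss, scikit-learn -> sklearn,
# sentence-transformers -> sentence_transformers.
REQUIRED_MODULES = [
    "transformers", "safetensors", "langchain", "chromadb", "faiss",
    "nltk", "bitsandbytes", "pandas", "sklearn", "tiktoken",
    "sentence_transformers", "torch", "accelerate",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment.

    find_spec only locates a module; it does not import it, so the check
    is fast even for heavy packages such as torch.
    """
    return [name for name in modules if find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```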

Usage

Prepare your dataset:

Ensure your dataset is a CSV file with at least the columns Title and Text. At the start of the code, set the path to the CSV file and the offload path.
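
A minimal sketch of loading and validating such a file with pandas is shown below. The function name load_articles and the inline sample data are assumptions made for illustration; in the notebook you would pass your own CSV path instead of the in-memory sample. The offload path is not covered here.

```python
import io

import pandas as pd

# Column names required by the README.
REQUIRED_COLUMNS = ["Title", "Text"]


def load_articles(csv_source):
    """Read a CSV (path or file-like object) and check the required columns."""
    df = pd.read_csv(csv_source)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")
    # Drop rows where the title or text is empty so every row can be embedded.
    return df.dropna(subset=REQUIRED_COLUMNS).reset_index(drop=True)


# In-memory sample standing in for a real file (hypothetical data).
sample_csv = io.StringIO(
    "Title,Text\n"
    "Intro to FAISS,FAISS is a library for similarity search.\n"
    "PCA basics,PCA reduces the dimensionality of data.\n"
)
articles = load_articles(sample_csv)
print(articles)
```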

Set up the environment:

Load the necessary models and preprocess your data.

Run the embedding and PCA setup:

Execute the initial scripts to transform your article data into a searchable index.

Query articles:

Use the search_articles function to find articles similar to a given query.

RAG

Use the generate_responce function to get a response from the RAG (retrieval-augmented generation) implementation.
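
The general pattern behind a RAG call is to retrieve relevant articles, place them in a prompt, and pass that prompt to a language model. The sketch below illustrates that flow only; the signature of generate_responce, the helper build_rag_prompt, the prompt wording, and both stub functions are assumptions, and the notebook's actual implementation may differ.

```python
def build_rag_prompt(query, articles, max_chars=500):
    """Format retrieved (title, text) pairs and the question into one prompt."""
    context_blocks = []
    for number, (title, text) in enumerate(articles, start=1):
        # Truncate long articles so the prompt stays within a bounded size.
        context_blocks.append(f"[{number}] {title}\n{text[:max_chars]}")
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the articles below.\n\n"
        f"{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def generate_responce(query, retrieve, generate, k=3):
    """Hypothetical signature: retrieve k articles, build a prompt, call the model."""
    articles = retrieve(query, k)
    prompt = build_rag_prompt(query, articles)
    return generate(prompt)


# Stubs standing in for the real retriever and language model.
corpus = [
    ("Intro to FAISS", "FAISS is a library for similarity search."),
    ("PCA basics", "PCA reduces the dimensionality of data."),
]


def stub_retrieve(query, k):
    return corpus[:k]


def stub_generate(prompt):
    return f"stub model received a prompt of {len(prompt)} characters"


answer = generate_responce("What is FAISS?", stub_retrieve, stub_generate, k=2)
print(answer)
```

Passing the retriever and the model in as callables keeps the prompt-building logic testable without loading a model.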
