Download ollama from: https://ollama.com/
Run the downloaded app or type:
ollama serve
Download your model, but keep in mind that the app was tested with llama3.2:1b:
ollama pull llama3.2:1b
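If you want to confirm that Ollama is serving the model before starting the app, a quick check against Ollama's local HTTP API (default port 11434) might look like the sketch below; the prompt text is just an example.

```python
import requests

# One-off prompt to the local Ollama server to confirm llama3.2:1b responds.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2:1b", "prompt": "Say hello.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```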
Install dependencies:
pip install -r requirements.txt
Put your PDF documents inside the documents folder.
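The app handles ingestion itself; the following is only a rough illustration of what reading and chunking the PDFs in the documents folder involves, assuming pypdf and a fixed-size chunking strategy (the app may use a different library and approach).

```python
from pathlib import Path
from pypdf import PdfReader

def load_and_chunk(folder: str = "documents", chunk_size: int = 500) -> list[str]:
    """Read every PDF in the folder and split its text into fixed-size chunks."""
    chunks = []
    for pdf_path in Path(folder).glob("*.pdf"):
        # extract_text() can return None for image-only pages, hence the `or ""`.
        text = " ".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
        chunks += [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return chunks
```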
Activate the virtual environment using the command:
./mini-rag/Scripts/activate
Run the app:
python main.py
Go to 127.0.0.1:8000 and, if needed, wait for the system to load. Refresh from time to time or watch the console to see when loading has completed.
Now you should be able to ask your questions freely.
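If you prefer to query the app from a script rather than the browser, a request against the local server could look like this; the /ask route and the question field are assumptions, so adjust them to whatever endpoint main.py actually exposes.

```python
import requests

# Hypothetical endpoint and payload; check main.py for the real route and field names.
resp = requests.post(
    "http://127.0.0.1:8000/ask",
    json={"question": "What does the first document say about pricing?"},
    timeout=300,
)
print(resp.json())
```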
What restrictions does this app have?
- It uses the CPU for calculations, which makes processing slow,
- Embeddings and retrieval are sensitive to the exact wording of queries and sentences,
- Answers are non-deterministic,
- The number of embeddings is limited by available RAM (see the estimate after this list).
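As a rough back-of-the-envelope for the RAM limit mentioned above, assuming float32 vectors (the actual embedding dimension depends on the model used):

```python
# Rough memory estimate for an in-memory embedding store (float32 = 4 bytes per value).
num_chunks = 100_000
embedding_dim = 768  # assumed dimension; depends on the embedding model
bytes_needed = num_chunks * embedding_dim * 4
print(f"{bytes_needed / 1024**2:.0f} MiB")  # ~293 MiB for 100k chunks at dim 768
```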
What would you correct in the app if you had more time?
- Improve the embedding calculations,
- Move heavy tasks to the GPU,
- Add a vector database (see the sketch after this list),
- Make the app more responsive.
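For the vector-database item above, one possible direction (sketched with FAISS as an assumption; Chroma or Qdrant would work similarly) is to move similarity search into an index instead of comparing the query against every embedding in memory:

```python
import numpy as np
import faiss  # assumption: faiss-cpu installed via pip

dim = 768                       # must match the embedding model's output size
index = faiss.IndexFlatL2(dim)  # exact L2 search; swap for an IVF/HNSW index at scale

chunk_embeddings = np.random.rand(1000, dim).astype("float32")  # placeholder vectors
index.add(chunk_embeddings)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)  # ids of the 5 closest chunks
```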
How would you prepare this system for production in terms of scaling and monitoring?
- Use better hardware,
- Find and use better approaches for chunking and retrieval,
- Find and use better embedding models,
- Find and use better LLM models,
- Find and switch to better prompts,
- Allow the LLM to use external knowledge or scrape data if needed,
- Use a vector database,
- Add chat history,
- Return more detailed sources,
- Add more tests.
To run the tests:
pytest -v test_rag.py
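The existing tests live in test_rag.py; if you add more, a new case might look like this minimal sketch. The retrieve helper and its signature are assumptions about the app's internals, so adapt the import and call to what main.py actually provides.

```python
# test_rag_extra.py -- hypothetical example of an additional test case
from main import retrieve  # assumption: main.py exposes a retrieve(query, k) helper

def test_retrieve_returns_at_most_k_chunks():
    results = retrieve("example question", k=3)
    assert len(results) <= 3
```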