Learn from everywhere - an intelligent AI assistant that helps you learn from documents, web pages, and videos with complete citation grounding.

Built with FastAPI, Streamlit, smolagents, Qdrant, and Gemini 2.0 Flash.

Tagline: "Learn from everywhere"
- Multi-Source Support: Upload PDFs, DOCX, PPT, web URLs, and YouTube videos
- Agentic AI: Powered by smolagents with specialized tools for each content type
- Complete Grounding: Every answer includes page numbers/timestamps with source citations
- Multi-Modal: Understands text, images, audio, and video using Gemini 2.0 Flash
- Vector Search: Lightning-fast semantic search with Qdrant
- Interactive UI: Clean Streamlit interface with source selection and chat
- Production Ready: Docker-based deployment with proper separation of concerns
- Python 3.11+
- Docker & Docker Compose
- Google AI API Key (Get from https://makersuite.google.com/app/apikey)
Download all files and organize them according to the structure in GroundRAG-Setup-Guide.pdf
```
# Backend
cd backend
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY

# Frontend
cd ../frontend
cp .env.example .env
```

```
docker-compose up --build
```

Wait for all services to start, then access:
- Frontend UI: http://localhost:8501
- Backend API: http://localhost:8000/docs
- Qdrant: http://localhost:6333/dashboard
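The repository's docker-compose.yml is not reproduced in this README. A minimal sketch of how the three services might be wired together (ports come from the list above and build paths from the project structure; the service names and remaining settings are assumptions):

```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"

  backend:
    build: ./backend            # FastAPI app
    ports:
      - "8000:8000"
    env_file: ./backend/.env    # GOOGLE_API_KEY etc.
    depends_on:
      - qdrant

  frontend:
    build: ./frontend           # Streamlit UI
    ports:
      - "8501:8501"
    env_file: ./frontend/.env   # BACKEND_API_URL
    depends_on:
      - backend
```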
Terminal 1 - Qdrant:

```
docker run -p 6333:6333 qdrant/qdrant:latest
```

Terminal 2 - Backend:

```
cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.main:app --reload
```

Terminal 3 - Frontend:

```
cd frontend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py
```

- Documents: Upload PDF, DOCX, or PPT files
- Web Pages: Enter any URL
- Videos: Paste a YouTube URL
- Select sources using checkboxes
- Type your question
- Get answers with complete citations
Answers include inline citations like:
"Linear algebra is the study of vectors [Source: Math_Textbook.pdf, Page: 5]"
Click/hover on citations to see source preview.
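The inline citation format shown above is regular enough to parse mechanically, e.g. for rendering hover previews in the UI. A minimal sketch (the `[Source: ..., Page: ...]` pattern is taken from the example above; the function name is hypothetical):

```python
import re

# Matches inline citations like "[Source: Math_Textbook.pdf, Page: 5]"
CITATION_RE = re.compile(r"\[Source:\s*(?P<source>[^,\]]+),\s*Page:\s*(?P<page>\d+)\]")

def extract_citations(answer: str) -> list[dict]:
    """Return every (source, page) citation found in an answer string."""
    return [
        {"source": m.group("source").strip(), "page": int(m.group("page"))}
        for m in CITATION_RE.finditer(answer)
    ]

answer = "Linear algebra is the study of vectors [Source: Math_Textbook.pdf, Page: 5]"
print(extract_citations(answer))  # [{'source': 'Math_Textbook.pdf', 'page': 5}]
```

Timestamp citations for video sources would need a second pattern; this sketch covers only the page-number form shown above.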
```
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Streamlit  │────▶│   FastAPI    │────▶│   Qdrant    │
│  Frontend   │     │   Backend    │     │  Vector DB  │
└─────────────┘     └──────────────┘     └─────────────┘
                           │
                           ▼
                    ┌──────────────┐
                    │  smolagents  │
                    │ + Gemini AI  │
                    └──────────────┘
```
Components:
- Frontend: Streamlit (Python web UI)
- Backend: FastAPI (REST APIs)
- Agent: smolagents (Agentic orchestration)
- Vector DB: Qdrant (Semantic search)
- LLM: Gemini 2.0 Flash (Multi-modal AI)
```
omnilearn/
├── backend/                 # FastAPI backend
│   ├── app/
│   │   ├── main.py          # API entrypoint
│   │   ├── agents/          # smolagents setup
│   │   ├── api/             # REST endpoints
│   │   ├── services/        # Business logic
│   │   ├── models/          # Pydantic models
│   │   └── db/              # Qdrant client
│   └── requirements.txt
│
├── frontend/                # Streamlit frontend
│   ├── app.py               # Main UI
│   ├── components/          # UI components
│   └── services/            # API client
│
└── docker-compose.yml       # Multi-container setup
```
Backend (backend/.env):

```
GOOGLE_API_KEY=your_key_here
QDRANT_HOST=localhost
QDRANT_PORT=6333
LLM_MODEL=gemini/gemini-2.0-flash
CHUNK_SIZE=1500
CHUNK_OVERLAP=200
```

Frontend (frontend/.env):

```
BACKEND_API_URL=http://localhost:8000
```

Health check:

```
curl http://localhost:8000/health
```

Upload a file:

```
curl -X POST "http://localhost:8000/api/v1/upload/file" \
  -F "file=@test.pdf"
```

Ask a question:

```
curl -X POST "http://localhost:8000/api/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is linear algebra?", "source_ids": []}'
```

- Setup Guide: OmniLearnAI-Setup-Guide.pdf (complete installation instructions)
- Implementation Guide: OmniLearnAI-Implementation-Guide.pdf (full code documentation)
- API Docs: http://localhost:8000/docs (interactive Swagger UI)
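The CHUNK_SIZE and CHUNK_OVERLAP values in backend/.env control how documents are split before embedding. A minimal sketch of fixed-size chunking with overlap (the function name is hypothetical; the real service may split on sentence or page boundaries instead of raw character offsets):

```python
def chunk_text(text: str, chunk_size: int = 1500, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap  # 1300 with the defaults above
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

doc = "x" * 3200
chunks = chunk_text(doc)  # 3 chunks: [0:1500], [1300:2800], [2600:3200]
```

The overlap gives each chunk some shared context with its neighbors, which helps answers that span a chunk boundary still retrieve a complete passage.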
| Component | Technology | Version |
|---|---|---|
| Backend Framework | FastAPI | 0.115+ |
| Frontend Framework | Streamlit | 1.40+ |
| Agentic AI | smolagents | 0.3+ |
| LLM Provider | LiteLLM | 1.52+ |
| LLM Model | Gemini 2.0 Flash | Latest |
| Vector Database | Qdrant | 1.12+ |
| Document Processing | LangChain | 0.3+ |
| Embeddings | Google AI | text-embedding-004 |
Uses tool-calling agents that automatically select the right tool:
- RetrieverTool: Search documents
- ImageUnderstandingTool: Analyze images
- AudioUnderstandingTool: Analyze audio
- YouTubeVideoUnderstandingTool: Analyze videos
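In smolagents the LLM itself picks a tool based on each tool's description, but the net effect for this app can be illustrated with a plain-Python dispatch sketch (the mapping below is an illustration, not the agent's actual selection logic):

```python
# Illustrative only: maps a content type to the tool the agent would likely call.
# The real agent lets the LLM choose based on each tool's description.
TOOLS = {
    "document": "RetrieverTool",
    "image": "ImageUnderstandingTool",
    "audio": "AudioUnderstandingTool",
    "youtube": "YouTubeVideoUnderstandingTool",
}

def select_tool(content_type: str) -> str:
    try:
        return TOOLS[content_type]
    except KeyError:
        raise ValueError(f"no tool registered for content type: {content_type!r}")

print(select_tool("image"))  # ImageUnderstandingTool
```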
Every answer includes:
- Source document name
- Page number or timestamp
- Preview text from original source
- Confidence score
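The four citation fields above can be modeled as a small schema. A sketch using a dataclass (the field names are assumptions mirroring the list above; the backend's actual models are Pydantic, per the project structure):

```python
from dataclasses import dataclass

@dataclass
class Citation:
    source: str        # source document name, e.g. "Math_Textbook.pdf"
    locator: str       # page number or timestamp, e.g. "Page 5" or "03:12"
    preview: str       # preview text from the original source
    confidence: float  # confidence score, assumed to be in [0, 1]

c = Citation(source="Math_Textbook.pdf", locator="Page 5",
             preview="Linear algebra is the study of vectors", confidence=0.92)
```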
Gemini 2.0 Flash natively supports:
- Text documents
- Images (OCR, object detection)
- Audio files (transcription, analysis)
- YouTube videos (understanding, Q&A)
Adding a new tool:
- Create the tool in backend/app/agents/tools/
- Add it to the agent in backend/app/agents/masa_agent.py
- Test with sample inputs
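The first step above can be sketched as a tool class in the smolagents style. Plain Python is used here so the sketch runs standalone; in the codebase the class would subclass smolagents' Tool, and the tool name, inputs, and body below are all hypothetical:

```python
# Hypothetical custom tool, structured the way smolagents tools typically are:
# a name, a description the LLM uses for selection, typed inputs, and forward().
class WebSearchTool:
    name = "web_search"
    description = "Search the indexed sources for passages relevant to a query."
    inputs = {"query": {"type": "string", "description": "The search query."}}
    output_type = "string"

    def forward(self, query: str) -> str:
        # Hypothetical body: the real tool would query Qdrant and format citations.
        return f"results for: {query}"

tool = WebSearchTool()
print(tool.forward("linear algebra"))  # results for: linear algebra
```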
Modifying the UI:
- Edit components in frontend/components/
- Update frontend/app.py if needed
- Streamlit auto-reloads
Edit backend/.env:

```
LLM_MODEL=gpt-4o  # or claude-3-5-sonnet, etc.
```

Backend won't start:
- Check GOOGLE_API_KEY in .env
- Verify Qdrant is running
- Check that port 8000 is free

Frontend can't reach the backend:
- Verify the backend is running on port 8000
- Check BACKEND_API_URL in frontend/.env

Missing citations:
- Ensure documents have page metadata
- Check the chunking configuration
- Review the agent system prompt
```
docker-compose -f docker-compose.prod.yml up -d
```

- Qdrant: Use Qdrant Cloud
- Backend: Deploy to AWS/GCP/Azure
- Frontend: Deploy to Streamlit Cloud
- Environment: Set production env vars
- HTTPS: Configure SSL certificates
- Advanced citation UI with hover previews
- Multi-user support with authentication
- Chat history persistence
- Export conversations
- Custom model fine-tuning
- Advanced filtering and search
- Mobile-responsive UI
- API rate limiting
- Monitoring dashboard
Apache 2.0 License
- HuggingFace: smolagents framework
- Google AI: Gemini 2.0 Flash API
- Qdrant: Vector database
- FastAPI: Web framework
- Streamlit: UI framework
For issues or questions:
- Check OmniLearnAI-Setup-Guide.pdf
- Review the troubleshooting section
- Check the API logs: docker-compose logs -f
Built with ❤️ for learners everywhere - students, researchers, and knowledge workers

"Learn from everywhere" - Start with docker-compose up!