AI & RAG Engineer | Applied ML Practitioner
Building production-ready AI systems with a strong focus on Retrieval-Augmented Generation (RAG), document intelligence, and automation.
Last updated: 2025-12-30
I design and ship production-ready AI systems that tightly integrate retrieval, grounding, and evaluation. My work centers on practical RAG pipelines and robust document intelligence for real-world applications, with an emphasis on reproducibility, observability, and private/hybrid deployments.
Current focus: improving retrieval quality, grounding, and evaluation metrics for real-world RAG systems.
- RAG pipelines: chunking, embeddings, reranking, evaluation, and end-to-end orchestration
- PDF & knowledge ingestion: semantic and structure-aware chunking, layout-aware extraction
- Vector search: metadata-driven retrieval, filter-aware reranking, and hybrid search strategies
- Dockerized AI systems: reproducible deployments for private & hybrid infra
- Local & cloud LLMs: Ollama, OpenAI, Azure OpenAI (and integrations with other LLM providers)
- Automation & CI: automated testing, deployment pipelines, and reproducible experiments
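The metadata-driven, hybrid retrieval idea above can be sketched in plain Python: filter candidates by metadata first, then blend a dense (vector) score with a sparse (keyword-overlap) score. The `hybrid_search` helper, its document dict shape, and the `alpha` blend weight are illustrative assumptions, not the API of any particular vector store.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_search(query_terms, query_vec, docs, filters, alpha=0.5, k=3):
    """Metadata filter first, then blended dense + sparse scoring.

    docs: list of dicts with "terms" (tokens), "vec" (embedding),
    and "metadata" (key/value pairs used for filtering).
    """
    # 1. Metadata filter narrows the candidate set before any scoring.
    candidates = [
        d for d in docs
        if all(d["metadata"].get(f) == v for f, v in filters.items())
    ]

    # 2. alpha weights the vector signal vs. the keyword-overlap signal.
    def score(d):
        kw = len(set(query_terms) & set(d["terms"])) / max(len(query_terms), 1)
        return alpha * cosine(query_vec, d["vec"]) + (1 - alpha) * kw

    return sorted(candidates, key=score, reverse=True)[:k]
```

In a real stack the filtering and dense scoring would happen inside the vector store (e.g. Chroma's `where` filters), but the ordering of operations is the same: restrict, score, rerank.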
- Languages: Python
- Libraries & Frameworks: LangChain, SentenceTransformers, QuartAPI
- Vector stores: ChromaDB
- Infrastructure: Docker, Docker Compose (for dev & production-like local stacks)
- Other: embeddings, retrievers, rerankers, evaluation tooling, metrics, experiment tracking
- Ingest documents (PDF, HTML, DOCX) → structure-aware parsing
- Chunk with semantic & layout-aware strategies (keep context for tables, headings, code blocks)
- Generate embeddings (SentenceTransformers) and store in a vector DB (Chroma)
- Retrieve with metadata filters → rerank with a cross-encoder / reranker model
- Compose prompt with retrieved context → call the LLM (local or cloud) with grounding controls
- Evaluate: retrieval metrics, factuality checks, attribution scoring, and a user-feedback loop
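The ingest → chunk → embed → retrieve path above can be sketched end to end, with a toy bag-of-words "embedding" standing in for SentenceTransformers. `Chunk`, `chunk_by_headings`, and `retrieve` are hypothetical names for illustration, not the pipeline's actual interfaces.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_by_headings(doc: str) -> list[Chunk]:
    """Structure-aware chunking: split on markdown headings,
    keeping the heading as retrievable metadata."""
    chunks, current, heading = [], [], ""
    for line in doc.splitlines():
        if line.startswith("#"):
            if current:
                chunks.append(Chunk("\n".join(current), {"heading": heading}))
            heading, current = line.lstrip("# ").strip(), []
        else:
            current.append(line)
    if current:
        chunks.append(Chunk("\n".join(current), {"heading": heading}))
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call
    a SentenceTransformers model here."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Rank chunks by similarity to the query and return the top k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:k]
```

The retrieved chunks would then be composed into the prompt and passed to the LLM with grounding controls, as the remaining steps describe.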
- Retrieval: precision@k, recall@k, MRR for reranked candidates
- Grounding: attribution coverage, hallucination rate
- End-to-end: human evaluation, task success rate, latency & cost tradeoffs
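The retrieval metrics above have compact definitions, sketched here in plain Python; the function names are illustrative rather than taken from a specific evaluation library.

```python
def precision_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for d in retrieved[:k] if d in relevant) / k


def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of all relevant items found in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in relevant) / len(relevant)


def mrr(ranked_lists: list[list], relevant_sets: list[set]) -> float:
    """Mean reciprocal rank: average of 1/rank of the first
    relevant item per query (0 if none is found)."""
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant_sets):
        for rank, d in enumerate(ranked, start=1):
            if d in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

For reranked candidates, these are typically computed both before and after the reranker to measure how much the cross-encoder stage actually improves ordering.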

