khoi01/README.md

Hi, I'm khoi01 👋

🤖 AI & RAG Engineer | Applied ML Practitioner
Building production-ready AI systems with a strong focus on Retrieval-Augmented Generation (RAG), document intelligence, and automation.

Last updated: 2025-12-30


About

I design and ship production-ready AI systems that tightly integrate retrieval, grounding, and evaluation. My work centers on practical RAG pipelines and robust document intelligence for real-world applications, with an emphasis on reproducibility, observability, and private/hybrid deployments.

Current focus: improving retrieval quality, grounding, and evaluation metrics for real-world RAG systems.


What I work on

  • 🧠 RAG pipelines: chunking, embeddings, reranking, evaluation, and end-to-end orchestration
  • 📄 PDF & knowledge ingestion: semantic and structure-aware chunking, layout-aware extraction
  • 🔍 Vector search: metadata-driven retrieval, filter-aware reranking, and hybrid search strategies
  • 🐳 Dockerized AI systems: reproducible deployments for private & hybrid infra
  • ⚙️ Local & cloud LLMs: Ollama, OpenAI, Azure OpenAI (and integrations with other LLM providers)
  • 🔁 Automation & CI: automated testing, deployment pipelines, and reproducible experiments
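
As a rough illustration of what structure-aware chunking can look like, here is a minimal sketch that splits text on headings and then packs paragraphs up to a size limit. The names `Chunk` and `chunk_by_heading`, the `max_chars` default, and the assumption that headings start with `#` and are separated by blank lines are all hypothetical choices for this example, not a description of any specific project.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest heading above the text, kept as retrieval metadata
    text: str


def chunk_by_heading(markdown: str, max_chars: int = 400) -> list[Chunk]:
    """Split on '#' headings, then pack paragraphs into chunks of ~max_chars."""
    chunks: list[Chunk] = []
    heading = ""
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered paragraphs under the heading that is current now.
        if buffer:
            chunks.append(Chunk(heading, "\n\n".join(buffer)))
            buffer.clear()

    for block in markdown.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            flush()  # never let a chunk span two sections
            heading = block.lstrip("#").strip()
            continue
        if buffer and sum(len(b) for b in buffer) + len(block) > max_chars:
            flush()  # size limit reached, start a new chunk in the same section
        buffer.append(block)
    flush()
    return chunks


doc = "# Intro\n\nFirst paragraph.\n\nSecond paragraph.\n\n# Setup\n\nInstall steps."
for chunk in chunk_by_heading(doc):
    print(chunk.heading, "->", repr(chunk.text))
```

Keeping the heading on each chunk means it can later be stored as metadata and used for filtering or for showing the source section alongside an answer.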

Tech stack

  • Languages: Python
  • Libraries & Frameworks: LangChain, SentenceTransformers, QuartAPI
  • Vector stores: ChromaDB
  • Infrastructure: Docker, Docker Compose (for dev & production-like local stacks)
  • Other: embeddings, retrievers, rerankers, evaluation tooling, metrics, experiment tracking

Example RAG pipeline (high level)

  1. Ingest documents (PDF, HTML, DOCX) → structure-aware parsing
  2. Chunk with semantic & layout-aware strategies (keep context for tables, headings, code blocks)
  3. Generate embeddings (SentenceTransformers) and store in vector DB (Chroma)
  4. Retrieve with metadata filters → rerank with cross-encoder / reranker model
  5. Compose prompt with retrieved context → call LLM (local or cloud) with grounding controls
  6. Evaluate: retrieval metrics, factuality checks, attribution scoring, and user-feedback loop
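
Steps 3 to 5 above can be sketched in plain Python to show the control flow. This is a toy under stated assumptions: the bag-of-words `embed` stands in for a SentenceTransformers model, the in-memory list stands in for a Chroma collection, and the term-overlap `rerank` stands in for a cross-encoder. All function names and the document shape (`text` plus a `meta` dict) are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[dict], where: dict, k: int = 3) -> list[dict]:
    """Filter on metadata first, then rank the survivors by similarity."""
    q = embed(query)
    candidates = [
        d for d in docs
        if all(d["meta"].get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return candidates[:k]


def rerank(query: str, candidates: list[dict]) -> list[dict]:
    """Second-pass scoring; placeholder for a cross-encoder.
    Scores each candidate by the fraction of query terms it contains."""
    terms = set(query.lower().split())

    def score(d: dict) -> float:
        return len(terms & set(d["text"].lower().split())) / max(len(terms), 1)

    return sorted(candidates, key=score, reverse=True)


def build_prompt(query: str, contexts: list[dict]) -> str:
    """Number each context so the model's answer can cite its sources."""
    numbered = "\n".join(f"[{i}] {d['text']}" for i, d in enumerate(contexts, 1))
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{numbered}\n\nQuestion: {query}"
    )


docs = [
    {"text": "chroma stores embeddings for vector search", "meta": {"source": "guide"}},
    {"text": "docker compose runs the local stack", "meta": {"source": "guide"}},
    {"text": "vector search with chroma and filters", "meta": {"source": "blog"}},
]
hits = rerank("chroma vector search", retrieve("chroma vector search", docs, {"source": "guide"}, k=2))
print(build_prompt("chroma vector search", hits))
```

Filtering before similarity ranking keeps out-of-scope documents from ever reaching the reranker, which is usually the more expensive stage.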

How I measure success

  • Retrieval: precision@k, recall@k, MRR for reranked candidates
  • Grounding: attribution coverage, hallucination reduction metrics
  • End-to-end: human evaluation, task success rate, latency & cost tradeoffs
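
For reference, the retrieval metrics listed above can be computed in a few lines. This is a minimal sketch with hypothetical function names; it assumes each query has a ranked list of document IDs and a set of IDs judged relevant, and it divides precision by `k` even when fewer than `k` results come back.

```python
def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average of 1/rank of the first relevant hit; 0 for a query with none."""
    if not runs:
        return 0.0
    total = 0.0
    for ranked, relevant in runs:
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(runs)


ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(precision_at_k(ranked, relevant, 2))  # 0.5
print(recall_at_k(ranked, relevant, 4))     # 1.0
print(mean_reciprocal_rank([(ranked, relevant), (["d2", "d9"], {"d2"})]))  # 0.75
```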

Pinned

  1. space_survival (Public)

    Dart

  2. sms_alert (Public)

    SMS Alert Application

    Dart

  3. Agentic_Design_Patterns (Public)

    Forked from sarwarbeing-ai/Agentic_Design_Patterns

    Agentic Design Patterns: A Hands-On Guide to Building Intelligent Systems by Antonio Gulli

    Jupyter Notebook

  4. LangChain-From-Hero-To-Zero (Public)

    Python