feat: Event-Driven Document/Video Ingestion Pipeline #351
Merged
shubhadeepd merged 19 commits into release-v2.5.0 on Feb 27, 2026
Conversation
smasurekar reviewed Feb 16, 2026
smasurekar reviewed Feb 16, 2026
nv-pranjald reviewed Feb 16, 2026
smasurekar reviewed Feb 17, 2026
smasurekar reviewed Feb 17, 2026
smasurekar reviewed Feb 17, 2026
nv-pranjald reviewed Feb 17, 2026
nv-pranjald approved these changes Feb 26, 2026
smasurekar approved these changes Feb 27, 2026
116e7eb to 0e0d7bb
…stion pipeline
- Kafka consumer that monitors MinIO object storage for new uploads
- Routes documents to RAG Ingestor, videos to VSS for analysis
- Docker Compose deployment for Kafka, MinIO, and consumer
- Jupyter notebook for end-to-end deployment and testing
- Sample test data (PDF document, MP4 video) tracked via Git LFS
Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Made-with: Cursor
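The routing rule in this commit message (documents to the RAG Ingestor, videos to VSS) can be illustrated with a minimal sketch. It assumes routing is decided by file extension, which the PR does not confirm; the function name, the extension sets, and the returned labels are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical extension sets: the PR only mentions a PDF document and an
# MP4 video as sample data, so nothing else is assumed here.
DOCUMENT_EXTENSIONS = {".pdf"}
VIDEO_EXTENSIONS = {".mp4"}


def route_for_object(object_key: str) -> str:
    """Return a label naming the service an uploaded object would go to.

    The labels ("rag_ingestor", "vss", "skip") are illustrative only.
    """
    suffix = PurePosixPath(object_key).suffix.lower()
    if suffix in DOCUMENT_EXTENSIONS:
        return "rag_ingestor"
    if suffix in VIDEO_EXTENSIONS:
        return "vss"
    # Unknown types are skipped rather than guessed at.
    return "skip"
```

Matching on the lower-cased suffix keeps the rule insensitive to upload naming such as `REPORT.PDF`; a real consumer might inspect the content type instead.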
… logs
Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Made-with: Cursor
…consumer prompts
- Add verify_file_in_storage() helper to confirm files landed in MinIO
- Merge storage verification into document/video ingestion checks
- Add RAG Frontend UI link (port 8090) to query sections
- Make Kafka consumer VSS prompts configurable via env vars in docker-compose
- Install git/git-lfs in notebook setup cell
- Index cells in Deploy Continuous Ingestion section
Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Made-with: Cursor
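For readers who have not opened the notebook, a storage check like verify_file_in_storage() could look roughly like the sketch below. This is not the helper from the PR, whose signature is not shown here. It assumes a client object exposing `stat_object(bucket, object_name)`, as the `minio` Python client does, and treats any raised error as "not found".

```python
from typing import Any


def verify_file_in_storage(client: Any, bucket: str, object_name: str) -> bool:
    """Return True if the object can be stat'ed in the given bucket.

    `client` is assumed to behave like minio.Minio: stat_object() returns
    metadata for an existing object and raises when it is missing.
    """
    try:
        client.stat_object(bucket, object_name)
    except Exception:
        # Assumption: any failure (missing object, missing bucket, or a
        # connection error) is reported as "not verified".
        return False
    return True
```

Catching every exception keeps the sketch short but hides connectivity problems; a stricter version would catch only the client's not-found error and let other failures surface.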
Add rag_event_ingest.ipynb notebook that provides an end-to-end walkthrough for:
- Deploying NVIDIA RAG stack (NIMs, Milvus, Ingestor, RAG Server)
- Deploying NVIDIA VSS stack (VLM, LLM, Embedding, Reranker NIMs)
- Deploying continuous ingestion pipeline (Kafka, MinIO, Kafka Consumer)
- Configurable video analysis prompts for the Kafka consumer
- Uploading documents and videos to MinIO with storage verification
- Verifying ingestion via consumer logs
- Querying ingested content via RAG API or Frontend UI
Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Made-with: Cursor
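The "configurable video analysis prompts" item can be pictured as the consumer reading its prompts from environment variables with fallbacks. The sketch below is only an illustration: the variable names `VSS_PROMPT` and `VSS_SUMMARY_PROMPT` and the default strings are hypothetical and are not taken from the PR's docker-compose file.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical defaults, used only when the environment does not override them.
DEFAULT_PROMPT = "Describe the key events in this video."
DEFAULT_SUMMARY_PROMPT = "Summarize the video captions."


def load_vss_prompts(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read video analysis prompts from the environment, with fallbacks.

    Passing a mapping makes the function easy to test without touching
    the real process environment.
    """
    source = os.environ if env is None else env
    return {
        "prompt": source.get("VSS_PROMPT", DEFAULT_PROMPT),
        "summary_prompt": source.get("VSS_SUMMARY_PROMPT", DEFAULT_SUMMARY_PROMPT),
    }
```

With this shape, a compose file would only need to set the variables under the consumer's `environment:` key to change the prompts without rebuilding the image.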
…te hw req to 4 GPUs
Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Made-with: Cursor
…ng/reranker
The via-server runs on the local_deployment_single_gpu_default network, not nvidia-rag, so it cannot resolve nemoretriever-embedding-ms or nemoretriever-ranking-ms. Route through host.docker.internal with the correct host-mapped ports instead (9080 for embedding, 1976 for reranker).
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Made-with: Cursor
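The fix described in this commit amounts to addressing the two services by host gateway and host-mapped port rather than by container name. A small sketch of that mapping follows; the host name and ports come from the commit message, while the `http` scheme, the dictionary, and the function name are assumptions made for illustration.

```python
# Host name and ports are taken from the commit message above.
HOST_GATEWAY = "host.docker.internal"
HOST_PORTS = {"embedding": 9080, "reranker": 1976}


def service_base_url(service: str, host: str = HOST_GATEWAY) -> str:
    """Build a base URL for a service reached through the host gateway.

    The "http" scheme is an assumption; the commit message gives only the
    host name and the ports.
    """
    return f"http://{host}:{HOST_PORTS[service]}"
```

Container names such as nemoretriever-embedding-ms resolve only inside the Docker network they belong to, which is why a service on a different network has to go through the host instead.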
0e0d7bb to 6d9f001
Description
Add rag_event_ingest.ipynb notebook and supporting changes for an end-to-end continuous ingestion pipeline that monitors object storage (MinIO) for new document and video uploads, automatically routes them to the appropriate AI services, and makes all content searchable via RAG.
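As a rough illustration of the monitoring step, the sketch below pulls bucket and object names out of a storage notification delivered as a Kafka message value. It assumes the S3-style event layout that MinIO bucket notifications use (a `Records` list with `eventName` and `s3.bucket.name` / `s3.object.key`, keys URL-encoded); the function name is hypothetical and this is not the consumer code from the PR.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_uploads(message_value: bytes) -> List[Tuple[str, str]]:
    """Return (bucket, object_key) pairs for object-created records.

    Assumes an S3-style notification payload; other event types, such as
    deletions, are ignored.
    """
    event = json.loads(message_value)
    uploads = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("s3:ObjectCreated:"):
            continue
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in the notification payload.
            uploads.append((bucket, unquote_plus(key)))
    return uploads
```

Each returned pair could then be handed to whatever routing step decides between the document and video paths.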
Notebook walkthrough:
Checklist
Commits are signed off (git commit -s) and GPG signed (git commit -S).