Binary file removed assets/med_data_k_graph.png
3 changes: 1 addition & 2 deletions concepts/cloud-architecture.mdx
@@ -15,7 +15,7 @@ This separation keeps your application management in the UI while all document d
| Component | Primary role | Typical hosting |
| --- | --- | --- |
| Cloud UI | Auth, orgs, billing, app metadata, dashboards | Vercel (or your web host) |
-| Morphik Core | Ingestion, storage, retrieval, search, graphs, chat | EC2 or Kubernetes |
+| Morphik Core | Ingestion, storage, retrieval, search, chat | EC2 or Kubernetes |
| Embedding GPU (optional) | Multimodal embeddings (ColPali API mode) | Lambda GPU, on-prem GPU |
| Postgres + pgvector | Documents, embeddings, app isolation | Neon or any Postgres |
| Object storage | Raw files and chunk payloads | S3 or local disk |
@@ -102,4 +102,3 @@ Agent mode runs in a server route (Cloud UI) so it can call your LLM provider se
- The UI calls `/api/agent/chat` on the Cloud UI.
- The server route calls Morphik Core for retrieval (using the app token).
- The server route streams the LLM response back to the browser.
-
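
The bullets in the hunk above outline the agent-mode round trip: browser → `/api/agent/chat` on the Cloud UI → Morphik Core for retrieval (with the app token) → streamed LLM response. As a rough sketch of that proxy pattern only — the real route, its framework, the retrieval endpoint path, and the env var names are not shown in this diff and are assumed here:

```python
# Illustrative sketch only: the real Cloud UI server route is not part of this
# diff, and its framework, endpoint paths, and env var names are assumptions.
import os

import httpx
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

# Assumed configuration: Morphik Core base URL and the per-app token,
# both kept server-side so they never reach the browser.
MORPHIK_CORE_URL = os.environ.get("MORPHIK_CORE_URL", "http://localhost:8000")
MORPHIK_APP_TOKEN = os.environ.get("MORPHIK_APP_TOKEN", "app-token-placeholder")


class ChatRequest(BaseModel):
    message: str


@app.post("/api/agent/chat")
async def agent_chat(req: ChatRequest):
    # 1) Retrieval: call Morphik Core with the app token.
    #    "/retrieve/chunks" is a placeholder path, not a documented endpoint.
    async with httpx.AsyncClient() as client:
        retrieval = await client.post(
            f"{MORPHIK_CORE_URL}/retrieve/chunks",
            headers={"Authorization": f"Bearer {MORPHIK_APP_TOKEN}"},
            json={"query": req.message},
        )
        context = retrieval.json()

    # 2) Generation: call your LLM provider server-side and stream the answer
    #    back to the browser. The generator below is a stand-in for that call.
    async def stream_answer():
        yield f"(placeholder) streamed answer to: {req.message}, "
        yield f"grounded in {len(context)} retrieved item(s)."

    return StreamingResponse(stream_answer(), media_type="text/plain")
```

The point of the pattern is that the app token and LLM credentials stay on the server; the browser only ever receives the streamed answer.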
6 changes: 2 additions & 4 deletions concepts/colpali.mdx
@@ -5,7 +5,7 @@ description: 'Using Late-interaction and Contrastive learning to achieve state-o

## Introduction

-Up to now, we've seen RAG techniques that **i)** parse a given document, **ii)** convert it to text, and **iii)** embed the text for retrieval. These techniques have been particularly text-heavy. Embedding models expect text in, knowledge graphs expect text in, and parsers break down when provided with documents that aren't text-dominant. This motivates the question:
+Up to now, we've seen RAG techniques that **i)** parse a given document, **ii)** convert it to text, and **iii)** embed the text for retrieval. These techniques have been particularly text-heavy. Embedding models expect text in, and parsers break down when provided with documents that aren't text-dominant. This motivates the question:

> When was the last time you looked at a document and only saw text?

@@ -57,7 +57,7 @@ from morphik import Morphik

db = Morphik("YOUR-URI-HERE")

db.ingest_file("report_with_images_and_graphs.pdf", use_colpali=True)
db.ingest_file("report_with_images_and_charts.pdf", use_colpali=True)
```

Here is an example query pathway:
@@ -109,5 +109,3 @@ If you're experiencing context limit issues with image-based retrieval, it may b



-
-
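
For contrast with the ingestion call above, here is a minimal retrieval sketch. Only `Morphik(...)` and `ingest_file(..., use_colpali=True)` appear in this diff; the method name, query text, and parameters below are assumptions, and the file's actual query pathway follows in the collapsed portion of `concepts/colpali.mdx`:

```python
from morphik import Morphik

db = Morphik("YOUR-URI-HERE")

# Hypothetical call: the exact SDK method and parameters may differ.
# The idea is that retrieval mirrors ingestion by passing use_colpali=True,
# so visually rich pages (charts, figures, tables) are matched as images
# rather than as lossy text extractions.
chunks = db.retrieve_chunks(
    "What trend does the quarterly revenue chart show?",
    use_colpali=True,
)

for chunk in chunks:
    print(chunk)
```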