Collaboration for Ollama / Gemini support #38

@patrickab

Description

Hi,

I am currently working on a personal toolkit for LLMs from diverse providers, and while searching for effective methods to mine large PDF documents (e.g. books, slide decks) I've come across your project.

I am particularly interested in extending the PDF-scraping functionality to other VLM clients (especially locally hosted Ollama models). Instead of forking this project and building this for myself, I thought it would be nice to integrate this functionality into the project for others.

I've already looked into your project, and it would be necessary to generalize

def scrape_pdf(
    file_path: str,
    openai_client: Optional[OpenAI] = None,

to a VLM wrapper class that handles the different syntaxes for Ollama / OpenAI / Gemini:

def scrape_pdf(
    file_path: str,
    vlm_client: Optional[VLMClient] = None,

I already have a finished LLMClient wrapper that bundles multimodal capabilities for all the mentioned clients into a single object.
If you are interested in integrating this functionality, I would be happy to cooperate.
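To make the proposal concrete, here is a minimal sketch of what such a wrapper interface could look like. The names `VLMClient`, `describe_image`, and `OpenAIVLMClient` are hypothetical placeholders, not existing code from this project or my toolkit; only the OpenAI chat-completions call reflects a real API shape.

```python
import base64
from abc import ABC, abstractmethod
from typing import Optional


class VLMClient(ABC):
    """Provider-agnostic interface for vision-language model calls.

    scrape_pdf() would only depend on this interface, so adding a new
    provider (Ollama, Gemini, ...) means adding one subclass.
    """

    @abstractmethod
    def describe_image(self, image_bytes: bytes, prompt: str) -> str:
        """Send one page image plus a prompt; return the model's text reply."""


class OpenAIVLMClient(VLMClient):
    """Adapter wrapping an openai.OpenAI client behind the common interface."""

    def __init__(self, client, model: str = "gpt-4o"):
        self.client = client
        self.model = model

    def describe_image(self, image_bytes: bytes, prompt: str) -> str:
        # OpenAI expects images as base64 data URLs inside the message content.
        b64 = base64.b64encode(image_bytes).decode("ascii")
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{b64}"}},
                ],
            }],
        )
        return resp.choices[0].message.content


def scrape_pdf(file_path: str, vlm_client: Optional[VLMClient] = None) -> str:
    """Sketch of the generalized entry point: provider-independent."""
    ...
```

Ollama and Gemini adapters would follow the same pattern, each translating `describe_image` into the provider's own request format.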
