Fix multi-modal options #1

dihmandrake · 2025-10-11T15:35:36Z

This PR should provide some options for multi-modal abilities.

We cannot use the ollama_chat provider directly, as it apparently causes problems with image upload and other multi-modal input.
It works great for plain chat, though.

One workaround was found by @App0lyon (thank you very much for this): calling ollama directly (without chat).
This works, but it can lose previous context, as explicitly stated in the ADK docs: https://google.github.io/adk-docs/agents/models/#using-ollama_chat-provider

Hence, there is the workaround from google/adk-python#49.
By faking the OpenAI API, we might get the best of both while still running on ollama.

IMPORTANT: The ADK docs state: "It is important to set the provider ollama_chat instead of ollama. Using ollama will result in unexpected behaviors such as infinite tool call loops and ignoring previous context."

This PR also bumps ADK to the latest version.
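
As a rough sketch of the three modes discussed above, the snippet below maps each one to a LiteLLM-style model string plus the environment variables it would need. The helper `settings_for`, the `ModelSettings` class, the model name `gemma3`, and the default address `http://localhost:11434` are assumptions made for illustration; they are not taken from this PR's code.

```python
from dataclasses import dataclass, field

# Assumption: Ollama listens on its default local address.
OLLAMA_BASE = "http://localhost:11434"


@dataclass
class ModelSettings:
    """Hypothetical container for one mode's configuration."""
    model: str                               # "provider/model" string
    env: dict = field(default_factory=dict)  # env variables to export


def settings_for(mode: str, model_name: str) -> ModelSettings:
    """Hypothetical helper: pick the model string and env for a mode."""
    if mode == "ollama_chat":
        # Direct access to ollama; fine for chat, problematic for images.
        return ModelSettings(f"ollama_chat/{model_name}",
                             {"OLLAMA_API_BASE": OLLAMA_BASE})
    if mode == "ollama":
        # Works with images, but may lose context or loop on tool calls.
        return ModelSettings(f"ollama/{model_name}",
                             {"OLLAMA_API_BASE": OLLAMA_BASE})
    if mode == "openai":
        # "Faked" OpenAI mode: point the OpenAI provider at ollama's
        # OpenAI-compatible /v1 endpoint; the key is only a placeholder.
        return ModelSettings(f"openai/{model_name}",
                             {"OPENAI_API_BASE": f"{OLLAMA_BASE}/v1",
                              "OPENAI_API_KEY": "anything"})
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("ollama_chat", "ollama", "openai"):
        print(mode, settings_for(mode, "gemma3"))
```

With ADK installed, the resulting string would typically be passed on as `LiteLlm(model=settings.model)` after exporting the variables in `settings.env`; that wiring is left out here so the sketch runs without extra dependencies.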

…t aware

The OpenAI faked mode to ollama allows for multimodal chats.
The normal ollama_chat allows direct access to ollama, but has issues with images or multi-modal.
The mode ollama does not keep any context and might loop. See the ADK docs.

More info also at: google/adk-python#49 and https://google.github.io/adk-docs/agents/models/#using-ollama_chat-provider

dihmandrake added 3 commits October 11, 2025 13:20

Fix some permission issues

e2eb577
