Contributor

@philipph-askui philipph-askui commented Jan 21, 2026

This PR:

  • removes the ModelRouter
  • adds a new model_store
  • removes chat-related code, and
  • refactors the docs
    (the last two in particular bloat the PR a bit, sorry for that)

Summary

  • Removed the ModelRouter abstraction and string-based model selection
  • Introduced direct model injection and a discoverable model_store
  • Clarified model vs. MessagesApi responsibilities
  • Significantly reduced complexity and type errors
  • Refactored the docs

Key Changes

  • Model Store (model_store): Central discovery + creation of models
  • Direct Injection: Models passed explicitly to VisionAgent
  • Per-call Overrides: act_model, get_model, locate_model
  • API and Codebase Cleanup:
    • Removed router, facade
    • Settings Architecture: Single source of truth at agent level
    • Clear parameter naming (*_settings, *_model)
    • Renamed Gemini models for clarity
    • Removed chat-related code
    • Removed support for unused models, e.g. UI-TARS
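
To make the move from string-based routing to direct injection and per-call overrides more concrete, here is a minimal, self-contained sketch. None of the classes below are the actual askui API: `ModelStore`, `EchoActModel`, `Agent`, `register`, `available`, and `create_act_model` are hypothetical stand-ins, and how the real model_store exposes its models is not shown in this conversation. Only the parameter name `act_model` is taken from the PR description.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Protocol


class ActModel(Protocol):
    """Hypothetical interface: anything that can carry out a goal."""

    def act(self, goal: str) -> str: ...


@dataclass
class EchoActModel:
    """Toy model that only reports which model handled the goal."""

    name: str

    def act(self, goal: str) -> str:
        return f"{self.name}: {goal}"


@dataclass
class ModelStore:
    """Hypothetical store: lists known models and builds instances."""

    _factories: dict[str, Callable[[], ActModel]] = field(default_factory=dict)

    def register(self, name: str, factory: Callable[[], ActModel]) -> None:
        self._factories[name] = factory

    def available(self) -> list[str]:
        return sorted(self._factories)

    def create_act_model(self, name: str) -> ActModel:
        return self._factories[name]()


class Agent:
    """Hypothetical agent that receives a model object, never a model string."""

    def __init__(self, act_model: ActModel) -> None:
        self._act_model = act_model

    def act(self, goal: str, act_model: ActModel | None = None) -> str:
        # A per-call override wins; otherwise the injected default is used.
        model = act_model if act_model is not None else self._act_model
        return model.act(goal)


store = ModelStore()
store.register("default", lambda: EchoActModel("default"))
store.register("fast", lambda: EchoActModel("fast"))

agent = Agent(act_model=store.create_act_model("default"))
default_result = agent.act("open settings")
override_result = agent.act("open settings", act_model=store.create_act_model("fast"))
```

The point of the design is that the agent never resolves a name itself: whatever object it is handed is the model it uses, so a wrong model surfaces as a type error at the call site rather than as a failed lookup at runtime.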

Breaking Change

  • All code that does not use the default models will likely break.
  • String-based model selection is removed.
  • Users must migrate to direct model injection via model_store.

…del store

BREAKING CHANGE: Removed ModelRouter and ModelRegistry classes. Users must now use direct model injection.
@philipph-askui philipph-askui changed the title from "Chore/modelrouter" to "Remove Modelrouter" Jan 21, 2026
@philipph-askui philipph-askui changed the title from "Remove Modelrouter" to "Remove modelRouter and add mode_store" Jan 22, 2026
@philipph-askui philipph-askui marked this pull request as ready for review January 26, 2026 06:53
@philipph-askui philipph-askui changed the title from "Remove modelRouter and add mode_store" to "Remove modelRouter and add model_store" Jan 26, 2026
docs/01_Setup.md Outdated
**Problem**: Error connecting to Agent OS

**Solutions**:
1. Check if Agent OS is running (look for the system tray icon)
Contributor

Agent OS doesn’t have a tray icon.

docs/01_Setup.md Outdated

**Solutions**:
1. Check if Agent OS is running (look for the system tray icon)
2. Restart Agent OS from your applications menu
Contributor

Agent OS is not listed in the applications menu.

Comment on lines 240 to 247
custom_settings = ActSettings(
    messages=MessageSettings(
        max_tokens=8192,
        temperature=0.5,
        betas=["computer-use-2025-01-24"],
    )
)

Contributor

Which system prompt is used in this example?
Could you please remove the betas?

docs/01_Setup.md Outdated

## Python Package Installation

AskUI Vision Agent requires Python 3.10 or higher.
Contributor

requires-python = ">=3.10,<3.14"
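
The quoted constraint has an upper bound that "3.10 or higher" leaves out. As a rough illustration of what the range means in practice, here is a small runtime guard that mirrors it. The function name `is_supported_python` is made up for this example and is not part of the SDK.

```python
from __future__ import annotations

import sys

# Bounds taken from the quoted constraint: >=3.10,<3.14
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 14)


def is_supported_python(version: tuple[int, int] | None = None) -> bool:
    """Return True if (major, minor) falls inside >=3.10,<3.14."""
    if version is None:
        # Default to the interpreter that is running this check.
        version = (sys.version_info.major, sys.version_info.minor)
    return MIN_VERSION <= version < MAX_VERSION_EXCLUSIVE
```

Calling `is_supported_python()` with no argument checks the running interpreter; passing a tuple such as `(3, 14)` checks a specific version.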

docs/01_Setup.md Outdated
Comment on lines 40 to 44
```bash
pip install askui[anthropic] # Anthropic Claude support
pip install askui[openrouter] # OpenRouter support
pip install askui[documents] # PDF, Excel, Word support
```
Contributor

The SDK doesn't support these targets.

super().__init__(self.message)


class AnthropicModelSettings(BaseSettings):
Contributor

Currently unused.

self,
locator: str | Locator,
image: ImageSource,
locate_settings: LocateSettings, # noqa: ARG002
Contributor

Are the Locate settings only needed for LLM-based locators?

Contributor Author

The idea was to have a general settings object for all locate commands here.

Comment on lines +46 to +61
    max_tokens: int = 4096
    temperature: float = Field(default=0.5, ge=0.0, le=1.0)
    system_prompt: GetSystemPrompt | None = None
    timeout: float | None = None


class LocateSettings(BaseModel):
    """Settings for LocateModel operations (UI element location)."""

    model_config = ConfigDict(arbitrary_types_allowed=True)

    query_type: str | None = None
    confidence_threshold: float = Field(default=0.8, ge=0.0, le=1.0)
    max_detections: int = 10
    timeout: float | None = None
    system_prompt: LocateSystemPrompt | None = None
Contributor

Currently, only the system prompt is being used.

    confidence_threshold: float = Field(default=0.8, ge=0.0, le=1.0)
    max_detections: int = 10
    timeout: float | None = None
    system_prompt: LocateSystemPrompt | None = None
Contributor

The Locate system prompt should not be configurable because the expected return is currently hard-coded. Changing the system prompt would cause the Locate code to fail.
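
To illustrate the coupling described here: when the parser expects one fixed reply format, any prompt that leads the model to answer differently breaks it. The sketch below is a toy and not the SDK's locate code. The `<point>x,y</point>` reply format and the `parse_point` helper are assumptions invented for the example.

```python
import re

# Assumed reply format, for illustration only: "<point>x,y</point>"
POINT_PATTERN = re.compile(r"<point>(\d+),(\d+)</point>")


def parse_point(reply: str) -> tuple[int, int]:
    """Parse a model reply that follows the hard-coded format."""
    match = POINT_PATTERN.fullmatch(reply.strip())
    if match is None:
        raise ValueError(f"unexpected locate reply: {reply!r}")
    return int(match.group(1)), int(match.group(2))


# Reply produced under the prompt the parser was written for.
expected_format_reply = "<point>120,48</point>"

# Reply a customized prompt might elicit; the parser rejects it.
custom_prompt_reply = '{"x": 120, "y": 48}'
```

Under these assumptions, `parse_point(expected_format_reply)` yields the coordinates, while `parse_point(custom_prompt_reply)` raises `ValueError`, which is the kind of failure a user-supplied system prompt could trigger.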

    timeout: float | None = None


class LocateSettings(BaseModel):
Contributor

What do you think about removing the Locate and Get settings?

Contributor Author

I think we should discuss more generally how "configurable" get and locate should be. That includes whether we want to support BYOM for these commands or whether that should only be possible for act.

4 participants