Conversation

@ahmedjawedaj
Contributor

  • Added forced_temperature capability in capabilities.py
  • Implemented override logic in cloud_provider.py for complete() and stream()
  • Added unit tests verifying override behavior
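For context, a minimal sketch of the change described above (the capability name, table entries, and the override hook are assumptions inferred from these bullet points, not the exact code on the branch):

```python
# capabilities.py -- per-model feature flags, including a forced temperature.
# The concrete entries below are illustrative.
MODEL_CAPABILITIES = {
    "o1": {"forced_temperature": 1.0},       # o1-style models only accept temperature=1.0
    "o1-mini": {"forced_temperature": 1.0},
}

def get_capability(model: str, key: str, default=None):
    return MODEL_CAPABILITIES.get(model, {}).get(key, default)

# cloud_provider.py -- applied from both complete() and stream() before building the payload.
def _apply_forced_temperature(model: str, kwargs: dict) -> dict:
    forced_temp = get_capability(model, "forced_temperature")
    if forced_temp is not None:
        kwargs = {**kwargs, "temperature": forced_temp}
    return kwargs
```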


@scrrlt
Contributor

scrrlt commented Jan 16, 2026

requested changes:

  • move all parameter sanitization logic to new file src/services/llm/parameters.py.
  • centralize logic here to avoid cluttering cloud_provider.py.

src/services/llm/capabilities.py

  • remove speculative entries (gpt-5, o3).
  • add MODEL_CONSTRAINTS dict for strict rules (e.g. temperature_strategy).
  • keep get_capability for feature flags.
  • add get_model_constraint for enforcing limits.
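A rough sketch of how that split could look (the concrete entries such as fixed_temperature and supports_vision are illustrative assumptions, not the project's real tables):

```python
# src/services/llm/capabilities.py (sketch)

# Strict rules the provider will reject if violated.
MODEL_CONSTRAINTS = {
    "o1": {"temperature_strategy": "fixed", "fixed_temperature": 1.0},
    "o1-mini": {"temperature_strategy": "drop"},
}

# Optional feature flags stay separate.
MODEL_CAPABILITIES = {
    "gpt-4o": {"supports_vision": True},
}

def get_capability(model: str, key: str, default=None):
    """Feature flags: nice-to-have behaviour toggles."""
    return MODEL_CAPABILITIES.get(model, {}).get(key, default)

def get_model_constraint(model: str, key: str, default=None):
    """Hard limits that request payloads must respect."""
    return MODEL_CONSTRAINTS.get(model, {}).get(key, default)
```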
src/services/llm/parameters.py (new)

  • implement sanitize_model_params(binding, model, kwargs).
  • check temperature_strategy:
    • fixed: overwrite user input (e.g. o1 -> 1.0).
    • drop: remove key if model rejects it.
  • resolve max_tokens vs max_completion_tokens dynamically.
  • import/define DEFAULT_MAX_TOKENS constant (no magic numbers).
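A sketch of the sanitizer under the assumptions above; the import paths assume the src layout implied by the file names, fixed_temperature is an assumed constraint key, and the token-limit key is resolved via get_token_limit_kwargs from config.py (sketched further below):

```python
# src/services/llm/parameters.py (sketch)
from src.services.llm.capabilities import get_model_constraint
from src.services.llm.config import get_token_limit_kwargs

DEFAULT_MAX_TOKENS = 4096  # single named default instead of scattered magic numbers

def sanitize_model_params(binding: str, model: str, kwargs: dict) -> dict:
    """Return a payload-ready copy of kwargs with per-model rules applied."""
    params = dict(kwargs)  # never mutate the caller's dict

    # temperature_strategy: "fixed" overwrites user input, "drop" removes the key.
    strategy = get_model_constraint(model, "temperature_strategy")
    if strategy == "fixed":
        params["temperature"] = get_model_constraint(model, "fixed_temperature", 1.0)
    elif strategy == "drop":
        params.pop("temperature", None)

    # Accept either token-limit spelling from callers, emit the one this model expects.
    limit = params.pop("max_tokens", None)
    limit = params.pop("max_completion_tokens", None) or limit
    params.update(get_token_limit_kwargs(model, limit or DEFAULT_MAX_TOKENS))

    # `binding` is kept in the signature for provider-specific rules later on.
    return params
```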
src/services/llm/cloud_provider.py

  • import sanitize_model_params.
  • remove inline logic (the if forced_temp block).
  • remove hardcoded 4096 fallback.
  • delegate payload construction to parameters.py.
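The call sites then shrink to a single delegation, roughly like this (assuming an OpenAI-style async client held on self._client; stream() would follow the same pattern):

```python
# src/services/llm/cloud_provider.py (sketch)
from src.services.llm.parameters import sanitize_model_params

class CloudProvider:
    async def complete(self, model: str, messages: list[dict], **kwargs):
        # All per-model parameter rules live in parameters.py now.
        payload = sanitize_model_params(self.binding, model, kwargs)
        return await self._client.chat.completions.create(
            model=model, messages=messages, **payload
        )
```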
src/services/llm/config.py

  • ensure get_token_limit_kwargs correctly maps keys based on model version (legacy max_tokens vs new max_completion_tokens).
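For example (the model-name check is illustrative; the real mapping may key off the binding or an API version instead):

```python
# src/services/llm/config.py (sketch)
def get_token_limit_kwargs(model: str, limit: int) -> dict:
    # o1-family reasoning models accept only max_completion_tokens;
    # legacy chat models still expect max_tokens.
    if model.startswith("o1"):
        return {"max_completion_tokens": limit}
    return {"max_tokens": limit}
```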

testing

  • drop the complex AsyncMock setup for this feature.
  • add tests/services/llm/test_parameters.py.
  • write pure unit tests (table-driven) to verify the logic in isolation.
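Table-driven here meaning something like the following; the expected payloads track the sketches above, not the project's actual constraint tables:

```python
# tests/services/llm/test_parameters.py (sketch)
import pytest
from src.services.llm.parameters import DEFAULT_MAX_TOKENS, sanitize_model_params

CASES = [
    # (model, user kwargs, expected payload)
    ("o1", {"temperature": 0.2},
     {"temperature": 1.0, "max_completion_tokens": DEFAULT_MAX_TOKENS}),
    ("gpt-4o", {"temperature": 0.7, "max_tokens": 256},
     {"temperature": 0.7, "max_tokens": 256}),
]

@pytest.mark.parametrize("model,kwargs,expected", CASES)
def test_sanitize_model_params(model, kwargs, expected):
    # Pure function: no provider client or AsyncMock required.
    assert sanitize_model_params("openai", model, kwargs) == expected
```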

I'm in the midst of refactoring the llm section, so this would make my life much easier!! <3 the added tests are good practice :)

Per PR HKUDS#143 review feedback:

- Create src/services/llm/parameters.py with:
  - MODEL_CONSTRAINTS dict for strict rules (temperature_strategy)
  - get_model_constraint() for enforcing limits
  - sanitize_model_params() centralizing all parameter transformation
  - DEFAULT_MAX_TOKENS constant (no magic numbers)

- Update src/services/llm/capabilities.py:
  - Remove speculative entries (gpt-5, o3)
  - Remove forced_temperature (now in MODEL_CONSTRAINTS)

- Refactor src/services/llm/cloud_provider.py:
  - Delegate to sanitize_model_params, remove inline logic

- Add tests/services/llm/test_parameters.py:
  - 52 table-driven unit tests for isolated logic testing
@pancacake
Collaborator

Thanks for your contribution! Sadly, there are some conflicts with our current repo. We've fixed it in the newest update; please refer to 51902b1.

@pancacake closed this Jan 19, 2026