2 changes: 2 additions & 0 deletions README.md
@@ -1522,6 +1522,8 @@ to load the model after the server has been started. The model loading API is
currently not supported during the `auto_complete_config` and `finalize`
functions.

The model loading API applies only to repository-managed backends.
TensorRT-LLM models must be launched through the TensorRT-LLM launcher and
cannot be instantiated with `pb_utils.load_model(files=...)`.
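A minimal sketch of how a BLS model might use the model loading API at request
time (not during `auto_complete_config` or `finalize`). The exact signature of
`pb_utils.load_model` is assumed from the surrounding docs, and
`ensure_model_loaded` is a hypothetical helper, not part of the Triton API:

```python
# Hedged sketch: conditionally loading a repository-managed sibling model
# from inside a BLS model's execute() path.
def ensure_model_loaded(model_name):
    # Imported lazily: triton_python_backend_utils is only available
    # inside a model running under Triton's Python backend.
    import triton_python_backend_utils as pb_utils

    # Skip the load if the model is already being served.
    if not pb_utils.is_model_ready(model_name):
        # Applies only to repository-managed backends; TensorRT-LLM
        # models cannot be loaded this way.
        pb_utils.load_model(model_name)
```

A BLS `execute()` implementation could call `ensure_model_loaded("preprocessor")`
before issuing an inference request to that model.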
## Using BLS with Stateful Models

[Stateful models](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#stateful-models)