From 7bf73600d00a007ddc2d86de60bdb96195974b24 Mon Sep 17 00:00:00 2001
From: protonicage
Date: Fri, 19 Dec 2025 16:04:21 +0100
Subject: [PATCH] Update README.md

Add TensorRT-LLM hint
---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 93e97b54..716640fc 100644
--- a/README.md
+++ b/README.md
@@ -1522,6 +1522,8 @@ to load the model after the server has been started.
 The model loading API is currently not supported during the
 `auto_complete_config` and `finalize` functions.
 
+The model loading API applies only to repository-managed models.
+TensorRT-LLM models must be launched via the TensorRT-LLM launcher and cannot be instantiated via `pb_utils.load_model(files=...)`.
 
 ## Using BLS with Stateful Models
 
 [Stateful models](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#stateful-models)