From 502ddd4430bb888e422f0aa7100d5416ae8a8f46 Mon Sep 17 00:00:00 2001 From: Bihan Rana Date: Mon, 24 Nov 2025 18:08:16 +0545 Subject: [PATCH 1/2] Add dstack install method in docs Updated the dstack section Fix trailing whitespaces Fix missing backslash --- docs/get_started/install.md | 78 ++++++++++++++++++++++++++++++++++++- 1 file changed, 77 insertions(+), 1 deletion(-) diff --git a/docs/get_started/install.md b/docs/get_started/install.md index 0184c60b008..38c5108e951 100644 --- a/docs/get_started/install.md +++ b/docs/get_started/install.md @@ -128,7 +128,83 @@ sky status --endpoint 30000 sglang -## Method 7: Run on AWS SageMaker +## Method 7: Using dstack + +
+More + +[dstack](https://github.com/dstackai/dstack) simplifies GPU provisioning and workload orchestration across clouds, Kubernetes, and on-prem systems. + +Deploying SGLang as a secure, auto-scalable endpoint is straightforward: + +1. Install dstack: see [dstack's documentation](https://dstack.ai/docs/installation/) +2. Create a dstack [service](https://dstack.ai/docs/concepts/services/): + +
+Service configuration: service.yaml + +```yaml +type: service +name: qwen + +image: lmsysorg/sglang:latest +env: + - MODEL_ID=qwen/qwen2.5-0.5b-instruct +commands: + - | + python3 -m sglang.launch_server \ + --model-path $MODEL_ID \ + --port 8000 \ + --trust-remote-code +port: 8000 +model: qwen/qwen2.5-0.5b-instruct + +resources: + gpu: 8GB..24GB:1 +``` +
+ +Apply the configuration: + +```bash +HF_TOKEN= dstack apply -f service.yaml +``` + +3. If you want to enable auto-scaling, cache-aware routing, HTTPS, or bring your own custom domain, +create a [gateway](https://dstack.ai/docs/concepts/gateways/): + +
+Gateway configuration: gateway.yaml + +```yaml +type: gateway +name: sglang-gateway + +backend: aws +region: eu-west-1 + +# Specify your domain +domain: example.com + +router: + # (Optional) Enable cache-aware routing + type: sglang + policy: cache_aware +``` +
+ +Apply the gateway configuration. + +```bash +dstack apply -f gateway.yaml +``` + +Once the gateway is assigned a hostname, go to your domain's DNS settings and add a DNS record for `*.`. + +See the [SGLang example](https://dstack.ai/examples/inference/sglang/) for more details. +
+ +## Method 8: Run on AWS SageMaker
More From 84e678b78c1b2a18305fb34e4e8d932b0d02ec36 Mon Sep 17 00:00:00 2001 From: Bihan Rana Date: Tue, 25 Nov 2025 18:47:04 +0545 Subject: [PATCH 2/2] Add HF_TOKEN --- docs/get_started/install.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/get_started/install.md b/docs/get_started/install.md index 38c5108e951..480c19774ef 100644 --- a/docs/get_started/install.md +++ b/docs/get_started/install.md @@ -149,6 +149,7 @@ name: qwen image: lmsysorg/sglang:latest env: + - HF_TOKEN - MODEL_ID=qwen/qwen2.5-0.5b-instruct commands: - |