diff --git a/docs/docs/concepts/dev-environments.md b/docs/docs/concepts/dev-environments.md index 4d46e73ac4..bda3406a61 100644 --- a/docs/docs/concepts/dev-environments.md +++ b/docs/docs/concepts/dev-environments.md @@ -301,11 +301,11 @@ If you don't assign a value to an environment variable (see `HF_TOKEN` above), ### Working directory -If `working_dir` is not specified, it defaults to `/workflow`. +If `working_dir` is not specified, it defaults to the working directory set in the Docker image. For example, the [default image](#default-image) uses `/dstack/run` as its working directory. -The `working_dir` must be an absolute path. The tilde (`~`) is supported (e.g., `~/my-working-dir`). +If the Docker image does not have a working directory set, `dstack` uses `/` as the `working_dir`. - +The `working_dir` must be an absolute path. The tilde (`~`) is supported (e.g., `~/my-working-dir`). @@ -320,7 +320,7 @@ type: dev-environment name: vscode files: - - .:examples # Maps the directory where `.dstack.yml` to `/workflow/examples` + - .:examples # Maps the directory with `.dstack.yml` to `/examples` - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa` ide: vscode @@ -329,7 +329,7 @@ ide: vscode If the local path is relative, it’s resolved relative to the configuration file. -If the container path is relative, it’s resolved relative to `/workflow`. +If the container path is relative, it’s resolved relative to the [working directory](#working-directory). The container path is optional. 
If not specified, it will be automatically calculated: @@ -340,7 +340,7 @@ type: dev-environment name: vscode files: - - ../examples # Maps `examples` (the parent directory of `.dstack.yml`) to `/workflow/examples` + - ../examples # Maps the parent directory of `.dstack.yml` to `/../examples` - ~/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa` ide: vscode @@ -355,9 +355,9 @@ ide: vscode ### Repos -Sometimes, you may want to mount an entire Git repo inside the container. +Sometimes, you may want to clone an entire Git repo inside the container. -Imagine you have a cloned Git repo containing an `examples` subdirectory with a `.dstack.yml` file: +Imagine you have a Git repo (cloned locally) containing an `examples` subdirectory with a `.dstack.yml` file:
@@ -366,8 +366,7 @@ type: dev-environment name: vscode repos: - # Mounts the parent directory of `examples` (must be a Git repo) - # to `/workflow` (the default working directory) + # Clones the repo from the parent directory (`examples/..`) to `` - .. ide: vscode @@ -375,15 +374,13 @@ ide: vscode
-When you run it, `dstack` fetches the repo on the instance, applies your local changes, and mounts it—so the container matches your local repo. +When you run it, `dstack` clones the repo on the instance, applies your local changes, and mounts it—so the container matches your local repo. The local path can be either relative to the configuration file or absolute. ??? info "Repo directory" - By default, `dstack` mounts the repo to `/workflow` (the default working directory). + By default, `dstack` clones the repo to the [working directory](#working-directory). - - You can override the repo directory using either a relative or an absolute path:
@@ -393,8 +390,7 @@ The local path can be either relative to the configuration file or absolute. name: vscode repos: - # Mounts the parent directory of `examples` (must be a Git repo) - # to `/my-repo` + # Clones the repo in the parent directory (`examples/..`) to `/my-repo` - ..:/my-repo ide: vscode @@ -402,7 +398,22 @@ The local path can be either relative to the configuration file or absolute.
- If the path is relative, it is resolved against [working directory](#working-directory). + > If the repo directory is relative, it is resolved against [working directory](#working-directory). + + If the repo directory is not empty, the run will fail with a runner error. + To override this behavior, you can set `if_exists` to `skip`: + + ```yaml + type: dev-environment + name: vscode + + repos: + - local_path: .. + path: /my-repo + if_exists: skip + + ide: vscode + ``` ??? info "Repo size" @@ -411,7 +422,7 @@ The local path can be either relative to the configuration file or absolute. You can increase the 2MB limit by setting the `DSTACK_SERVER_CODE_UPLOAD_LIMIT` environment variable. ??? info "Repo URL" - Sometimes you may want to mount a Git repo without cloning it locally. In this case, simply provide a URL in `repos`: + Sometimes you may want to clone a Git repo within the container without cloning it locally. In this case, simply provide a URL in `repos`:
@@ -420,7 +431,7 @@ The local path can be either relative to the configuration file or absolute. name: vscode repos: - # Clone the specified repo to `/workflow` (the default working directory) + # Clone the repo to `` - https://github.com/dstackai/dstack ide: vscode @@ -432,9 +443,9 @@ The local path can be either relative to the configuration file or absolute. If a Git repo is private, `dstack` will automatically try to use your default Git credentials (from `~/.ssh/config` or `~/.config/gh/hosts.yml`). - If you want to use custom credentials, you can provide them with [`dstack init`](../reference/cli/dstack/init.md). + > If you want to use custom credentials, make sure to pass them via [`dstack init`](../reference/cli/dstack/init.md) before submitting a run. -> Currently, you can configure up to one repo per run configuration. +Currently, you can configure up to one repo per run configuration. ### Retry policy diff --git a/docs/docs/concepts/services.md b/docs/docs/concepts/services.md index 24a0187de8..745f78e3f0 100644 --- a/docs/docs/concepts/services.md +++ b/docs/docs/concepts/services.md @@ -597,15 +597,12 @@ resources: ### Working directory -If `working_dir` is not specified, it defaults to `/workflow`. +If `working_dir` is not specified, it defaults to the working directory set in the Docker image. For example, the [default image](#default-image) uses `/dstack/run` as its working directory. -!!! info "No commands" - If you’re using a custom `image` without `commands`, then `working_dir` is taken from `image`. +If the Docker image does not have a working directory set, `dstack` uses `/` as the `working_dir`. The `working_dir` must be an absolute path. The tilde (`~`) is supported (e.g., `~/my-working-dir`). 
- - ### Files @@ -621,7 +618,7 @@ type: service name: llama-2-7b-service files: - - .:examples # Maps the directory where `.dstack.yml` to `/workflow/examples` + - .:examples # Maps the directory with `.dstack.yml` to `/examples` - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa` python: 3.12 @@ -640,11 +637,10 @@ resources:
-Each entry maps a local directory or file to a path inside the container. Both local and container paths can be relative or absolute. - -If the local path is relative, it’s resolved relative to the configuration file. If the container path is relative, it’s resolved relative to `/workflow`. +If the local path is relative, it’s resolved relative to the configuration file. +If the container path is relative, it’s resolved relative to the [working directory](#working-directory). -The container path is optional. If not specified, it will be automatically calculated. +The container path is optional. If not specified, it will be automatically calculated: @@ -655,7 +651,7 @@ type: service name: llama-2-7b-service files: - - ../examples # Maps `examples` (the parent directory of `.dstack.yml`) to `/workflow/examples` + - ../examples # Maps the parent directory of `.dstack.yml` to `/../examples` - ~/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa` python: 3.12 @@ -681,9 +677,9 @@ resources: ### Repos -Sometimes, you may want to mount an entire Git repo inside the container. +Sometimes, you may want to clone an entire Git repo inside the container. -Imagine you have a cloned Git repo containing an `examples` subdirectory with a `.dstack.yml` file: +Imagine you have a Git repo (cloned locally) containing an `examples` subdirectory with a `.dstack.yml` file: @@ -694,8 +690,7 @@ type: service name: llama-2-7b-service repos: - # Mounts the parent directory of `examples` (must be a Git repo) - # to `/workflow` (the default working directory) + # Clones the repo from the parent directory (`examples/..`) to `` - .. python: 3.12 @@ -714,12 +709,12 @@ resources: -When you run it, `dstack` fetches the repo on the instance, applies your local changes, and mounts it—so the container matches your local repo. +When you run it, `dstack` clones the repo on the instance, applies your local changes, and mounts it—so the container matches your local repo. 
The local path can be either relative to the configuration file or absolute. ??? info "Repo directory" - By default, `dstack` mounts the repo to `/workflow` (the default working directory). + By default, `dstack` clones the repo to the [working directory](#working-directory). @@ -732,8 +727,7 @@ The local path can be either relative to the configuration file or absolute. name: llama-2-7b-service repos: - # Mounts the parent directory of `examples` (must be a Git repo) - # to `/my-repo` + # Clones the repo in the parent directory (`examples/..`) to `/my-repo` - ..:/my-repo python: 3.12 @@ -752,7 +746,33 @@ The local path can be either relative to the configuration file or absolute. - If the path is relative, it is resolved against `working_dir`. + > If the repo directory is relative, it is resolved against [working directory](#working-directory). + + If the repo directory is not empty, the run will fail with a runner error. + To override this behavior, you can set `if_exists` to `skip`: + + ```yaml + type: service + name: llama-2-7b-service + + repos: + - local_path: .. + path: /my-repo + if_exists: skip + + python: 3.12 + + env: + - HF_TOKEN + - MODEL=NousResearch/Llama-2-7b-chat-hf + commands: + - uv pip install vllm + - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000 + port: 8000 + + resources: + gpu: 24GB + ``` ??? info "Repo size" The repo size is not limited. However, local changes are limited to 2MB. @@ -760,8 +780,7 @@ The local path can be either relative to the configuration file or absolute. You can increase the 2MB limit by setting the `DSTACK_SERVER_CODE_UPLOAD_LIMIT` environment variable. ??? info "Repo URL" - - Sometimes you may want to mount a Git repo without cloning it locally. In this case, simply provide a URL in `repos`: + Sometimes you may want to clone a Git repo within the container without cloning it locally. 
In this case, simply provide a URL in `repos`: @@ -772,7 +791,7 @@ The local path can be either relative to the configuration file or absolute. name: llama-2-7b-service repos: - # Clone the specified repo to `/workflow` (the default working directory) + # Clone the repo to `` - https://github.com/dstackai/dstack python: 3.12 @@ -795,9 +814,9 @@ The local path can be either relative to the configuration file or absolute. If a Git repo is private, `dstack` will automatically try to use your default Git credentials (from `~/.ssh/config` or `~/.config/gh/hosts.yml`). - If you want to use custom credentials, you can provide them with [`dstack init`](../reference/cli/dstack/init.md). + > If you want to use custom credentials, you can provide them with [`dstack init`](../reference/cli/dstack/init.md). -> Currently, you can configure up to one repo per run configuration. +Currently, you can configure up to one repo per run configuration. ### Retry policy diff --git a/docs/docs/concepts/tasks.md b/docs/docs/concepts/tasks.md index d6e2fb5591..0d516af780 100644 --- a/docs/docs/concepts/tasks.md +++ b/docs/docs/concepts/tasks.md @@ -32,7 +32,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -199,7 +199,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -276,7 +276,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -417,7 +417,7 @@ resources: ```yaml type: task -name: trl-sft +name: trl-sft python: 3.12 @@ -431,7 +431,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL 
--dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -463,15 +463,12 @@ If you don't assign a value to an environment variable (see `HF_TOKEN` above), ### Working directory -If `working_dir` is not specified, it defaults to `/workflow`. +If `working_dir` is not specified, it defaults to the working directory set in the Docker image. For example, the [default image](#default-image) uses `/dstack/run` as its working directory. -!!! info "No commands" - If you’re using a custom `image` without `commands`, then `working_dir` is taken from `image`. +If the Docker image does not have a working directory set, `dstack` uses `/` as the `working_dir`. The `working_dir` must be an absolute path. The tilde (`~`) is supported (e.g., `~/my-working-dir`). - - ### Files @@ -485,7 +482,7 @@ type: task name: trl-sft files: - - .:examples # Maps the directory where `.dstack.yml` to `/workflow/examples` + - .:examples # Maps the directory with `.dstack.yml` to `/examples` - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rs python: 3.12 @@ -500,7 +497,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -509,11 +506,10 @@ resources: -Each entry maps a local directory or file to a path inside the container. Both local and container paths can be relative or absolute. +If the local path is relative, it’s resolved relative to the configuration file. +If the container path is relative, it’s resolved relative to the [working directory](#working-directory). -If the local path is relative, it’s resolved relative to the configuration file. If the container path is relative, it’s resolved relative to `/workflow`. - -The container path is optional. If not specified, it will be automatically calculated. +The container path is optional. 
If not specified, it will be automatically calculated: @@ -521,11 +517,11 @@ The container path is optional. If not specified, it will be automatically calcu ```yaml type: task -name: trl-sft +name: trl-sft files: - - ../examples # Maps `examples` (the parent directory of `.dstack.yml`) to `/workflow/examples` - - ~/.cache/huggingface/token # Maps `~/.cache/huggingface/token` to `/root/~/.cache/huggingface/token` + - ../examples # Maps the parent directory of `.dstack.yml` to `/../examples` + - ~/.cache/huggingface/token # Maps `~/.cache/huggingface/token` to `/root/.cache/huggingface/token` python: 3.12 @@ -539,7 +535,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -555,9 +551,9 @@ resources: ### Repos -Sometimes, you may want to mount an entire Git repo inside the container. +Sometimes, you may want to clone an entire Git repo inside the container. -Imagine you have a cloned Git repo containing an `examples` subdirectory with a `.dstack.yml` file: +Imagine you have a Git repo (cloned locally) containing an `examples` subdirectory with a `.dstack.yml` file: @@ -565,11 +561,10 @@ Imagine you have a cloned Git repo containing an `examples` subdirectory with a ```yaml type: task -name: trl-sft +name: trl-sft repos: - # Mounts the parent directory of `examples` (must be a Git repo) - # to `/workflow` (the default working directory) + # Clones the repo from the parent directory (`examples/..`) to `` - .. 
python: 3.12 @@ -584,7 +579,7 @@ commands: - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -593,26 +588,23 @@ resources: -When you run it, `dstack` fetches the repo on the instance, applies your local changes, and mounts it—so the container matches your local repo. +When you run it, `dstack` clones the repo on the instance, applies your local changes, and mounts it—so the container matches your local repo. The local path can be either relative to the configuration file or absolute. ??? info "Repo directory" - By default, `dstack` mounts the repo to `/workflow` (the default working directory). + By default, `dstack` clones the repo to the [working directory](#working-directory). - - You can override the repo directory using either a relative or an absolute path:
```yaml type: task - name: trl-sft + name: trl-sft repos: - # Mounts the parent directory of `examples` (must be a Git repo) - # to `/my-repo` + # Clones the repo in the parent directory (`examples/..`) to `/my-repo` - ..:/my-repo python: 3.12 @@ -627,7 +619,7 @@ The local path can be either relative to the configuration file or absolute. - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -636,7 +628,38 @@ The local path can be either relative to the configuration file or absolute.
- If the path is relative, it is resolved against [working directory](#working-directory). + > If the repo directory is relative, it is resolved against [working directory](#working-directory). + + If the repo directory is not empty, the run will fail with a runner error. + To override this behavior, you can set `if_exists` to `skip`: + + ```yaml + type: task + name: trl-sft + + repos: + - local_path: .. + path: /my-repo + if_exists: skip + + python: 3.12 + + env: + - HF_TOKEN + - HF_HUB_ENABLE_HF_TRANSFER=1 + - MODEL=Qwen/Qwen2.5-0.5B + - DATASET=stanfordnlp/imdb + + commands: + - uv pip install trl + - | + trl sft \ + --model_name_or_path $MODEL --dataset_name $DATASET \ + --num_processes $DSTACK_GPUS_PER_NODE + + resources: + gpu: H100:1 + ``` ??? info "Repo size" The repo size is not limited. However, local changes are limited to 2MB. @@ -644,7 +667,7 @@ The local path can be either relative to the configuration file or absolute. You can increase the 2MB limit by setting the `DSTACK_SERVER_CODE_UPLOAD_LIMIT` environment variable. ??? info "Repo URL" - Sometimes you may want to mount a Git repo without cloning it locally. In this case, simply provide a URL in `repos`: + Sometimes you may want to clone a Git repo within the container without cloning it locally. In this case, simply provide a URL in `repos`: @@ -655,7 +678,7 @@ The local path can be either relative to the configuration file or absolute. name: trl-sft repos: - # Clone the specified repo to `/workflow` (the default working directory) + # Clone the repo to `` - https://github.com/dstackai/dstack python: 3.12 @@ -670,7 +693,7 @@ The local path can be either relative to the configuration file or absolute. - uv pip install trl - | trl sft \ - --model_name_or_path $MODEL --dataset_name $DATASET + --model_name_or_path $MODEL --dataset_name $DATASET \ --num_processes $DSTACK_GPUS_PER_NODE resources: @@ -683,9 +706,9 @@ The local path can be either relative to the configuration file or absolute. 
If a Git repo is private, `dstack` will automatically try to use your default Git credentials (from `~/.ssh/config` or `~/.config/gh/hosts.yml`). - If you want to use custom credentials, you can provide them with [`dstack init`](../reference/cli/dstack/init.md). + > If you want to use custom credentials, you can provide them with [`dstack init`](../reference/cli/dstack/init.md). -> Currently, you can configure up to one repo per run configuration. +Currently, you can configure up to one repo per run configuration. ### Retry policy