diff --git a/concepts/deployment.mdx b/concepts/deployment.mdx
index 66ca7a9..9a23eb3 100644
--- a/concepts/deployment.mdx
+++ b/concepts/deployment.mdx
@@ -125,6 +125,32 @@ models:
       max_new_tokens: 250
 ```
 
+### Specifying CPU Type
+
+You can now specify the CPU type to be used during deployment by adding a `cpu_type` field to your YAML, or directly using the `--cpu` flag in the CLI:
+
+**YAML field:**
+```yaml
+deployment: !Deployment
+  destination: aws
+  endpoint_name: cpu-type-demo
+  instance_count: 1
+  instance_type: ml.m5.xlarge
+  cpu_type: intel
+
+models:
+  - !Model
+    id: google-bert/bert-base-uncased
+    source: huggingface
+```
+
+**CLI flag:**
+```sh
+magemaker --deploy your-model.yaml --cpu amd
+```
+
+> **Note:** The `--cpu` flag and corresponding YAML field let you control the CPU architecture (e.g., intel, amd, arm64) for your deployment on supported clouds/instance types.
+
 ## Cloud-Specific Instance Types
 
 ### AWS SageMaker Types
@@ -205,7 +231,6 @@ Choose your instance type based on your model's requirements:
 
 Make sure you setup budget monitory and alerts to avoid unexpected charges.
 
-
 ## Troubleshooting Deployments
 
 Common issues and their solutions:
diff --git a/quick-start.mdx b/quick-start.mdx
index 5853ef8..00b34d6 100644
--- a/quick-start.mdx
+++ b/quick-start.mdx
@@ -21,6 +21,15 @@ Supported providers:
 - `--cloud azure` Azure Machine Learning deployment
 - `--cloud all` Configure all three providers at the same time
 
+You can also specify compute type with `--instance` and now **`--cpu`** for CPU selection:
+
+```sh
+magemaker --cloud aws --instance ml.m5.xlarge --cpu intel
+```
+
+- `--instance` Selects the instance type (e.g., `ml.m5.xlarge`, `g2-standard-12`, `Standard_DS3_v2`)
+- `--cpu` Specifies CPU type (e.g., `intel`, `amd`) **(New!)**
+
 
 ### List Models
 
@@ -60,6 +69,7 @@ deployment: !Deployment
   instance_type: ml.m5.xlarge
   num_gpus: null
   quantization: null
+  cpu: null # optionally specify cpu: intel or amd (new)
 models:
 - !Model
   id: facebook/opt-125m
@@ -81,6 +91,7 @@ deployment: !Deployment
   accelerator_type: NVIDIA_L4
   num_gpus: null
   quantization: null
+  cpu: null # optionally specify cpu: intel or amd (new)
 
 models:
 - !Model
@@ -100,6 +111,7 @@ deployment: !Deployment
   endpoint_name: facebook-opt-test
   instance_count: 1
   instance_type: Standard_DS3_v2
+  cpu: null # optionally specify cpu: intel or amd (new)
 models:
 - !Model
   id: facebook--opt-125m
@@ -184,3 +196,4 @@ You can reach us, faizan & jneid, at [support@slashml.com](mailto:support@slashm
 If anything doesn't make sense or you have suggestions, do point them out at [magemaker.featurebase.app](https://magemaker.featurebase.app/).
 
 We'd love to hear from you! We're excited to learn how we can make this more valuable for the community and welcome any and all feedback and suggestions.
+