diff --git a/fine-tune.mdx b/fine-tune.mdx index 50045335..4bff8bb0 100644 --- a/fine-tune.mdx +++ b/fine-tune.mdx @@ -88,9 +88,17 @@ For a list of working configuration examples, check out the [Axolotl examples re Your training environment is located in the `/workspace/fine-tuning/` directory and has the following structure: -* `examples/`: Sample configurations and scripts. -* `outputs/`: Where your training results and model outputs will be saved. -* `config.yaml`: The main configuration file for your training parameters. + + + + + + + + + + +`/examples/` contains sample configurations and scripts, `/outputs/` contains your training results and model outputs, and `/config.yaml/` is the main configuration file for your training parameters. The system generates an initial `config.yaml` based on your selected base model and dataset. This is where you define all the hyperparameters for your fine-tuning job. You may need to experiment with these settings to achieve the best results. diff --git a/get-started.mdx b/get-started.mdx index 7fa03c03..0c2f6054 100644 --- a/get-started.mdx +++ b/get-started.mdx @@ -4,7 +4,9 @@ sidebarTitle: "Quickstart" description: "Run code on a remote GPU in minutes." --- -Follow this guide to learn how to create an account, deploy your first GPU Pod, and use it to execute code remotely. +import { PodTooltip, NetworkVolumeTooltip, TemplateTooltip } from "/snippets/tooltips.jsx"; + +Follow this guide to learn how to create an account, deploy your first GPU , and use it to execute code remotely. ## Step 1: Create an account @@ -46,7 +48,7 @@ Take a minute to explore the other tabs: - **Details**: Information about your Pod, such as hardware specs, pricing, and storage. - **Telemetry**: Realtime utilization metrics for your Pod's CPU, memory, and storage. - **Logs**: Logs streamed from your container (including stdout from any applications inside) and the Pod management system. -- **Template Readme**: Details about the template your Pod is running. Your Pod is configured with the latest official Runpod PyTorch template. +- **Template Readme**: Details about the your Pod is running. Your Pod is configured with the latest official Runpod PyTorch template. ## Step 4: Execute code on your Pod with JupyterLab @@ -55,7 +57,9 @@ Take a minute to explore the other tabs: 3. Type `print("Hello, world!")` in the first line of the notebook. 4. Click the play button to run your code. -And that's it—congrats! You just ran your first line of code on Runpod. + +Congratulations! You just ran your first line of code on Runpod. + ## Step 5: Clean up @@ -74,7 +78,7 @@ To terminate your Pod: -Terminating a Pod permanently deletes all data that isn't stored in a [network volume](/storage/network-volumes). Be sure that you've saved any data you might need to access again. +Terminating a Pod permanently deletes all data that isn't stored in a . Be sure that you've saved any data you might need to access again. To learn more about how storage works, see the [Pod storage overview](/pods/storage/types). diff --git a/get-started/api-keys.mdx b/get-started/api-keys.mdx index 4d2c39ee..8cf82ae5 100644 --- a/get-started/api-keys.mdx +++ b/get-started/api-keys.mdx @@ -3,6 +3,8 @@ title: "Manage API keys" description: "Learn how to create, edit, and disable Runpod API keys." --- +import { ServerlessTooltip } from "/snippets/tooltips.jsx"; + Legacy API keys generated before November 11, 2024 have either Read/Write or Read Only access to GraphQL based on what was set for that key. 
All legacy keys have full access to AI API. To improve security, generate a new key with **Restricted** permission and select the minimum permission needed for your use case. @@ -20,7 +22,7 @@ Follow these steps to create a new Runpod API key: 3. Give your key a name and set its permissions (**All**, **Restricted**, or **Read Only**). If you choose **Restricted**, you can customize access for each Runpod API: * **None**: No access - * **Restricted**: Customize access for each of your [Serverless endpoints](/serverless/overview). (Default: None.) + * **Restricted**: Customize access for each of your endpoints. (Default: None.) * **Read/Write**: Full access to your endpoints. * **Read Only**: Read access without write access. diff --git a/get-started/concepts.mdx b/get-started/concepts.mdx index a498d051..0e81da43 100644 --- a/get-started/concepts.mdx +++ b/get-started/concepts.mdx @@ -3,6 +3,8 @@ title: "Concepts" description: "Key concepts and terminology for understanding Runpod's platform and products." --- +import { PodsTooltip, ServerlessTooltip } from "/snippets/tooltips.jsx"; + ## [Runpod console](https://console.runpod.io) The web interface for managing your compute resources, account, teams, and billing. @@ -25,7 +27,7 @@ A managed compute cluster with high-speed networking for multi-node distributed ## [Network volume](/storage/network-volumes) -Persistent storage that exists independently of your other compute resources and can be attached to multiple Pods or Serverless endpoints to share data between machines. +Persistent storage that exists independently of your other compute resources and can be attached to multiple or endpoints to share data between machines. ## [S3-compatible API](/storage/s3-api) diff --git a/get-started/connect-to-runpod.mdx b/get-started/connect-to-runpod.mdx index a188cdc0..aa292281 100644 --- a/get-started/connect-to-runpod.mdx +++ b/get-started/connect-to-runpod.mdx @@ -3,11 +3,13 @@ title: "Choose a workflow" description: "Review the available methods for accessing and managing Runpod resources." --- +import { PodsTooltip, EndpointTooltip, ServerlessTooltip } from "/snippets/tooltips.jsx"; + Runpod offers multiple ways to access and manage your compute resources. Choose the method that best fits your workflow: ## Runpod console -The Runpod console provides an intuitive web interface to manage Pods and endpoints, access Pod terminals, send endpoint requests, monitor resource usage, and view billing and usage history. +The Runpod console provides an intuitive web interface to manage and s, access Pod terminals, send endpoint requests, monitor resource usage, and view billing and usage history. [Launch the Runpod console →](https://www.console.runpod.io) @@ -19,7 +21,7 @@ You can connect directly to your running Pods and execute code on them using a v ## REST API -The Runpod REST API allows you to programmatically manage and control compute resources. Use the API to manage Pod lifecycles and Serverless endpoints, monitor resource utilization, and integrate Runpod into your applications. +The Runpod REST API allows you to programmatically manage and control compute resources. Use the API to manage Pod lifecycles and endpoints, monitor resource utilization, and integrate Runpod into your applications. 
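For illustration only, here is a minimal sketch of calling the REST API from the command line. The base URL, route, and `RUNPOD_API_KEY` variable are assumptions — confirm the exact paths in the API reference linked below.

```bash
# List your Pods (illustrative; verify the route and response shape in the API reference).
# Assumes RUNPOD_API_KEY holds an API key with read access to Pods.
curl -s "https://rest.runpod.io/v1/pods" \
  -H "Authorization: Bearer $RUNPOD_API_KEY"
```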
[Explore the API reference →](/api-reference/docs/GET/openapi-json) diff --git a/get-started/manage-accounts.mdx b/get-started/manage-accounts.mdx index 6e89f0b0..cf876ffa 100644 --- a/get-started/manage-accounts.mdx +++ b/get-started/manage-accounts.mdx @@ -3,13 +3,15 @@ title: "Manage accounts" description: "Create accounts, manage teams, and configure user permissions in Runpod." --- +import { PodsTooltip, ServerlessTooltip, InstantClusterTooltip, NetworkVolumeTooltip } from "/snippets/tooltips.jsx"; + To access Runpod resources, you need to either create your own account or join an existing team through an invitation. This guide explains how to set up and manage accounts, teams, and user roles. ## Create an account Sign up for a Runpod account at [console.runpod.io/signup](https://www.console.runpod.io/signup). -Once created, you can use your account to deploy Pods, create Serverless endpoints, and access other Runpod services. Personal accounts can be converted to team accounts at any time to enable collaboration features. +Once created, you can use your account to deploy , create endpoints, and access other Runpod services. Personal accounts can be converted to team accounts at any time to enable collaboration features. ## Convert to a team account diff --git a/get-started/products.mdx b/get-started/products.mdx index f2f72d8c..29b78f0c 100644 --- a/get-started/products.mdx +++ b/get-started/products.mdx @@ -4,7 +4,9 @@ sidebarTitle: "Product overview" description: "Explore Runpod's major offerings and find the right solution for your workload." --- -Runpod offers cloud computing resources for AI and machine learning workloads. You can choose from instant GPUs for development, auto-scaling Serverless computing, pre-deployed AI models, or multi-node clusters for distributed training. +import { ServerlessTooltip, PodsTooltip, PublicEndpointTooltip, InstantClusterTooltip, WorkerTooltip } from "/snippets/tooltips.jsx"; + +Runpod offers cloud computing resources for AI and machine learning workloads. You can choose from instant GPUs for development, auto-scaling computing, pre-deployed AI models, or multi-node clusters for distributed training. ## [Serverless](/serverless/overview) @@ -12,15 +14,15 @@ Serverless provides pay-per-second computing with automatic scaling for producti ## [Pods](/pods/overview) -Pods give you dedicated GPU or CPU instances for containerized workloads. Pods are billed by the minute and stay available as long as you keep them running, making them perfect for development, training, and workloads that need continuous access. + give you dedicated GPU or CPU instances for containerized workloads. Pods are billed by the minute and stay available as long as you keep them running, making them perfect for development, training, and workloads that need continuous access. ## [Public Endpoints](/hub/public-endpoints) -Public Endpoints provide instant API access to pre-deployed AI models for image, video, and text generation without any setup. You only pay for what you generate, making it easy to integrate AI into your applications without managing infrastructure. +s provide instant API access to pre-deployed AI models for image, video, and text generation without any setup. You only pay for what you generate, making it easy to integrate AI into your applications without managing infrastructure. ## [Instant Clusters](/instant-clusters) -Instant Clusters deliver fully managed multi-node compute clusters for large-scale distributed workloads. 
With high-speed networking between nodes, you can run multi-node training, fine-tune large language models, and handle other tasks that require multiple GPUs working in parallel. +s deliver fully managed multi-node compute clusters for large-scale distributed workloads. With high-speed networking between nodes, you can run multi-node training, fine-tune large language models, and handle other tasks that require multiple GPUs working in parallel. ## Choosing the right option diff --git a/hub/overview.mdx b/hub/overview.mdx index 0cfb08bc..1b439e1b 100644 --- a/hub/overview.mdx +++ b/hub/overview.mdx @@ -4,7 +4,9 @@ sidebarTitle: "Overview" description: "Discover, deploy, and share preconfigured AI repos using the Runpod Hub." --- -The [Runpod Hub](https://console.runpod.io/hub) is a centralized repository that enables users to discover, share, and deploy preconfigured AI repos optimized for Runpod's [Serverless](/serverless/overview/) and [Pod](/pods/overview) infrastructure. It offers a catalog of vetted, open-source repositories that can be deployed with minimal setup, creating a collaborative ecosystem for AI developers and users. +import { ServerlessTooltip, PodTooltip, EndpointTooltip, PublicEndpointTooltip, HandlerFunctionTooltip, WorkerTooltip } from "/snippets/tooltips.jsx"; + +The [Runpod Hub](https://console.runpod.io/hub) is a centralized repository that enables users to discover, share, and deploy preconfigured AI repos optimized for Runpod's and infrastructure. It offers a catalog of vetted, open-source repositories that can be deployed with minimal setup, creating a collaborative ecosystem for AI developers and users. Whether you're a developer looking to share your work or a user seeking preconfigured solutions, the Hub makes discovering and deploying AI projects seamless and efficient. @@ -32,7 +34,7 @@ The Hub simplifies the entire lifecycle of repo sharing and deployment, from ini ## Public Endpoints -In addition to official and community-submitted repos, the Hub also offers [Public Endpoints](/hub/public-endpoints) for popular AI models. These are ready-to-use APIs that you can integrate directly into your applications without needing to manage any of the underlying infrastructure. +In addition to official and community-submitted repos, the Hub also offers s for popular AI models. These are ready-to-use APIs that you can integrate directly into your applications without needing to manage any of the underlying infrastructure. Public Endpoints provide: @@ -63,7 +65,7 @@ You can deploy a repo from the Hub in seconds, choosing between Serverless endpo 4. Click the **Deploy** button in the top-right of the repo page. You can also use the dropdown menu to deploy an older version. 5. Click **Create Endpoint** -Within minutes you'll have access to a new Serverless endpoint, ready for integration with your applications or experimentation. +Within minutes you'll have access to a new Serverless , ready for integration with your applications or experimentation. ### Deploy as a Pod @@ -96,7 +98,7 @@ Where `POD_ID` is your Pod's actual ID. ## Publish your own repo -You can [publish your own repo](/hub/publishing-guide) on the Hub by preparing your GitHub repository with a working [Serverless endpoint](/serverless/overview) implementation, comprised of a [worker handler function](/serverless/workers/handler-functions) and `Dockerfile`. 
+You can [publish your own repo](/hub/publishing-guide) on the Hub by preparing your GitHub repository with a working Serverless endpoint implementation, comprised of a and `Dockerfile`. To learn how to build your first worker, [follow this guide](/serverless/workers/custom-worker). diff --git a/instant-clusters.mdx b/instant-clusters.mdx index b3a7c9f2..8411979c 100644 --- a/instant-clusters.mdx +++ b/instant-clusters.mdx @@ -4,6 +4,8 @@ sidebarTitle: "Overview" description: "Fully managed compute clusters for multi-node training and AI inference." --- +import { DataCenterTooltip, PyTorchTooltip } from "/snippets/tooltips.jsx"; + Runpod offers custom Instant Cluster pricing plans for large scale and enterprise workloads. If you're interested in learning more, [contact our sales team](https://ecykq.share.hsforms.com/2MZdZATC3Rb62Dgci7knjbA). @@ -37,7 +39,7 @@ Instant Clusters feature high-speed local networking for efficient data movement * Most clusters include 3200 Gbps networking. * A100 clusters offer up to 1600 Gbps networking. -This fast networking enables efficient scaling of distributed training and inference workloads. Runpod ensures nodes selected for clusters are within the same data center for optimal performance. +This fast networking enables efficient scaling of distributed training and inference workloads. Runpod ensures nodes selected for clusters are within the same for optimal performance. ## Zero configuration @@ -45,7 +47,7 @@ Runpod automates cluster setup so you can focus on your workloads: * Clusters are pre-configured with static IP address management. * All necessary [environment variables](#environment-variables) for distributed training are pre-configured. -* Supports popular frameworks like PyTorch, TensorFlow, and Slurm. +* Supports popular frameworks like , TensorFlow, and Slurm. ## Get started diff --git a/instant-clusters/axolotl.mdx b/instant-clusters/axolotl.mdx index a4075e24..89f89ea5 100644 --- a/instant-clusters/axolotl.mdx +++ b/instant-clusters/axolotl.mdx @@ -86,7 +86,9 @@ After running the command on the last Pod, you should see output similar to this [2025-04-01 19:24:22,603] [INFO] [axolotl.train.save_trained_model:211] [PID:1009] [RANK:0] Training completed! Saving pre-trained model to ./outputs/lora-out. ``` -Congrats! You've successfully trained a model using Axolotl on an Instant Cluster. Your fine-tuned model has been saved to the `./outputs/lora-out` directory. You can now use this model for inference or continue training with different parameters. + +Congratulations! You've successfully trained a model using Axolotl on an Instant Cluster. Your fine-tuned model has been saved to the `./outputs/lora-out` directory. You can now use this model for inference or continue training with different parameters. + ## Step 4: Clean up diff --git a/instant-clusters/pytorch.mdx b/instant-clusters/pytorch.mdx index 3b460092..ad50b01d 100644 --- a/instant-clusters/pytorch.mdx +++ b/instant-clusters/pytorch.mdx @@ -3,7 +3,9 @@ title: "Deploy an Instant Cluster with PyTorch" sidebarTitle: "PyTorch" --- -This tutorial demonstrates how to use Instant Clusters with [PyTorch](http://pytorch.org) to run distributed workloads across multiple GPUs. By leveraging PyTorch's distributed processing capabilities and Runpod's high-speed networking infrastructure, you can significantly accelerate your training process compared to single-GPU setups. 
+import { PyTorchTooltip } from "/snippets/tooltips.jsx"; + +This tutorial demonstrates how to use Instant Clusters with to run distributed workloads across multiple GPUs. By leveraging PyTorch's distributed processing capabilities and Runpod's high-speed networking infrastructure, you can significantly accelerate your training process compared to single-GPU setups. Follow the steps below to deploy a cluster and start running distributed PyTorch workloads efficiently. diff --git a/pods/choose-a-pod.mdx b/pods/choose-a-pod.mdx index 02b8c47f..66db930f 100644 --- a/pods/choose-a-pod.mdx +++ b/pods/choose-a-pod.mdx @@ -4,6 +4,8 @@ description: "Select the right Pod by evaluating your resource requirements." sidebar_position: 3 --- +import { CUDATooltip } from "/snippets/tooltips.jsx"; + Selecting the appropriate Pod configuration is a crucial step in maximizing performance and efficiency for your specific workloads. This guide will help you understand the key factors to consider when choosing a Pod that meets your requirements. ## Understanding your workload needs @@ -28,7 +30,7 @@ There are several online tools that can help you estimate your resource requirem ### GPU selection -The GPU is the cornerstone of computational performance for many workloads. When selecting your GPU, consider the architecture that best suits your software requirements. NVIDIA GPUs with CUDA support are essential for most machine learning frameworks, while some applications might perform better on specific GPU generations. Evaluate both the raw computing power (CUDA cores, tensor cores) and the memory bandwidth to ensure optimal performance for your specific tasks. +The GPU is the cornerstone of computational performance for many workloads. When selecting your GPU, consider the architecture that best suits your software requirements. NVIDIA GPUs with support are essential for most machine learning frameworks, while some applications might perform better on specific GPU generations. Evaluate both the raw computing power (CUDA cores, tensor cores) and the memory bandwidth to ensure optimal performance for your specific tasks. For machine learning inference, a mid-range GPU might be sufficient, while training large models requires more powerful options. Check framework-specific recommendations, as PyTorch, TensorFlow, and other frameworks may perform differently across GPU types. diff --git a/pods/manage-pods.mdx b/pods/manage-pods.mdx index cf349f88..048e5420 100644 --- a/pods/manage-pods.mdx +++ b/pods/manage-pods.mdx @@ -3,6 +3,8 @@ title: "Manage Pods" description: "Create, start, stop, and terminate Pods using the Runpod console or CLI." --- +import { MachineTooltip, TemplatesTooltip } from "/snippets/tooltips.jsx"; + ## Before you begin If you want to manage Pods using the Runpod CLI, you'll need to [install Runpod CLI](/runpodctl/overview), and set your [API key](/get-started/api-keys) in the configuration. @@ -39,7 +41,7 @@ GPU configuration: **CUDA Version Compatibility** -When using templates (especially community templates like `runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04`), ensure the host machine's CUDA driver version matches or exceeds the template's requirements. +When using (especially community templates like `runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04`), ensure the CUDA version of the host matches or exceeds the template's requirements. 
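As a quick check (not part of the original page), you can confirm compatibility from a terminal inside a running Pod with `nvidia-smi`, which reports the driver version and the highest CUDA version that driver supports:

```bash
# The reported "CUDA Version" is the newest CUDA the host driver supports.
# For the template tag above (cuda12.8.1), it should read 12.8 or higher.
nvidia-smi | grep "CUDA Version"
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```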
If you encounter errors like "OCI runtime create failed" or "unsatisfied condition: cuda>=X.X", you need to filter for compatible machines: @@ -259,11 +261,11 @@ pod "wu5ekmn69oh1xr" started with $0.290 / hr ## Terminate a Pod - + Terminating a Pod permanently deletes all associated data that isn't stored in a [network volume](/storage/network-volumes). Be sure to export or download any data that you'll need to access again. - + diff --git a/pods/overview.mdx b/pods/overview.mdx index c20b0e9f..74a85039 100644 --- a/pods/overview.mdx +++ b/pods/overview.mdx @@ -3,6 +3,8 @@ title: Overview description: "Get on-demand access to powerful computing resources." --- +import { NetworkVolumeTooltip, ContainerDiskTooltip, VolumeDiskTooltip, ServerlessTooltip, RunpodHubTooltip, GlobalNetworkingTooltip, RunpodCLITooltip, TemplatesTooltip } from "/snippets/tooltips.jsx"; + @@ -28,7 +30,7 @@ Each Pod consists of these core components: ## Pod templates -[Pod templates](/pods/templates/overview) are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together. +Pod are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together. Templates eliminate the need to manually set up environments, saving time and reducing configuration errors. For example, instead of installing PyTorch, configuring JupyterLab, and setting up all dependencies yourself, you can select an official Runpod PyTorch template and have everything ready to go instantly. @@ -38,11 +40,11 @@ To learn how to create your own custom templates, see [Build a custom Pod templa Pods offer three types of storage to match different use cases: -Every Pod comes with a resizable **container disk** that houses the operating system and stores temporary files, which are cleared after the Pod stops. +Every Pod comes with a resizable that houses the operating system and stores temporary files, which are cleared after the Pod stops. -**Volume disks** provide persistent storage that is preserved throughout the Pod's lease, functioning like a dedicated hard drive. Data stored in the volume disk directory (`/workspace` by default) persists when you stop the Pod, but is erased when the Pod is deleted. +By contrast, s provide persistent storage that is preserved throughout the Pod's lease, functioning like a dedicated hard drive. Data stored in the volume disk directory (`/workspace` by default) persists when you stop the Pod, but is erased when the Pod is deleted. -Optional [network volumes](/storage/network-volumes) provide more flexible permanent storage that can be transferred between Pods, replacing the volume disk when attached. When using a Pod with network volume attached, you can safely delete your Pod without losing the data stored in your network volume directory (`/workspace` by default). +Optional s provide more flexible permanent storage that can be transferred between Pods, replacing the volume disk when attached. When using a Pod with network volume attached, you can safely delete your Pod without losing the data stored in your network volume directory (`/workspace` by default). To learn more, see [Storage options](/pods/storage/types). 
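A small illustrative sketch of how these storage rules behave at runtime (file names here are placeholders): only data under the volume disk or network volume mount survives a stop.

```bash
# Run inside a Pod: /workspace is the default volume (or network volume) mount.
echo "survives a Pod stop" > /workspace/checkpoint.txt   # persistent storage
echo "cleared when the Pod stops" > /root/scratch.txt    # container disk (temporary)
```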
@@ -53,7 +55,13 @@ You can deploy Pods in several ways: - [From a template](/pods/templates/overview): Pre-configured environments for quick setup of common workflows. - **Custom containers**: Pull from any compatible container registry such as Docker Hub, GitHub Container Registry, or Amazon ECR. - **Custom images**: Build and deploy your own container images. -- [From Serverless repos](/hub/overview#deploy-as-a-pod): Deploy any Serverless-compatible repository from the [Runpod Hub](/hub/overview) directly as a Pod, providing a cost-effective option for consistent workloads. +- [From Serverless repos](/hub/overview#deploy-as-a-pod): Deploy any -compatible repository from the directly as a Pod, providing a cost-effective option for consistent workloads. + + + +When building a container image for Runpod on a Mac (Apple Silicon), use the flag `--platform linux/amd64` to ensure your image is compatible with the platform. + + ## Connecting to your Pod @@ -119,3 +127,4 @@ Ready to get started? Explore these pages to learn more: * Configure [global networking](/pods/networking) for your applications. * [Set up Ollama on a Pod](/tutorials/pods/run-ollama) to run LLM inference with HTTP API access. * [Build Docker images with Bazel](/tutorials/pods/build-docker-images) to emulate a Docker-in-Docker workflow. + diff --git a/pods/pricing.mdx b/pods/pricing.mdx index 77bfcd78..bbfc9b65 100644 --- a/pods/pricing.mdx +++ b/pods/pricing.mdx @@ -4,6 +4,8 @@ sidebarTitle: "Pricing" description: "Explore pricing options for Pods, including on-demand, savings plans, and spot instances." --- +import { MachineTooltip } from "/snippets/tooltips.jsx"; + Runpod offers custom pricing plans for large scale and enterprise workloads. If you're interested in learning more, [contact our sales team](https://ecykq.share.hsforms.com/2MZdZATC3Rb62Dgci7knjbA). @@ -148,7 +150,7 @@ Runpod offers [three types of storage](/pods/storage/types) for Pods:: - **Disk volumes:** Persistent storage that is billed at \$0.10 per GB per month on running Pods and \$0.20 per GB per month for volume storage on stopped Pods. Billed per-second. - **Network volumes:** External storage that is billed at \$0.07 per GB per month for storage requirements below 1TB. For requirements exceeding 1TB, the rate is \$0.05 per GB per month. Billed hourly. -You are not charged for storage if the host machine is down or unavailable from the public internet. +You are not charged for storage if the host is down or unavailable from the public internet. Container and volume disk storage will be included in your Pod's displayed hourly cost during deployment. diff --git a/pods/storage/types.mdx b/pods/storage/types.mdx index ce5f97f3..218fcdcb 100644 --- a/pods/storage/types.mdx +++ b/pods/storage/types.mdx @@ -3,13 +3,15 @@ title: "Storage options" description: "Choose the right type of storage for your Pods." --- -Choosing the right type of storage is crucial for optimizing your workloads, whether you need temporary storage for active computations, persistent storage for long-term data retention, or permanent, shareable storage across multiple Pods. +import { PodsTooltip, ContainerDiskTooltip, VolumeDiskTooltip, NetworkVolumeTooltip, PodTooltip } from "/snippets/tooltips.jsx"; + +Choosing the right type of storage is crucial for optimizing your workloads, whether you need temporary storage for active computations, persistent storage for long-term data retention, or permanent, shareable storage across multiple . 
This page describes the different types of storage options available for your Pods, and when to use each in your workflow. ## Container disk -A container disk houses the operating system and provides temporary storage for a Pod. It's created when a Pod is launched and is directly tied to the Pod's lifecycle. +A container disk houses the operating system and provides temporary storage for a . It's created when a Pod is launched and is directly tied to the Pod's lifecycle. ## Volume disk @@ -19,9 +21,9 @@ The volume disk is mounted at `/workspace` by default (this will be replaced by ## Network volume -[Network volumes](/storage/network-volumes) offer persistent storage similar to the volume disk, but with the added benefit that they can be attached to multiple Pods, and that they persist independently from the Pod's lifecycle. This allows you to share and access data across multiple instances or transfer storage between machines, and retain data even after a Pod is deleted. - -When attached to a Pod, a network volume replaces the volume disk, and by default they are similarly mounted at `/workspace`. +[Network volumes](/storage/network-volumes) offer persistent storage that can be attached to multiple Pods and persists independently from the Pod's lifecycle. This allows you to share and access data across multiple instances or transfer storage between machines, and retain data even after a Pod is deleted. +` +When attached to a Pod, a network volume replaces the volume disk, and by default it is mounted at `/workspace`. diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index 4276f31e..dfbf96e9 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -5,11 +5,13 @@ description: "A step-by-step guide to extending Runpod's official templates." tag: "NEW" --- +import { PodTooltip, PodsTooltip, PyTorchTooltip, CUDATooltip, TemplateTooltip } from "/snippets/tooltips.jsx"; + You can find the complete code for this tutorial, including automated build options with GitHub Actions, in the [runpod-workers/pod-template](https://github.com/runpod-workers/pod-template) repository. -This tutorial shows how to build a custom Pod template from the ground up. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments. +This tutorial shows how to build a custom from the ground up. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments. By creating custom templates, you can package everything your project needs into a reusable Docker image. Once built, you can deploy your workload in seconds instead of reinstalling dependencies every time you start a new Pod. You can also share your template with members of your team and the wider Runpod community. @@ -57,18 +59,19 @@ touch Dockerfile requirements.txt main.py Your project structure should now look like this: -``` -my-custom-pod-template/ -├── Dockerfile -├── requirements.txt -└── main.py -``` + + + + + + + ## Step 2: Choose a base image and create your Dockerfile -Runpod offers base images with PyTorch, CUDA, and common dependencies pre-installed. 
You'll extend one of these images to build your custom template. +Runpod offers base images with , , and common dependencies pre-installed. You'll extend one of these images to build your custom template. @@ -515,7 +518,9 @@ To avoid incurring unnecessary charges, make sure to stop and then terminate you ## Next steps + Congratulations! You've built a custom Pod template and deployed it to Runpod. + You can use this as a jumping off point to build your own custom templates with your own applications, dependencies, and models. diff --git a/pods/templates/environment-variables.mdx b/pods/templates/environment-variables.mdx index d5c3f191..ec0a53bf 100644 --- a/pods/templates/environment-variables.mdx +++ b/pods/templates/environment-variables.mdx @@ -3,7 +3,9 @@ title: "Environment variables" description: "Learn how to use environment variables in Runpod Pods for configuration, security, and automation" --- -Environment variables in are key-value pairs that you can configure for your Pods. They are accessible within your containerized application and provide a flexible way to pass configuration settings, secrets, and runtime information to your application without hardcoding them into your code or container image. +import { PodTooltip, PodsTooltip } from "/snippets/tooltips.jsx"; + +Environment variables are key-value pairs that you can configure for your . They are accessible within your containerized application and provide a flexible way to pass configuration settings, secrets, and runtime information to your application without hardcoding them into your code or container image. ## What are environment variables? diff --git a/pods/templates/manage-templates.mdx b/pods/templates/manage-templates.mdx index de6e2459..d723f668 100644 --- a/pods/templates/manage-templates.mdx +++ b/pods/templates/manage-templates.mdx @@ -3,7 +3,9 @@ title: "Manage Pod templates" description: "Learn how to create, and manage custom Pod templates." --- -Creating a custom template allows you to package your specific configuration for reuse and sharing. Templates define all the necessary components to launch a Pod with your desired setup. +import { PodTooltip, PodEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx"; + +Creating a custom template allows you to package your specific configuration for reuse and sharing. Templates define all the necessary components to launch a with your desired setup. ## Template configuration options @@ -102,7 +104,7 @@ For more details, see the [API reference](/api-reference/templates/POST/template ## Using environment variables in templates -Environment variables provide a flexible way to configure your Pod's runtime behavior without modifying the container image. + provide a flexible way to configure your Pod's runtime behavior without modifying the container image. ### Defining environment variables diff --git a/pods/templates/overview.mdx b/pods/templates/overview.mdx index d95e57e9..e71f9d19 100644 --- a/pods/templates/overview.mdx +++ b/pods/templates/overview.mdx @@ -3,7 +3,9 @@ title: "Overview" description: "Streamline your Pod deployments with templates, bundling prebuilt container images with hardware specs and network settings." --- -Pod templates are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together. 
+import { PodTooltip, PodsTooltip, PodEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx"; + + templates are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together. Templates eliminate the need to manually set up environments, saving time and reducing configuration errors. For example, instead of installing PyTorch, configuring JupyterLab, and setting up all dependencies yourself, you can select a pre-configured template and have everything ready to go instantly. @@ -23,7 +25,7 @@ Pod templates contain all the necessary components to launch a fully configured - **Container image:** The Docker image with all necessary software packages and dependencies. This is where the core functionality of the template is stored, i.e., the software package and any files associated with it. - **Hardware specifications:** Container disk size, volume size, and mount paths that define the storage requirements for your Pod. - **Network settings:** Exposed ports for services like web UIs or APIs. If the image has a server associated with it, you'll want to ensure that the HTTP and TCP ports are exposed as necessary. -- **Environment variables:** Pre-configured settings specific to the template that customize the behavior of the containerized application. +- **:** Pre-configured settings specific to the template that customize the behavior of the containerized application. - **Startup commands:** Instructions that run when the Pod launches, allowing you to customize the initialization process. ## Types of templates diff --git a/pods/templates/secrets.mdx b/pods/templates/secrets.mdx index 3ee206c8..2576dae7 100644 --- a/pods/templates/secrets.mdx +++ b/pods/templates/secrets.mdx @@ -3,7 +3,9 @@ title: "Manage secrets" description: "Securely store and manage sensitive information like API keys, passwords, and tokens with Runpod secrets." --- -This guide shows how to create, view, edit, delete, and use secrets in your [Pod templates](/pods/templates/overview) to protect sensitive data and improve security. +import { PodTooltip, PodsTooltip, TemplatesTooltip } from "/snippets/tooltips.jsx"; + +This guide shows how to create, view, edit, delete, and use secrets in your to protect sensitive data and improve security. ## What are Runpod secrets diff --git a/references/billing-information.mdx b/references/billing-information.mdx index 6d7569ee..14d4402d 100644 --- a/references/billing-information.mdx +++ b/references/billing-information.mdx @@ -3,6 +3,8 @@ title: "Billing information" description: "Understand how billing works for Pods, storage, network volumes, refunds, and spending limits." --- +import { MachineTooltip } from "/snippets/tooltips.jsx"; + All billing, including per-hour compute and storage billing, is charged per minute. ## How billing works @@ -19,7 +21,7 @@ You must have at least one hour's worth of runtime in your balance to rent a Pod Storage billing varies depending on Pod state. Running Pods are charged \$0.10 per GB per month for all storage, while stopped Pods are charged \$0.20 per GB per month for volume storage. -Storage is charged per minute. You are not charged for storage if the host machine is down or unavailable from the public internet. +Storage is charged per minute. You are not charged for storage if the host is down or unavailable from the public internet. 
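As a rough worked example using the rates above (illustrative numbers): an 80 GB volume disk costs about 80 GB × \$0.10 per GB per month ≈ \$8 per month while the Pod is running, and about 80 GB × \$0.20 per GB per month ≈ \$16 per month while it is stopped, prorated per minute.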
## Network volume billing diff --git a/references/troubleshooting/pod-migration.mdx b/references/troubleshooting/pod-migration.mdx index 4a5d4953..499a86f6 100644 --- a/references/troubleshooting/pod-migration.mdx +++ b/references/troubleshooting/pod-migration.mdx @@ -4,11 +4,13 @@ description: "Automatically migrate your Pod to a new machine when your GPU is u tag: "BETA" --- +import { MachineTooltip } from "/snippets/tooltips.jsx"; + Pod migration is currently in beta. [Join our Discord](https://discord.gg/runpod) if you'd like to provide feedback. -When you start a Pod, it's assigned to a specific physical machine with 4-8 GPUs. This creates a link between your Pod and that particular machine. As long as your Pod is running, that GPU is exclusively reserved for you, which ensures stable pricing and prevents your work from being interrupted. +When you start a Pod, it's assigned to a specific physical with 4-8 GPUs. This creates a link between your Pod and that particular machine. As long as your Pod is running, that GPU is exclusively reserved for you, which ensures stable pricing and prevents your work from being interrupted. When you stop a Pod, you release that specific GPU, allowing other users to rent it. If another user rents the GPU while your Pod is stopped, the GPU will be occupied when you try to restart. Because your Pod is still tied to that original machine, you'll see message asking you to migrate your Pod. This doesn't mean there are no GPUs of that type available on Runpod, just that none are available on the specific physical machine where your Pod's data is stored. diff --git a/references/troubleshooting/zero-gpus.mdx b/references/troubleshooting/zero-gpus.mdx index 2b3e2c3a..99a4e221 100644 --- a/references/troubleshooting/zero-gpus.mdx +++ b/references/troubleshooting/zero-gpus.mdx @@ -4,7 +4,9 @@ sidebarTitle: "Zero GPU Pods" description: "What to do when your Pod machine has zero GPUs." --- -When you restart a stopped Pod, you might see a message telling you that there are "Zero GPU Pods." This is because there are no GPUs available on the machine where your Pod was running. +import { MachineTooltip } from "/snippets/tooltips.jsx"; + +When you restart a stopped Pod, you might see a message telling you that there are "Zero GPU Pods." This is because there are no GPUs available on the where your Pod was running. ## Why does this happen? diff --git a/runpodctl/overview.mdx b/runpodctl/overview.mdx index 1c7e107b..0b64eb41 100644 --- a/runpodctl/overview.mdx +++ b/runpodctl/overview.mdx @@ -4,11 +4,13 @@ sidebarTitle: "Overview" description: "Use Runpod CLI to manage Pods from your local machine." --- -Runpod CLI is an [open source](https://github.com/runpod/runpodctl) command-line interface tool for managing your Runpod resources remotely from your local machine. You can transfer files and data between your local system and Runpod, execute code on remote Pods, and automate Pod deployment workflows. +import { PodsTooltip, PodTooltip } from "/snippets/tooltips.jsx"; + +Runpod CLI is an [open source](https://github.com/runpod/runpodctl) command-line interface tool for managing your Runpod resources remotely from your local machine. You can transfer files and data between your local system and Runpod, execute code on remote , and automate Pod deployment workflows. ## Install Runpod CLI locally -Every Pod you deploy comes preinstalled with the `runpodctl` command and a Pod-scoped API key. You can also install it on your local machine to manage your Pods remotely. 
+Every you deploy comes preinstalled with the `runpodctl` command and a Pod-scoped API key. You can also install it on your local machine to manage your Pods remotely from your own system. To install Runpod CLI locally, follow these steps: diff --git a/serverless/development/dual-mode-worker.mdx b/serverless/development/dual-mode-worker.mdx index a2a3718f..a22760d5 100644 --- a/serverless/development/dual-mode-worker.mdx +++ b/serverless/development/dual-mode-worker.mdx @@ -37,12 +37,16 @@ cd dual-mode-worker touch handler.py start.sh Dockerfile requirements.txt ``` -This creates: +This creates the following project structure: -- `handler.py`: Your Python script with the Runpod handler logic. -- `start.sh`: A shell script that will be the entrypoint for your Docker container. -- `Dockerfile`: Instructions to build your Docker image. -- `requirements.txt`: A file to list Python dependencies. + + + + + + + + ## Step 2: Create the handler @@ -376,7 +380,11 @@ After a few moments for initialization and processing, you should see output sim ## Explore the Pod-first development workflow -Congratulations! You've successfully built, deployed, and tested a dual-mode Serverless worker. Now, let's explore the recommended iteration process for a Pod-first development workflow: + +Congratulations! You've successfully built, deployed, and tested a dual-mode Serverless worker. + + +Now, let's explore the recommended iteration process for a Pod-first development workflow: diff --git a/serverless/development/optimization.mdx b/serverless/development/optimization.mdx index ac69817f..a2b8512b 100644 --- a/serverless/development/optimization.mdx +++ b/serverless/development/optimization.mdx @@ -4,6 +4,8 @@ sidebarTitle: "Optimization guide" description: "Implement strategies to reduce latency and cost for your Serverless endpoints." --- +import { MachineTooltip } from "/snippets/tooltips.jsx"; + Optimizing your Serverless endpoints involves a cycle of measuring performance with [benchmarking](/serverless/development/benchmarking), identifying bottlenecks, and tuning your [endpoint configurations](/serverless/endpoints/endpoint-configurations). This guide covers specific strategies to reduce startup times and improve throughput. ## Optimization overview @@ -14,7 +16,7 @@ To ensure high availability during peak traffic, you should select multiple GPU For latency-sensitive applications, utilizing active workers is the most effective way to eliminate cold starts. You should also configure your [max workers](/serverless/endpoints/endpoint-configurations#max-workers) setting with approximately 20% headroom above your expected concurrency. This buffer ensures that your endpoint can handle sudden load spikes without throttling requests or hitting capacity limits. -Your architectural choices also significantly impact performance. Whenever possible, bake your models directly into the Docker image to leverage the high-speed local NVMe storage of the host machine. If you utilize [network volumes](/storage/network-volumes) for larger datasets, remember that this restricts your endpoint to specific data centers, which effectively shrinks your pool of available compute resources. +Your architectural choices also significantly impact performance. Whenever possible, bake your models directly into the Docker image to leverage the high-speed local NVMe storage of the host . 
If you utilize [network volumes](/storage/network-volumes) for larger datasets, remember that this restricts your endpoint to specific data centers, which effectively shrinks your pool of available compute resources. ## Reducing worker startup times diff --git a/serverless/development/overview.mdx b/serverless/development/overview.mdx index 7dee93c5..7dcc8d18 100644 --- a/serverless/development/overview.mdx +++ b/serverless/development/overview.mdx @@ -4,6 +4,8 @@ sidebarTitle: "Overview" description: "Test, debug, and optimize your Serverless applications." --- +import { ServerlessEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx"; + When developing for Runpod Serverless, you'll typically start by writing handler functions, test them locally, and then deploy to production. This guide introduces the development workflow and tools that help you test, debug, and optimize your Serverless applications effectively. ## Development lifecycle @@ -116,6 +118,6 @@ Learn more in [Logs and monitoring](/serverless/development/logs) and [Connect t ## Environment variables -Use environment variables to configure your workers without hardcoding credentials or settings in your code. Environment variables are set in the Runpod console and are available to your handler at runtime. +Use to configure your workers without hardcoding credentials or settings in your code. Environment variables are set in the Runpod console and are available to your handler at runtime. Learn more in [Environment variables](/serverless/development/environment-variables). diff --git a/serverless/endpoints/endpoint-configurations.mdx b/serverless/endpoints/endpoint-configurations.mdx index 96b3b38e..c820b777 100644 --- a/serverless/endpoints/endpoint-configurations.mdx +++ b/serverless/endpoints/endpoint-configurations.mdx @@ -5,6 +5,7 @@ description: "Reference guide for all Serverless endpoint settings and parameter --- import GPUTable from '/snippets/serverless-gpu-pricing-table.mdx'; +import { MachinesTooltip } from "/snippets/tooltips.jsx"; This guide details the configuration options available for Runpod Serverless endpoints. These settings control how your endpoint scales, how it utilizes hardware, and how it manages request lifecycles. @@ -97,7 +98,7 @@ FlashBoot reduces cold start times by retaining the state of worker resources sh ### Model -The Model field allows you to select from a list of [cached models](/serverless/endpoints/model-caching). When selected, Runpod schedules your workers on host machines that already have these large model files pre-loaded. This significantly reduces the time required to load models during worker initialization. +The Model field allows you to select from a list of [cached models](/serverless/endpoints/model-caching). When selected, Runpod schedules your workers on host that already have these large model files pre-loaded. This significantly reduces the time required to load models during worker initialization. ## Advanced settings @@ -111,7 +112,7 @@ You can restrict your endpoint to specific geographical regions. For maximum rel ### CUDA version selection -This filter ensures your workers are scheduled on host machines with compatible drivers. While you should select the version your code requires, we recommend also selecting all newer versions. CUDA is generally backward compatible, and selecting a wider range of versions increases the pool of available hardware. +This filter ensures your workers are scheduled on host with compatible drivers. 
While you should select the version your code requires, we recommend also selecting all newer versions. CUDA is generally backward compatible, and selecting a wider range of versions increases the pool of available hardware. ### Expose HTTP/TCP ports diff --git a/serverless/endpoints/job-states.mdx b/serverless/endpoints/job-states.mdx index c0c0e499..4e607fa3 100644 --- a/serverless/endpoints/job-states.mdx +++ b/serverless/endpoints/job-states.mdx @@ -3,11 +3,13 @@ title: "Job states and metrics" description: "Monitor your endpoints effectively by understanding job states and key metrics." --- -Understanding job states and metrics is essential for effectively managing your Serverless endpoints. This documentation covers the different states your jobs can be in and the key metrics available to monitor endpoint performance and health. +import { JobTooltip, RequestsTooltip, WorkerTooltip } from "/snippets/tooltips.jsx"; + +Understanding states and metrics is essential for effectively managing your Serverless endpoints. This documentation covers the different states your jobs can be in and the key metrics available to monitor endpoint performance and health. ## Request job states -Understanding job states helps you track the progress of individual requests and identify where potential issues might occur in your workflow. +Understanding job states helps you track the progress of individual and identify where potential issues might occur in your workflow. * `IN_QUEUE`: The job is waiting in the endpoint queue for an available worker to process it. * `RUNNING`: A worker has picked up the job and is actively processing it. diff --git a/serverless/endpoints/model-caching.mdx b/serverless/endpoints/model-caching.mdx index 93eb94d7..1eb2b553 100644 --- a/serverless/endpoints/model-caching.mdx +++ b/serverless/endpoints/model-caching.mdx @@ -5,11 +5,13 @@ description: "Accelerate worker cold starts and reduce costs by using cached mod tag: "NEW" --- +import { MachineTooltip, MachinesTooltip, ColdStartTooltip, WorkersTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx"; + For a step-by-step example showing how to integrate cached models with custom workers, see [Deploy a cached model](/tutorials/serverless/model-caching-text). -Enabling cached models for your workers can reduce [cold start times](/serverless/overview#cold-starts) to just a few seconds and dramatically reduce the cost for loading large models. +Enabling cached models on your endpoints can reduce times and dramatically reduce the cost for loading large models. ## Why use cached models? @@ -17,7 +19,7 @@ Enabling cached models for your workers can reduce [cold start times](/serverles - **Reduced costs:** You aren't billed for worker time while your model is being downloaded. This is especially impactful for large models that can take several minutes to load. - **Accelerated deployment:** You can deploy cached models instantly without waiting for external downloads or transfers. - **Smaller container images:** By decoupling models from your container image, you can create smaller, more focused images that contain only your application logic. -- **Shared across workers:** Multiple workers running on the same host machine can reference the same cached model, eliminating redundant downloads and saving disk space. +- **Shared across workers:** Multiple running on the same host can reference the same cached model, eliminating redundant downloads and saving disk space. 
## Cached model compatibility @@ -37,7 +39,7 @@ Cached models aren't suitable if your model is private and not hosted on Hugging When you select a cached model for your endpoint, Runpod automatically tries to start your workers on hosts that already contain the selected model. -If no cached host machines are available, the system delays starting your workers until the model is downloaded onto the machine where your workers will run, ensuring you still won't be charged for the download time. +If no cached host are available, the system delays starting your workers until the model is downloaded onto the machine where your workers will run, ensuring you still won't be charged for the download time.
```mermaid @@ -122,21 +124,28 @@ Cached models are available to your workers at `/runpod-volume/huggingface-cache While cached models use the same mount path as network volumes (`/runpod-volume/`), the model loaded from the cache will load significantly faster than the same model loaded from a network volume. -The path structure follows this pattern: - -``` -/runpod-volume/huggingface-cache/hub/models--HF_ORGANIZATION--MODEL_NAME/snapshots/VERSION_HASH/ -``` - -For example, the model `gensyn/qwen2.5-0.5b-instruct` would be stored at: - -``` -/runpod-volume/huggingface-cache/hub/models--gensyn--qwen2.5-0.5b-instruct/snapshots/317b7eb96312eda0c431d1dab1af958a308cb35e/ -``` +For example, here is how the model `gensyn/qwen2.5-0.5b-instruct` would be stored: + + + + + + + + + + + + + + + + + ### Programmatically locate cached models -To dynamically locate cached models without hardcoding paths, you can add this helper function to your [handler file](/serverless/workers/handler-functions) to scan the cache directory for the model you want to use: +To dynamically locate cached models without hardcoding paths, you can add this helper function to your to scan the cache directory for the model you want to use: ```python handler.py import os diff --git a/serverless/endpoints/overview.mdx b/serverless/endpoints/overview.mdx index 7e624268..72805623 100644 --- a/serverless/endpoints/overview.mdx +++ b/serverless/endpoints/overview.mdx @@ -4,6 +4,8 @@ sidebarTitle: "Overview" description: "Deploy and manage Serverless endpoints using the Runpod console or REST API." --- +import { QueueBasedEndpointsTooltip, LoadBalancingEndpointsTooltip, ServerlessEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx"; + Endpoints are the foundation of Runpod Serverless, serving as the gateway for deploying and managing your [Serverless workers](/serverless/workers/overview). They provide a consistent API interface that allows your applications to interact with powerful compute resources on demand. Endpoints are RESTful APIs that accept [HTTP requests](/serverless/endpoints/send-requests), processing the input using your [handler function](/serverless/workers/handler-functions), and returning the result via HTTP response. Each endpoint provides a unique URL and abstracts away the complexity of managing individual GPUs/CPUs. diff --git a/serverless/endpoints/send-requests.mdx b/serverless/endpoints/send-requests.mdx index 8c5f3a3e..f9ee9e33 100644 --- a/serverless/endpoints/send-requests.mdx +++ b/serverless/endpoints/send-requests.mdx @@ -4,7 +4,7 @@ sidebarTitle: "Send API requests" description: "Submit and manage jobs for your queue-based endpoints by sending HTTP requests." --- - +import { JobTooltip, JobsTooltip, RequestsTooltip, WorkersTooltip, HandlerFunctionTooltip, QueueBasedEndpointsTooltip, LoadBalancingEndpointTooltip } from "/snippets/tooltips.jsx"; After creating a [Severless endpoint](/serverless/endpoints/overview), you can start sending it HTTP requests (using `cURL` or the Runpod SDK) to submit jobs and retrieve results: @@ -17,12 +17,10 @@ curl -x POST https://api.runpod.ai/v2/ENDPOINT_ID/run \ This page covers everything from basic input structure and job submission, to advanced options, rate limits, and best practices for queue-based endpoints. - -This guide is for **queue-based endpoints**. If you're building a [load balancing endpoint](/serverless/load-balancing/overview), the request structure and endpoints will depend on how you define your HTTP servers. +This guide is for . 
If you're building a , the request structure and endpoints will depend on how you define your HTTP servers. -
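To make the queue-based flow concrete, here is a minimal sketch of submitting a job and polling its status. The endpoint ID, API key variable, and input payload are placeholders.

```bash
# Submit a job to a queue-based endpoint.
curl -X POST "https://api.runpod.ai/v2/ENDPOINT_ID/run" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "Hello, world!"}}'

# The response contains a job ID; poll its state (IN_QUEUE, RUNNING, COMPLETED, ...) with:
curl "https://api.runpod.ai/v2/ENDPOINT_ID/status/JOB_ID" \
  -H "Authorization: Bearer $RUNPOD_API_KEY"
```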