diff --git a/fine-tune.mdx b/fine-tune.mdx
index 50045335..4bff8bb0 100644
--- a/fine-tune.mdx
+++ b/fine-tune.mdx
@@ -88,9 +88,17 @@ For a list of working configuration examples, check out the [Axolotl examples re
Your training environment is located in the `/workspace/fine-tuning/` directory and has the following structure:
-* `examples/`: Sample configurations and scripts.
-* `outputs/`: Where your training results and model outputs will be saved.
-* `config.yaml`: The main configuration file for your training parameters.
+
+`examples/` contains sample configurations and scripts, `outputs/` contains your training results and model outputs, and `config.yaml` is the main configuration file for your training parameters.
The system generates an initial `config.yaml` based on your selected base model and dataset. This is where you define all the hyperparameters for your fine-tuning job. You may need to experiment with these settings to achieve the best results.
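+
+As a rough sketch, here are the kinds of hyperparameters you might set in `config.yaml` (the key names follow Axolotl's configuration format; the model, dataset, and values shown are placeholders to replace with your own):
+
+```yaml
+base_model: NousResearch/Llama-2-7b-hf
+datasets:
+  - path: mhenrichsen/alpaca_2k_test
+    type: alpaca
+adapter: lora
+lora_r: 16
+lora_alpha: 32
+sequence_len: 2048
+micro_batch_size: 2
+gradient_accumulation_steps: 4
+num_epochs: 3
+learning_rate: 0.0002
+output_dir: ./outputs/lora-out
+```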
diff --git a/get-started.mdx b/get-started.mdx
index 7fa03c03..0c2f6054 100644
--- a/get-started.mdx
+++ b/get-started.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Quickstart"
description: "Run code on a remote GPU in minutes."
---
-Follow this guide to learn how to create an account, deploy your first GPU Pod, and use it to execute code remotely.
+import { PodTooltip, NetworkVolumeTooltip, TemplateTooltip } from "/snippets/tooltips.jsx";
+
+Follow this guide to learn how to create an account, deploy your first GPU <PodTooltip />, and use it to execute code remotely.
## Step 1: Create an account
@@ -46,7 +48,7 @@ Take a minute to explore the other tabs:
- **Details**: Information about your Pod, such as hardware specs, pricing, and storage.
- **Telemetry**: Realtime utilization metrics for your Pod's CPU, memory, and storage.
- **Logs**: Logs streamed from your container (including stdout from any applications inside) and the Pod management system.
-- **Template Readme**: Details about the template your Pod is running. Your Pod is configured with the latest official Runpod PyTorch template.
+- **Template Readme**: Details about the <TemplateTooltip /> your Pod is running. Your Pod is configured with the latest official Runpod PyTorch template.
## Step 4: Execute code on your Pod with JupyterLab
@@ -55,7 +57,9 @@ Take a minute to explore the other tabs:
3. Type `print("Hello, world!")` in the first line of the notebook.
4. Click the play button to run your code.
-And that's it—congrats! You just ran your first line of code on Runpod.
+
+Congratulations! You just ran your first line of code on Runpod.
+
## Step 5: Clean up
@@ -74,7 +78,7 @@ To terminate your Pod:
-Terminating a Pod permanently deletes all data that isn't stored in a [network volume](/storage/network-volumes). Be sure that you've saved any data you might need to access again.
+Terminating a Pod permanently deletes all data that isn't stored in a <NetworkVolumeTooltip />. Be sure that you've saved any data you might need to access again.
To learn more about how storage works, see the [Pod storage overview](/pods/storage/types).
diff --git a/get-started/api-keys.mdx b/get-started/api-keys.mdx
index 4d2c39ee..8cf82ae5 100644
--- a/get-started/api-keys.mdx
+++ b/get-started/api-keys.mdx
@@ -3,6 +3,8 @@ title: "Manage API keys"
description: "Learn how to create, edit, and disable Runpod API keys."
---
+import { ServerlessTooltip } from "/snippets/tooltips.jsx";
+
Legacy API keys generated before November 11, 2024 have either Read/Write or Read Only access to GraphQL based on what was set for that key. All legacy keys have full access to AI API. To improve security, generate a new key with **Restricted** permission and select the minimum permission needed for your use case.
@@ -20,7 +22,7 @@ Follow these steps to create a new Runpod API key:
3. Give your key a name and set its permissions (**All**, **Restricted**, or **Read Only**). If you choose **Restricted**, you can customize access for each Runpod API:
* **None**: No access
- * **Restricted**: Customize access for each of your [Serverless endpoints](/serverless/overview). (Default: None.)
+   * **Restricted**: Customize access for each of your <ServerlessTooltip /> endpoints. (Default: None.)
* **Read/Write**: Full access to your endpoints.
* **Read Only**: Read access without write access.
diff --git a/get-started/concepts.mdx b/get-started/concepts.mdx
index a498d051..0e81da43 100644
--- a/get-started/concepts.mdx
+++ b/get-started/concepts.mdx
@@ -3,6 +3,8 @@ title: "Concepts"
description: "Key concepts and terminology for understanding Runpod's platform and products."
---
+import { PodsTooltip, ServerlessTooltip } from "/snippets/tooltips.jsx";
+
## [Runpod console](https://console.runpod.io)
The web interface for managing your compute resources, account, teams, and billing.
@@ -25,7 +27,7 @@ A managed compute cluster with high-speed networking for multi-node distributed
## [Network volume](/storage/network-volumes)
-Persistent storage that exists independently of your other compute resources and can be attached to multiple Pods or Serverless endpoints to share data between machines.
+Persistent storage that exists independently of your other compute resources and can be attached to multiple <PodsTooltip /> or <ServerlessTooltip /> endpoints to share data between machines.
## [S3-compatible API](/storage/s3-api)
diff --git a/get-started/connect-to-runpod.mdx b/get-started/connect-to-runpod.mdx
index a188cdc0..aa292281 100644
--- a/get-started/connect-to-runpod.mdx
+++ b/get-started/connect-to-runpod.mdx
@@ -3,11 +3,13 @@ title: "Choose a workflow"
description: "Review the available methods for accessing and managing Runpod resources."
---
+import { PodsTooltip, EndpointTooltip, ServerlessTooltip } from "/snippets/tooltips.jsx";
+
Runpod offers multiple ways to access and manage your compute resources. Choose the method that best fits your workflow:
## Runpod console
-The Runpod console provides an intuitive web interface to manage Pods and endpoints, access Pod terminals, send endpoint requests, monitor resource usage, and view billing and usage history.
+The Runpod console provides an intuitive web interface to manage <PodsTooltip /> and <EndpointTooltip />s, access Pod terminals, send endpoint requests, monitor resource usage, and view billing and usage history.
[Launch the Runpod console →](https://www.console.runpod.io)
@@ -19,7 +21,7 @@ You can connect directly to your running Pods and execute code on them using a v
## REST API
-The Runpod REST API allows you to programmatically manage and control compute resources. Use the API to manage Pod lifecycles and Serverless endpoints, monitor resource utilization, and integrate Runpod into your applications.
+The Runpod REST API allows you to programmatically manage and control compute resources. Use the API to manage Pod lifecycles and <ServerlessTooltip /> endpoints, monitor resource utilization, and integrate Runpod into your applications.
[Explore the API reference →](/api-reference/docs/GET/openapi-json)
diff --git a/get-started/manage-accounts.mdx b/get-started/manage-accounts.mdx
index 6e89f0b0..cf876ffa 100644
--- a/get-started/manage-accounts.mdx
+++ b/get-started/manage-accounts.mdx
@@ -3,13 +3,15 @@ title: "Manage accounts"
description: "Create accounts, manage teams, and configure user permissions in Runpod."
---
+import { PodsTooltip, ServerlessTooltip, InstantClusterTooltip, NetworkVolumeTooltip } from "/snippets/tooltips.jsx";
+
To access Runpod resources, you need to either create your own account or join an existing team through an invitation. This guide explains how to set up and manage accounts, teams, and user roles.
## Create an account
Sign up for a Runpod account at [console.runpod.io/signup](https://www.console.runpod.io/signup).
-Once created, you can use your account to deploy Pods, create Serverless endpoints, and access other Runpod services. Personal accounts can be converted to team accounts at any time to enable collaboration features.
+Once created, you can use your account to deploy <PodsTooltip />, create <ServerlessTooltip /> endpoints, and access other Runpod services. Personal accounts can be converted to team accounts at any time to enable collaboration features.
## Convert to a team account
diff --git a/get-started/products.mdx b/get-started/products.mdx
index f2f72d8c..29b78f0c 100644
--- a/get-started/products.mdx
+++ b/get-started/products.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Product overview"
description: "Explore Runpod's major offerings and find the right solution for your workload."
---
-Runpod offers cloud computing resources for AI and machine learning workloads. You can choose from instant GPUs for development, auto-scaling Serverless computing, pre-deployed AI models, or multi-node clusters for distributed training.
+import { ServerlessTooltip, PodsTooltip, PublicEndpointTooltip, InstantClusterTooltip, WorkerTooltip } from "/snippets/tooltips.jsx";
+
+Runpod offers cloud computing resources for AI and machine learning workloads. You can choose from instant GPUs for development, auto-scaling <ServerlessTooltip /> computing, pre-deployed AI models, or multi-node clusters for distributed training.
## [Serverless](/serverless/overview)
@@ -12,15 +14,15 @@ Serverless provides pay-per-second computing with automatic scaling for producti
## [Pods](/pods/overview)
-Pods give you dedicated GPU or CPU instances for containerized workloads. Pods are billed by the minute and stay available as long as you keep them running, making them perfect for development, training, and workloads that need continuous access.
+<PodsTooltip /> give you dedicated GPU or CPU instances for containerized workloads. Pods are billed by the minute and stay available as long as you keep them running, making them perfect for development, training, and workloads that need continuous access.
## [Public Endpoints](/hub/public-endpoints)
-Public Endpoints provide instant API access to pre-deployed AI models for image, video, and text generation without any setup. You only pay for what you generate, making it easy to integrate AI into your applications without managing infrastructure.
+<PublicEndpointTooltip />s provide instant API access to pre-deployed AI models for image, video, and text generation without any setup. You only pay for what you generate, making it easy to integrate AI into your applications without managing infrastructure.
## [Instant Clusters](/instant-clusters)
-Instant Clusters deliver fully managed multi-node compute clusters for large-scale distributed workloads. With high-speed networking between nodes, you can run multi-node training, fine-tune large language models, and handle other tasks that require multiple GPUs working in parallel.
+<InstantClusterTooltip />s deliver fully managed multi-node compute clusters for large-scale distributed workloads. With high-speed networking between nodes, you can run multi-node training, fine-tune large language models, and handle other tasks that require multiple GPUs working in parallel.
## Choosing the right option
diff --git a/hub/overview.mdx b/hub/overview.mdx
index 0cfb08bc..1b439e1b 100644
--- a/hub/overview.mdx
+++ b/hub/overview.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Overview"
description: "Discover, deploy, and share preconfigured AI repos using the Runpod Hub."
---
-The [Runpod Hub](https://console.runpod.io/hub) is a centralized repository that enables users to discover, share, and deploy preconfigured AI repos optimized for Runpod's [Serverless](/serverless/overview/) and [Pod](/pods/overview) infrastructure. It offers a catalog of vetted, open-source repositories that can be deployed with minimal setup, creating a collaborative ecosystem for AI developers and users.
+import { ServerlessTooltip, PodTooltip, EndpointTooltip, PublicEndpointTooltip, HandlerFunctionTooltip, WorkerTooltip } from "/snippets/tooltips.jsx";
+
+The [Runpod Hub](https://console.runpod.io/hub) is a centralized repository that enables users to discover, share, and deploy preconfigured AI repos optimized for Runpod's <ServerlessTooltip /> and <PodTooltip /> infrastructure. It offers a catalog of vetted, open-source repositories that can be deployed with minimal setup, creating a collaborative ecosystem for AI developers and users.
Whether you're a developer looking to share your work or a user seeking preconfigured solutions, the Hub makes discovering and deploying AI projects seamless and efficient.
@@ -32,7 +34,7 @@ The Hub simplifies the entire lifecycle of repo sharing and deployment, from ini
## Public Endpoints
-In addition to official and community-submitted repos, the Hub also offers [Public Endpoints](/hub/public-endpoints) for popular AI models. These are ready-to-use APIs that you can integrate directly into your applications without needing to manage any of the underlying infrastructure.
+In addition to official and community-submitted repos, the Hub also offers <PublicEndpointTooltip />s for popular AI models. These are ready-to-use APIs that you can integrate directly into your applications without needing to manage any of the underlying infrastructure.
Public Endpoints provide:
@@ -63,7 +65,7 @@ You can deploy a repo from the Hub in seconds, choosing between Serverless endpo
4. Click the **Deploy** button in the top-right of the repo page. You can also use the dropdown menu to deploy an older version.
5. Click **Create Endpoint**
-Within minutes you'll have access to a new Serverless endpoint, ready for integration with your applications or experimentation.
+Within minutes you'll have access to a new Serverless <EndpointTooltip />, ready for integration with your applications or experimentation.
### Deploy as a Pod
@@ -96,7 +98,7 @@ Where `POD_ID` is your Pod's actual ID.
## Publish your own repo
-You can [publish your own repo](/hub/publishing-guide) on the Hub by preparing your GitHub repository with a working [Serverless endpoint](/serverless/overview) implementation, comprised of a [worker handler function](/serverless/workers/handler-functions) and `Dockerfile`.
+You can [publish your own repo](/hub/publishing-guide) on the Hub by preparing your GitHub repository with a working Serverless endpoint implementation, composed of a <HandlerFunctionTooltip /> and a `Dockerfile`.
To learn how to build your first worker, [follow this guide](/serverless/workers/custom-worker).
diff --git a/instant-clusters.mdx b/instant-clusters.mdx
index b3a7c9f2..8411979c 100644
--- a/instant-clusters.mdx
+++ b/instant-clusters.mdx
@@ -4,6 +4,8 @@ sidebarTitle: "Overview"
description: "Fully managed compute clusters for multi-node training and AI inference."
---
+import { DataCenterTooltip, PyTorchTooltip } from "/snippets/tooltips.jsx";
+
Runpod offers custom Instant Cluster pricing plans for large scale and enterprise workloads. If you're interested in learning more, [contact our sales team](https://ecykq.share.hsforms.com/2MZdZATC3Rb62Dgci7knjbA).
@@ -37,7 +39,7 @@ Instant Clusters feature high-speed local networking for efficient data movement
* Most clusters include 3200 Gbps networking.
* A100 clusters offer up to 1600 Gbps networking.
-This fast networking enables efficient scaling of distributed training and inference workloads. Runpod ensures nodes selected for clusters are within the same data center for optimal performance.
+This fast networking enables efficient scaling of distributed training and inference workloads. Runpod ensures nodes selected for clusters are within the same <DataCenterTooltip /> for optimal performance.
## Zero configuration
@@ -45,7 +47,7 @@ Runpod automates cluster setup so you can focus on your workloads:
* Clusters are pre-configured with static IP address management.
* All necessary [environment variables](#environment-variables) for distributed training are pre-configured.
-* Supports popular frameworks like PyTorch, TensorFlow, and Slurm.
+* Supports popular frameworks like <PyTorchTooltip />, TensorFlow, and Slurm.
## Get started
diff --git a/instant-clusters/axolotl.mdx b/instant-clusters/axolotl.mdx
index a4075e24..89f89ea5 100644
--- a/instant-clusters/axolotl.mdx
+++ b/instant-clusters/axolotl.mdx
@@ -86,7 +86,9 @@ After running the command on the last Pod, you should see output similar to this
[2025-04-01 19:24:22,603] [INFO] [axolotl.train.save_trained_model:211] [PID:1009] [RANK:0] Training completed! Saving pre-trained model to ./outputs/lora-out.
```
-Congrats! You've successfully trained a model using Axolotl on an Instant Cluster. Your fine-tuned model has been saved to the `./outputs/lora-out` directory. You can now use this model for inference or continue training with different parameters.
+
+Congratulations! You've successfully trained a model using Axolotl on an Instant Cluster. Your fine-tuned model has been saved to the `./outputs/lora-out` directory. You can now use this model for inference or continue training with different parameters.
+
## Step 4: Clean up
diff --git a/instant-clusters/pytorch.mdx b/instant-clusters/pytorch.mdx
index 3b460092..ad50b01d 100644
--- a/instant-clusters/pytorch.mdx
+++ b/instant-clusters/pytorch.mdx
@@ -3,7 +3,9 @@ title: "Deploy an Instant Cluster with PyTorch"
sidebarTitle: "PyTorch"
---
-This tutorial demonstrates how to use Instant Clusters with [PyTorch](http://pytorch.org) to run distributed workloads across multiple GPUs. By leveraging PyTorch's distributed processing capabilities and Runpod's high-speed networking infrastructure, you can significantly accelerate your training process compared to single-GPU setups.
+import { PyTorchTooltip } from "/snippets/tooltips.jsx";
+
+This tutorial demonstrates how to use Instant Clusters with <PyTorchTooltip /> to run distributed workloads across multiple GPUs. By leveraging PyTorch's distributed processing capabilities and Runpod's high-speed networking infrastructure, you can significantly accelerate your training process compared to single-GPU setups.
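+
+As a minimal sketch, a typical distributed PyTorch script initializes the process group and pins each process to a GPU. This assumes the script is launched with `torchrun`, which sets the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables for each process:
+
+```python
+import os
+import torch
+import torch.distributed as dist
+
+# Join the process group using the environment variables set by torchrun.
+dist.init_process_group(backend="nccl")
+
+# Bind this process to its assigned GPU on the node.
+local_rank = int(os.environ["LOCAL_RANK"])
+torch.cuda.set_device(local_rank)
+
+print(f"Rank {dist.get_rank()} of {dist.get_world_size()} is ready")
+```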
Follow the steps below to deploy a cluster and start running distributed PyTorch workloads efficiently.
diff --git a/pods/choose-a-pod.mdx b/pods/choose-a-pod.mdx
index 02b8c47f..66db930f 100644
--- a/pods/choose-a-pod.mdx
+++ b/pods/choose-a-pod.mdx
@@ -4,6 +4,8 @@ description: "Select the right Pod by evaluating your resource requirements."
sidebar_position: 3
---
+import { CUDATooltip } from "/snippets/tooltips.jsx";
+
Selecting the appropriate Pod configuration is a crucial step in maximizing performance and efficiency for your specific workloads. This guide will help you understand the key factors to consider when choosing a Pod that meets your requirements.
## Understanding your workload needs
@@ -28,7 +30,7 @@ There are several online tools that can help you estimate your resource requirem
### GPU selection
-The GPU is the cornerstone of computational performance for many workloads. When selecting your GPU, consider the architecture that best suits your software requirements. NVIDIA GPUs with CUDA support are essential for most machine learning frameworks, while some applications might perform better on specific GPU generations. Evaluate both the raw computing power (CUDA cores, tensor cores) and the memory bandwidth to ensure optimal performance for your specific tasks.
+The GPU is the cornerstone of computational performance for many workloads. When selecting your GPU, consider the architecture that best suits your software requirements. NVIDIA GPUs with <CUDATooltip /> support are essential for most machine learning frameworks, while some applications might perform better on specific GPU generations. Evaluate both the raw computing power (CUDA cores, tensor cores) and the memory bandwidth to ensure optimal performance for your specific tasks.
For machine learning inference, a mid-range GPU might be sufficient, while training large models requires more powerful options. Check framework-specific recommendations, as PyTorch, TensorFlow, and other frameworks may perform differently across GPU types.
diff --git a/pods/manage-pods.mdx b/pods/manage-pods.mdx
index cf349f88..048e5420 100644
--- a/pods/manage-pods.mdx
+++ b/pods/manage-pods.mdx
@@ -3,6 +3,8 @@ title: "Manage Pods"
description: "Create, start, stop, and terminate Pods using the Runpod console or CLI."
---
+import { MachineTooltip, TemplatesTooltip } from "/snippets/tooltips.jsx";
+
## Before you begin
If you want to manage Pods using the Runpod CLI, you'll need to [install Runpod CLI](/runpodctl/overview), and set your [API key](/get-started/api-keys) in the configuration.
@@ -39,7 +41,7 @@ GPU configuration:
**CUDA Version Compatibility**
-When using templates (especially community templates like `runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04`), ensure the host machine's CUDA driver version matches or exceeds the template's requirements.
+When using <TemplatesTooltip /> (especially community templates like `runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04`), ensure the CUDA version of the host <MachineTooltip /> matches or exceeds the template's requirements.
If you encounter errors like "OCI runtime create failed" or "unsatisfied condition: cuda>=X.X", you need to filter for compatible machines:
@@ -259,11 +261,11 @@ pod "wu5ekmn69oh1xr" started with $0.290 / hr
## Terminate a Pod
-
+
Terminating a Pod permanently deletes all associated data that isn't stored in a [network volume](/storage/network-volumes). Be sure to export or download any data that you'll need to access again.
-
+
diff --git a/pods/overview.mdx b/pods/overview.mdx
index c20b0e9f..74a85039 100644
--- a/pods/overview.mdx
+++ b/pods/overview.mdx
@@ -3,6 +3,8 @@ title: Overview
description: "Get on-demand access to powerful computing resources."
---
+import { NetworkVolumeTooltip, ContainerDiskTooltip, VolumeDiskTooltip, ServerlessTooltip, RunpodHubTooltip, GlobalNetworkingTooltip, RunpodCLITooltip, TemplatesTooltip } from "/snippets/tooltips.jsx";
+
@@ -28,7 +30,7 @@ Each Pod consists of these core components:
## Pod templates
-[Pod templates](/pods/templates/overview) are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together.
+Pod <TemplatesTooltip /> are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together.
Templates eliminate the need to manually set up environments, saving time and reducing configuration errors. For example, instead of installing PyTorch, configuring JupyterLab, and setting up all dependencies yourself, you can select an official Runpod PyTorch template and have everything ready to go instantly.
@@ -38,11 +40,11 @@ To learn how to create your own custom templates, see [Build a custom Pod templa
Pods offer three types of storage to match different use cases:
-Every Pod comes with a resizable **container disk** that houses the operating system and stores temporary files, which are cleared after the Pod stops.
+Every Pod comes with a resizable <ContainerDiskTooltip /> that houses the operating system and stores temporary files, which are cleared after the Pod stops.
-**Volume disks** provide persistent storage that is preserved throughout the Pod's lease, functioning like a dedicated hard drive. Data stored in the volume disk directory (`/workspace` by default) persists when you stop the Pod, but is erased when the Pod is deleted.
+By contrast, <VolumeDiskTooltip />s provide persistent storage that is preserved throughout the Pod's lease, functioning like a dedicated hard drive. Data stored in the volume disk directory (`/workspace` by default) persists when you stop the Pod, but is erased when the Pod is deleted.
-Optional [network volumes](/storage/network-volumes) provide more flexible permanent storage that can be transferred between Pods, replacing the volume disk when attached. When using a Pod with network volume attached, you can safely delete your Pod without losing the data stored in your network volume directory (`/workspace` by default).
+Optional <NetworkVolumeTooltip />s provide more flexible permanent storage that can be transferred between Pods, replacing the volume disk when attached. When using a Pod with a network volume attached, you can safely delete your Pod without losing the data stored in your network volume directory (`/workspace` by default).
To learn more, see [Storage options](/pods/storage/types).
@@ -53,7 +55,13 @@ You can deploy Pods in several ways:
- [From a template](/pods/templates/overview): Pre-configured environments for quick setup of common workflows.
- **Custom containers**: Pull from any compatible container registry such as Docker Hub, GitHub Container Registry, or Amazon ECR.
- **Custom images**: Build and deploy your own container images.
-- [From Serverless repos](/hub/overview#deploy-as-a-pod): Deploy any Serverless-compatible repository from the [Runpod Hub](/hub/overview) directly as a Pod, providing a cost-effective option for consistent workloads.
+- [From Serverless repos](/hub/overview#deploy-as-a-pod): Deploy any <ServerlessTooltip />-compatible repository from the <RunpodHubTooltip /> directly as a Pod, providing a cost-effective option for consistent workloads.
+
+
+
+When building a container image for Runpod on a Mac (Apple Silicon), use the flag `--platform linux/amd64` to ensure your image is compatible with Runpod's x86 host machines.
+
+
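+For example, a hypothetical image could be built like this: `docker build --platform linux/amd64 -t yourusername/my-pod-image:latest .` (the image name is just a placeholder).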
## Connecting to your Pod
@@ -119,3 +127,4 @@ Ready to get started? Explore these pages to learn more:
* Configure [global networking](/pods/networking) for your applications.
* [Set up Ollama on a Pod](/tutorials/pods/run-ollama) to run LLM inference with HTTP API access.
* [Build Docker images with Bazel](/tutorials/pods/build-docker-images) to emulate a Docker-in-Docker workflow.
+
diff --git a/pods/pricing.mdx b/pods/pricing.mdx
index 77bfcd78..bbfc9b65 100644
--- a/pods/pricing.mdx
+++ b/pods/pricing.mdx
@@ -4,6 +4,8 @@ sidebarTitle: "Pricing"
description: "Explore pricing options for Pods, including on-demand, savings plans, and spot instances."
---
+import { MachineTooltip } from "/snippets/tooltips.jsx";
+
Runpod offers custom pricing plans for large scale and enterprise workloads. If you're interested in learning more, [contact our sales team](https://ecykq.share.hsforms.com/2MZdZATC3Rb62Dgci7knjbA).
@@ -148,7 +150,7 @@ Runpod offers [three types of storage](/pods/storage/types) for Pods::
- **Disk volumes:** Persistent storage that is billed at \$0.10 per GB per month on running Pods and \$0.20 per GB per month for volume storage on stopped Pods. Billed per-second.
- **Network volumes:** External storage that is billed at \$0.07 per GB per month for storage requirements below 1TB. For requirements exceeding 1TB, the rate is \$0.05 per GB per month. Billed hourly.
-You are not charged for storage if the host machine is down or unavailable from the public internet.
+You are not charged for storage if the host <MachineTooltip /> is down or unavailable from the public internet.
Container and volume disk storage will be included in your Pod's displayed hourly cost during deployment.
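+
+As a simple worked example: a running Pod with a 50 GB volume disk accrues roughly 50 × \$0.10 = \$5.00 per month in storage charges, while the same volume on a stopped Pod accrues roughly 50 × \$0.20 = \$10.00 per month.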
diff --git a/pods/storage/types.mdx b/pods/storage/types.mdx
index ce5f97f3..218fcdcb 100644
--- a/pods/storage/types.mdx
+++ b/pods/storage/types.mdx
@@ -3,13 +3,15 @@ title: "Storage options"
description: "Choose the right type of storage for your Pods."
---
-Choosing the right type of storage is crucial for optimizing your workloads, whether you need temporary storage for active computations, persistent storage for long-term data retention, or permanent, shareable storage across multiple Pods.
+import { PodsTooltip, ContainerDiskTooltip, VolumeDiskTooltip, NetworkVolumeTooltip, PodTooltip } from "/snippets/tooltips.jsx";
+
+Choosing the right type of storage is crucial for optimizing your workloads, whether you need temporary storage for active computations, persistent storage for long-term data retention, or permanent, shareable storage across multiple <PodsTooltip />.
This page describes the different types of storage options available for your Pods, and when to use each in your workflow.
## Container disk
-A container disk houses the operating system and provides temporary storage for a Pod. It's created when a Pod is launched and is directly tied to the Pod's lifecycle.
+A container disk houses the operating system and provides temporary storage for a <PodTooltip />. It's created when a Pod is launched and is directly tied to the Pod's lifecycle.
## Volume disk
@@ -19,9 +21,9 @@ The volume disk is mounted at `/workspace` by default (this will be replaced by
## Network volume
-[Network volumes](/storage/network-volumes) offer persistent storage similar to the volume disk, but with the added benefit that they can be attached to multiple Pods, and that they persist independently from the Pod's lifecycle. This allows you to share and access data across multiple instances or transfer storage between machines, and retain data even after a Pod is deleted.
-
-When attached to a Pod, a network volume replaces the volume disk, and by default they are similarly mounted at `/workspace`.
+[Network volumes](/storage/network-volumes) offer persistent storage that can be attached to multiple Pods and persists independently from the Pod's lifecycle. This allows you to share and access data across multiple instances or transfer storage between machines, and retain data even after a Pod is deleted.
+
+When attached to a Pod, a network volume replaces the volume disk, and by default it is mounted at `/workspace`.
diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx
index 4276f31e..dfbf96e9 100644
--- a/pods/templates/create-custom-template.mdx
+++ b/pods/templates/create-custom-template.mdx
@@ -5,11 +5,13 @@ description: "A step-by-step guide to extending Runpod's official templates."
tag: "NEW"
---
+import { PodTooltip, PodsTooltip, PyTorchTooltip, CUDATooltip, TemplateTooltip } from "/snippets/tooltips.jsx";
+
You can find the complete code for this tutorial, including automated build options with GitHub Actions, in the [runpod-workers/pod-template](https://github.com/runpod-workers/pod-template) repository.
-This tutorial shows how to build a custom Pod template from the ground up. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments.
+This tutorial shows how to build a custom Pod <TemplateTooltip /> from the ground up. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments.
By creating custom templates, you can package everything your project needs into a reusable Docker image. Once built, you can deploy your workload in seconds instead of reinstalling dependencies every time you start a new Pod. You can also share your template with members of your team and the wider Runpod community.
@@ -57,18 +59,19 @@ touch Dockerfile requirements.txt main.py
Your project structure should now look like this:
-```
-my-custom-pod-template/
-├── Dockerfile
-├── requirements.txt
-└── main.py
-```
+
+```
+my-custom-pod-template/
+├── Dockerfile
+├── requirements.txt
+└── main.py
+```
## Step 2: Choose a base image and create your Dockerfile
-Runpod offers base images with PyTorch, CUDA, and common dependencies pre-installed. You'll extend one of these images to build your custom template.
+Runpod offers base images with <PyTorchTooltip />, <CUDATooltip />, and common dependencies pre-installed. You'll extend one of these images to build your custom template.
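+
+As an illustrative sketch, a Dockerfile extending a base image might look like the following. The tag matches the official PyTorch template mentioned earlier in these docs, and the copied files are the ones created in Step 1; the remaining steps of this tutorial build the Dockerfile out in full:
+
+```dockerfile
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+WORKDIR /app
+
+# Install your Python dependencies.
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Add your application code.
+COPY main.py .
+
+CMD ["python", "main.py"]
+```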
@@ -515,7 +518,9 @@ To avoid incurring unnecessary charges, make sure to stop and then terminate you
## Next steps
+
Congratulations! You've built a custom Pod template and deployed it to Runpod.
+
You can use this as a jumping off point to build your own custom templates with your own applications, dependencies, and models.
diff --git a/pods/templates/environment-variables.mdx b/pods/templates/environment-variables.mdx
index d5c3f191..ec0a53bf 100644
--- a/pods/templates/environment-variables.mdx
+++ b/pods/templates/environment-variables.mdx
@@ -3,7 +3,9 @@ title: "Environment variables"
description: "Learn how to use environment variables in Runpod Pods for configuration, security, and automation"
---
-Environment variables in are key-value pairs that you can configure for your Pods. They are accessible within your containerized application and provide a flexible way to pass configuration settings, secrets, and runtime information to your application without hardcoding them into your code or container image.
+import { PodTooltip, PodsTooltip } from "/snippets/tooltips.jsx";
+
+Environment variables are key-value pairs that you can configure for your <PodsTooltip />. They are accessible within your containerized application and provide a flexible way to pass configuration settings, secrets, and runtime information to your application without hardcoding them into your code or container image.
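+
+For example, a script running inside your Pod can read a variable you configured (the variable name here is only illustrative):
+
+```python
+import os
+
+# Read a configured environment variable, with a fallback if it isn't set.
+model_name = os.environ.get("MODEL_NAME", "default-model")
+print(f"Using model: {model_name}")
+```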
## What are environment variables?
diff --git a/pods/templates/manage-templates.mdx b/pods/templates/manage-templates.mdx
index de6e2459..d723f668 100644
--- a/pods/templates/manage-templates.mdx
+++ b/pods/templates/manage-templates.mdx
@@ -3,7 +3,9 @@ title: "Manage Pod templates"
description: "Learn how to create, and manage custom Pod templates."
---
-Creating a custom template allows you to package your specific configuration for reuse and sharing. Templates define all the necessary components to launch a Pod with your desired setup.
+import { PodTooltip, PodEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx";
+
+Creating a custom template allows you to package your specific configuration for reuse and sharing. Templates define all the necessary components to launch a <PodTooltip /> with your desired setup.
## Template configuration options
@@ -102,7 +104,7 @@ For more details, see the [API reference](/api-reference/templates/POST/template
## Using environment variables in templates
-Environment variables provide a flexible way to configure your Pod's runtime behavior without modifying the container image.
+<PodEnvironmentVariablesTooltip /> provide a flexible way to configure your Pod's runtime behavior without modifying the container image.
### Defining environment variables
diff --git a/pods/templates/overview.mdx b/pods/templates/overview.mdx
index d95e57e9..e71f9d19 100644
--- a/pods/templates/overview.mdx
+++ b/pods/templates/overview.mdx
@@ -3,7 +3,9 @@ title: "Overview"
description: "Streamline your Pod deployments with templates, bundling prebuilt container images with hardware specs and network settings."
---
-Pod templates are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together.
+import { PodTooltip, PodsTooltip, PodEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx";
+
+<PodTooltip /> templates are pre-configured Docker image setups that let you quickly spin up Pods without manual environment configuration. They're essentially deployment configurations that include specific models, frameworks, or workflows bundled together.
Templates eliminate the need to manually set up environments, saving time and reducing configuration errors. For example, instead of installing PyTorch, configuring JupyterLab, and setting up all dependencies yourself, you can select a pre-configured template and have everything ready to go instantly.
@@ -23,7 +25,7 @@ Pod templates contain all the necessary components to launch a fully configured
- **Container image:** The Docker image with all necessary software packages and dependencies. This is where the core functionality of the template is stored, i.e., the software package and any files associated with it.
- **Hardware specifications:** Container disk size, volume size, and mount paths that define the storage requirements for your Pod.
- **Network settings:** Exposed ports for services like web UIs or APIs. If the image has a server associated with it, you'll want to ensure that the HTTP and TCP ports are exposed as necessary.
-- **Environment variables:** Pre-configured settings specific to the template that customize the behavior of the containerized application.
+- **<PodEnvironmentVariablesTooltip />:** Pre-configured settings specific to the template that customize the behavior of the containerized application.
- **Startup commands:** Instructions that run when the Pod launches, allowing you to customize the initialization process.
## Types of templates
diff --git a/pods/templates/secrets.mdx b/pods/templates/secrets.mdx
index 3ee206c8..2576dae7 100644
--- a/pods/templates/secrets.mdx
+++ b/pods/templates/secrets.mdx
@@ -3,7 +3,9 @@ title: "Manage secrets"
description: "Securely store and manage sensitive information like API keys, passwords, and tokens with Runpod secrets."
---
-This guide shows how to create, view, edit, delete, and use secrets in your [Pod templates](/pods/templates/overview) to protect sensitive data and improve security.
+import { PodTooltip, PodsTooltip, TemplatesTooltip } from "/snippets/tooltips.jsx";
+
+This guide shows how to create, view, edit, delete, and use secrets in your Pod <TemplatesTooltip /> to protect sensitive data and improve security.
## What are Runpod secrets
diff --git a/references/billing-information.mdx b/references/billing-information.mdx
index 6d7569ee..14d4402d 100644
--- a/references/billing-information.mdx
+++ b/references/billing-information.mdx
@@ -3,6 +3,8 @@ title: "Billing information"
description: "Understand how billing works for Pods, storage, network volumes, refunds, and spending limits."
---
+import { MachineTooltip } from "/snippets/tooltips.jsx";
+
All billing, including per-hour compute and storage billing, is charged per minute.
## How billing works
@@ -19,7 +21,7 @@ You must have at least one hour's worth of runtime in your balance to rent a Pod
Storage billing varies depending on Pod state. Running Pods are charged \$0.10 per GB per month for all storage, while stopped Pods are charged \$0.20 per GB per month for volume storage.
-Storage is charged per minute. You are not charged for storage if the host machine is down or unavailable from the public internet.
+Storage is charged per minute. You are not charged for storage if the host <MachineTooltip /> is down or unavailable from the public internet.
## Network volume billing
diff --git a/references/troubleshooting/pod-migration.mdx b/references/troubleshooting/pod-migration.mdx
index 4a5d4953..499a86f6 100644
--- a/references/troubleshooting/pod-migration.mdx
+++ b/references/troubleshooting/pod-migration.mdx
@@ -4,11 +4,13 @@ description: "Automatically migrate your Pod to a new machine when your GPU is u
tag: "BETA"
---
+import { MachineTooltip } from "/snippets/tooltips.jsx";
+
Pod migration is currently in beta. [Join our Discord](https://discord.gg/runpod) if you'd like to provide feedback.
-When you start a Pod, it's assigned to a specific physical machine with 4-8 GPUs. This creates a link between your Pod and that particular machine. As long as your Pod is running, that GPU is exclusively reserved for you, which ensures stable pricing and prevents your work from being interrupted.
+When you start a Pod, it's assigned to a specific physical <MachineTooltip /> with 4-8 GPUs. This creates a link between your Pod and that particular machine. As long as your Pod is running, that GPU is exclusively reserved for you, which ensures stable pricing and prevents your work from being interrupted.
When you stop a Pod, you release that specific GPU, allowing other users to rent it. If another user rents the GPU while your Pod is stopped, the GPU will be occupied when you try to restart. Because your Pod is still tied to that original machine, you'll see message asking you to migrate your Pod. This doesn't mean there are no GPUs of that type available on Runpod, just that none are available on the specific physical machine where your Pod's data is stored.
diff --git a/references/troubleshooting/zero-gpus.mdx b/references/troubleshooting/zero-gpus.mdx
index 2b3e2c3a..99a4e221 100644
--- a/references/troubleshooting/zero-gpus.mdx
+++ b/references/troubleshooting/zero-gpus.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Zero GPU Pods"
description: "What to do when your Pod machine has zero GPUs."
---
-When you restart a stopped Pod, you might see a message telling you that there are "Zero GPU Pods." This is because there are no GPUs available on the machine where your Pod was running.
+import { MachineTooltip } from "/snippets/tooltips.jsx";
+
+When you restart a stopped Pod, you might see a message telling you that there are "Zero GPU Pods." This is because there are no GPUs available on the <MachineTooltip /> where your Pod was running.
## Why does this happen?
diff --git a/runpodctl/overview.mdx b/runpodctl/overview.mdx
index 1c7e107b..0b64eb41 100644
--- a/runpodctl/overview.mdx
+++ b/runpodctl/overview.mdx
@@ -4,11 +4,13 @@ sidebarTitle: "Overview"
description: "Use Runpod CLI to manage Pods from your local machine."
---
-Runpod CLI is an [open source](https://github.com/runpod/runpodctl) command-line interface tool for managing your Runpod resources remotely from your local machine. You can transfer files and data between your local system and Runpod, execute code on remote Pods, and automate Pod deployment workflows.
+import { PodsTooltip, PodTooltip } from "/snippets/tooltips.jsx";
+
+Runpod CLI is an [open source](https://github.com/runpod/runpodctl) command-line interface tool for managing your Runpod resources remotely from your local machine. You can transfer files and data between your local system and Runpod, execute code on remote <PodsTooltip />, and automate Pod deployment workflows.
## Install Runpod CLI locally
-Every Pod you deploy comes preinstalled with the `runpodctl` command and a Pod-scoped API key. You can also install it on your local machine to manage your Pods remotely.
+Every <PodTooltip /> you deploy comes preinstalled with the `runpodctl` command and a Pod-scoped API key. You can also install it on your local machine to manage your Pods remotely.
To install Runpod CLI locally, follow these steps:
diff --git a/serverless/development/dual-mode-worker.mdx b/serverless/development/dual-mode-worker.mdx
index a2a3718f..a22760d5 100644
--- a/serverless/development/dual-mode-worker.mdx
+++ b/serverless/development/dual-mode-worker.mdx
@@ -37,12 +37,16 @@ cd dual-mode-worker
touch handler.py start.sh Dockerfile requirements.txt
```
-This creates:
+This creates the following project structure:
-- `handler.py`: Your Python script with the Runpod handler logic.
-- `start.sh`: A shell script that will be the entrypoint for your Docker container.
-- `Dockerfile`: Instructions to build your Docker image.
-- `requirements.txt`: A file to list Python dependencies.
+
+- `handler.py`: Your Python script with the Runpod handler logic.
+- `start.sh`: A shell script that will be the entrypoint for your Docker container.
+- `Dockerfile`: Instructions to build your Docker image.
+- `requirements.txt`: A file to list Python dependencies.
## Step 2: Create the handler
@@ -376,7 +380,11 @@ After a few moments for initialization and processing, you should see output sim
## Explore the Pod-first development workflow
-Congratulations! You've successfully built, deployed, and tested a dual-mode Serverless worker. Now, let's explore the recommended iteration process for a Pod-first development workflow:
+
+Congratulations! You've successfully built, deployed, and tested a dual-mode Serverless worker.
+
+
+Now, let's explore the recommended iteration process for a Pod-first development workflow:
diff --git a/serverless/development/optimization.mdx b/serverless/development/optimization.mdx
index ac69817f..a2b8512b 100644
--- a/serverless/development/optimization.mdx
+++ b/serverless/development/optimization.mdx
@@ -4,6 +4,8 @@ sidebarTitle: "Optimization guide"
description: "Implement strategies to reduce latency and cost for your Serverless endpoints."
---
+import { MachineTooltip } from "/snippets/tooltips.jsx";
+
Optimizing your Serverless endpoints involves a cycle of measuring performance with [benchmarking](/serverless/development/benchmarking), identifying bottlenecks, and tuning your [endpoint configurations](/serverless/endpoints/endpoint-configurations). This guide covers specific strategies to reduce startup times and improve throughput.
## Optimization overview
@@ -14,7 +16,7 @@ To ensure high availability during peak traffic, you should select multiple GPU
For latency-sensitive applications, utilizing active workers is the most effective way to eliminate cold starts. You should also configure your [max workers](/serverless/endpoints/endpoint-configurations#max-workers) setting with approximately 20% headroom above your expected concurrency. This buffer ensures that your endpoint can handle sudden load spikes without throttling requests or hitting capacity limits.
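+For example, if you expect around 10 requests to run concurrently at peak, setting max workers to 12 gives you roughly 20% headroom.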
-Your architectural choices also significantly impact performance. Whenever possible, bake your models directly into the Docker image to leverage the high-speed local NVMe storage of the host machine. If you utilize [network volumes](/storage/network-volumes) for larger datasets, remember that this restricts your endpoint to specific data centers, which effectively shrinks your pool of available compute resources.
+Your architectural choices also significantly impact performance. Whenever possible, bake your models directly into the Docker image to leverage the high-speed local NVMe storage of the host <MachineTooltip />. If you utilize [network volumes](/storage/network-volumes) for larger datasets, remember that this restricts your endpoint to specific data centers, which effectively shrinks your pool of available compute resources.
## Reducing worker startup times
diff --git a/serverless/development/overview.mdx b/serverless/development/overview.mdx
index 7dee93c5..7dcc8d18 100644
--- a/serverless/development/overview.mdx
+++ b/serverless/development/overview.mdx
@@ -4,6 +4,8 @@ sidebarTitle: "Overview"
description: "Test, debug, and optimize your Serverless applications."
---
+import { ServerlessEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx";
+
When developing for Runpod Serverless, you'll typically start by writing handler functions, test them locally, and then deploy to production. This guide introduces the development workflow and tools that help you test, debug, and optimize your Serverless applications effectively.
## Development lifecycle
@@ -116,6 +118,6 @@ Learn more in [Logs and monitoring](/serverless/development/logs) and [Connect t
## Environment variables
-Use environment variables to configure your workers without hardcoding credentials or settings in your code. Environment variables are set in the Runpod console and are available to your handler at runtime.
+Use <ServerlessEnvironmentVariablesTooltip /> to configure your workers without hardcoding credentials or settings in your code. Environment variables are set in the Runpod console and are available to your handler at runtime.
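+
+As a minimal sketch, a handler might read a configured variable at runtime (the variable name is illustrative):
+
+```python
+import os
+import runpod
+
+def handler(event):
+    # Read a setting configured on the endpoint in the Runpod console.
+    model_name = os.environ.get("MODEL_NAME", "default-model")
+    return {"model": model_name, "input": event["input"]}
+
+runpod.serverless.start({"handler": handler})
+```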
Learn more in [Environment variables](/serverless/development/environment-variables).
diff --git a/serverless/endpoints/endpoint-configurations.mdx b/serverless/endpoints/endpoint-configurations.mdx
index 96b3b38e..c820b777 100644
--- a/serverless/endpoints/endpoint-configurations.mdx
+++ b/serverless/endpoints/endpoint-configurations.mdx
@@ -5,6 +5,7 @@ description: "Reference guide for all Serverless endpoint settings and parameter
---
import GPUTable from '/snippets/serverless-gpu-pricing-table.mdx';
+import { MachinesTooltip } from "/snippets/tooltips.jsx";
This guide details the configuration options available for Runpod Serverless endpoints. These settings control how your endpoint scales, how it utilizes hardware, and how it manages request lifecycles.
@@ -97,7 +98,7 @@ FlashBoot reduces cold start times by retaining the state of worker resources sh
### Model
-The Model field allows you to select from a list of [cached models](/serverless/endpoints/model-caching). When selected, Runpod schedules your workers on host machines that already have these large model files pre-loaded. This significantly reduces the time required to load models during worker initialization.
+The Model field allows you to select from a list of [cached models](/serverless/endpoints/model-caching). When selected, Runpod schedules your workers on host <MachinesTooltip /> that already have these large model files pre-loaded. This significantly reduces the time required to load models during worker initialization.
## Advanced settings
@@ -111,7 +112,7 @@ You can restrict your endpoint to specific geographical regions. For maximum rel
### CUDA version selection
-This filter ensures your workers are scheduled on host machines with compatible drivers. While you should select the version your code requires, we recommend also selecting all newer versions. CUDA is generally backward compatible, and selecting a wider range of versions increases the pool of available hardware.
+This filter ensures your workers are scheduled on host <MachinesTooltip /> with compatible drivers. While you should select the version your code requires, we recommend also selecting all newer versions. CUDA is generally backward compatible, and selecting a wider range of versions increases the pool of available hardware.
### Expose HTTP/TCP ports
diff --git a/serverless/endpoints/job-states.mdx b/serverless/endpoints/job-states.mdx
index c0c0e499..4e607fa3 100644
--- a/serverless/endpoints/job-states.mdx
+++ b/serverless/endpoints/job-states.mdx
@@ -3,11 +3,13 @@ title: "Job states and metrics"
description: "Monitor your endpoints effectively by understanding job states and key metrics."
---
-Understanding job states and metrics is essential for effectively managing your Serverless endpoints. This documentation covers the different states your jobs can be in and the key metrics available to monitor endpoint performance and health.
+import { JobTooltip, RequestsTooltip, WorkerTooltip } from "/snippets/tooltips.jsx";
+
+Understanding <JobTooltip /> states and metrics is essential for effectively managing your Serverless endpoints. This documentation covers the different states your jobs can be in and the key metrics available to monitor endpoint performance and health.
## Request job states
-Understanding job states helps you track the progress of individual requests and identify where potential issues might occur in your workflow.
+Understanding job states helps you track the progress of individual <RequestsTooltip /> and identify where potential issues might occur in your workflow.
* `IN_QUEUE`: The job is waiting in the endpoint queue for an available worker to process it.
* `RUNNING`: A worker has picked up the job and is actively processing it.
diff --git a/serverless/endpoints/model-caching.mdx b/serverless/endpoints/model-caching.mdx
index 93eb94d7..1eb2b553 100644
--- a/serverless/endpoints/model-caching.mdx
+++ b/serverless/endpoints/model-caching.mdx
@@ -5,11 +5,13 @@ description: "Accelerate worker cold starts and reduce costs by using cached mod
tag: "NEW"
---
+import { MachineTooltip, MachinesTooltip, ColdStartTooltip, WorkersTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
+
For a step-by-step example showing how to integrate cached models with custom workers, see [Deploy a cached model](/tutorials/serverless/model-caching-text).
-Enabling cached models for your workers can reduce [cold start times](/serverless/overview#cold-starts) to just a few seconds and dramatically reduce the cost for loading large models.
+Enabling cached models on your endpoints can reduce <ColdStartTooltip /> times to just a few seconds and dramatically reduce the cost for loading large models.
## Why use cached models?
@@ -17,7 +19,7 @@ Enabling cached models for your workers can reduce [cold start times](/serverles
- **Reduced costs:** You aren't billed for worker time while your model is being downloaded. This is especially impactful for large models that can take several minutes to load.
- **Accelerated deployment:** You can deploy cached models instantly without waiting for external downloads or transfers.
- **Smaller container images:** By decoupling models from your container image, you can create smaller, more focused images that contain only your application logic.
-- **Shared across workers:** Multiple workers running on the same host machine can reference the same cached model, eliminating redundant downloads and saving disk space.
+- **Shared across workers:** Multiple <WorkersTooltip /> running on the same host <MachineTooltip /> can reference the same cached model, eliminating redundant downloads and saving disk space.
## Cached model compatibility
@@ -37,7 +39,7 @@ Cached models aren't suitable if your model is private and not hosted on Hugging
When you select a cached model for your endpoint, Runpod automatically tries to start your workers on hosts that already contain the selected model.
-If no cached host machines are available, the system delays starting your workers until the model is downloaded onto the machine where your workers will run, ensuring you still won't be charged for the download time.
+If no cached host <MachinesTooltip /> are available, the system delays starting your workers until the model is downloaded onto the machine where your workers will run, ensuring you still won't be charged for the download time.
```mermaid
@@ -122,21 +124,28 @@ Cached models are available to your workers at `/runpod-volume/huggingface-cache
While cached models use the same mount path as network volumes (`/runpod-volume/`), the model loaded from the cache will load significantly faster than the same model loaded from a network volume.
-The path structure follows this pattern:
-
-```
-/runpod-volume/huggingface-cache/hub/models--HF_ORGANIZATION--MODEL_NAME/snapshots/VERSION_HASH/
-```
-
-For example, the model `gensyn/qwen2.5-0.5b-instruct` would be stored at:
-
-```
-/runpod-volume/huggingface-cache/hub/models--gensyn--qwen2.5-0.5b-instruct/snapshots/317b7eb96312eda0c431d1dab1af958a308cb35e/
-```
+For example, here is how the model `gensyn/qwen2.5-0.5b-instruct` would be stored:
+
+```
+/runpod-volume/huggingface-cache/hub/models--gensyn--qwen2.5-0.5b-instruct/snapshots/317b7eb96312eda0c431d1dab1af958a308cb35e/
+```
### Programmatically locate cached models
-To dynamically locate cached models without hardcoding paths, you can add this helper function to your [handler file](/serverless/workers/handler-functions) to scan the cache directory for the model you want to use:
+To dynamically locate cached models without hardcoding paths, you can add this helper function to your <HandlerFunctionTooltip /> to scan the cache directory for the model you want to use:
```python handler.py
import os
diff --git a/serverless/endpoints/overview.mdx b/serverless/endpoints/overview.mdx
index 7e624268..72805623 100644
--- a/serverless/endpoints/overview.mdx
+++ b/serverless/endpoints/overview.mdx
@@ -4,6 +4,8 @@ sidebarTitle: "Overview"
description: "Deploy and manage Serverless endpoints using the Runpod console or REST API."
---
+import { QueueBasedEndpointsTooltip, LoadBalancingEndpointsTooltip, ServerlessEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx";
+
Endpoints are the foundation of Runpod Serverless, serving as the gateway for deploying and managing your [Serverless workers](/serverless/workers/overview). They provide a consistent API interface that allows your applications to interact with powerful compute resources on demand.
Endpoints are RESTful APIs that accept [HTTP requests](/serverless/endpoints/send-requests), processing the input using your [handler function](/serverless/workers/handler-functions), and returning the result via HTTP response. Each endpoint provides a unique URL and abstracts away the complexity of managing individual GPUs/CPUs.
diff --git a/serverless/endpoints/send-requests.mdx b/serverless/endpoints/send-requests.mdx
index 8c5f3a3e..f9ee9e33 100644
--- a/serverless/endpoints/send-requests.mdx
+++ b/serverless/endpoints/send-requests.mdx
@@ -4,7 +4,7 @@ sidebarTitle: "Send API requests"
description: "Submit and manage jobs for your queue-based endpoints by sending HTTP requests."
---
-
+import { JobTooltip, JobsTooltip, RequestsTooltip, WorkersTooltip, HandlerFunctionTooltip, QueueBasedEndpointsTooltip, LoadBalancingEndpointTooltip } from "/snippets/tooltips.jsx";
After creating a [Severless endpoint](/serverless/endpoints/overview), you can start sending it HTTP requests (using `cURL` or the Runpod SDK) to submit jobs and retrieve results:
@@ -17,12 +17,10 @@ curl -x POST https://api.runpod.ai/v2/ENDPOINT_ID/run \
This page covers everything from basic input structure and job submission, to advanced options, rate limits, and best practices for queue-based endpoints.
-
-This guide is for **queue-based endpoints**. If you're building a [load balancing endpoint](/serverless/load-balancing/overview), the request structure and endpoints will depend on how you define your HTTP servers.
+This guide is for <QueueBasedEndpointsTooltip />. If you're building a <LoadBalancingEndpointTooltip />, the request structure and endpoints will depend on how you define your HTTP servers.
-
-For faster iteration and debugging of GPU-intensive applications, you can develop on a Pod first before deploying to Serverless. This "Pod-first" workflow gives you direct access to the GPU environment with tools like Jupyter Notebooks and SSH, letting you iterate faster than deploying repeatedly to Serverless. Learn more in [Pod-first development](/serverless/development/dual-mode-worker).
+For faster iteration and debugging of GPU-intensive applications, you can develop on a Pod first before deploying to Serverless. This "Pod-first" workflow gives you direct access to the GPU environment with tools like Jupyter Notebooks and SSH, letting you iterate faster than deploying repeatedly to Serverless. Learn more in [Pod-first development](/serverless/development/dual-mode-worker).
## Rapid deployment options
@@ -237,7 +239,7 @@ vLLM workers may require significant configuration (using environment variables)
**Best for**: Instantly deploying preconfigured AI models.
-You can deploy a Serverless endpoint from a repo in the [Runpod Hub](/hub/overview) in seconds:
+You can deploy a Serverless endpoint from a repo in the [Runpod Hub](/hub/overview) in seconds:
1. Navigate to the [Hub page](https://www.console.runpod.io/hub) in the Runpod console.
2. Browse the collection and select a repo that matches your needs.
@@ -251,7 +253,7 @@ You can deploy a Serverless endpoint from a repo in the [Runpod Hub](/hub/overvi
**Best for**: Deploying and serving pre-configured AI models quickly.
-Runpod maintains a collection of [Public Endpoints](/hub/public-endpoints) that you can use to integrate pre-configured AI models into your applications quickly, without writing your own handler function or deploying workers.
+Runpod maintains a collection of [Public Endpoints](/hub/public-endpoints) that you can use to integrate pre-configured AI models into your applications quickly, without writing your own <HandlerFunctionTooltip /> or deploying workers.
[Browse Public Endpoints →](https://console.runpod.io/hub?tabSelected=public_endpoints)
diff --git a/serverless/quickstart.mdx b/serverless/quickstart.mdx
index fc256acb..c102f504 100644
--- a/serverless/quickstart.mdx
+++ b/serverless/quickstart.mdx
@@ -234,7 +234,9 @@ When the workers finish processing your request, you should see output on the ri
}
```
+
Congratulations! You've successfully deployed and tested your first Serverless endpoint.
+
## Next steps
diff --git a/serverless/storage/overview.mdx b/serverless/storage/overview.mdx
index 3ac5a6b2..0dbe0aa6 100644
--- a/serverless/storage/overview.mdx
+++ b/serverless/storage/overview.mdx
@@ -4,21 +4,23 @@ sidebarTitle: "Storage options"
description: "Explore storage options for your Serverless workers, including container volumes, network volumes, and S3-compatible storage."
---
-This guide explains the different types of storage you can configure for your Serverless workers so they can access and store data when processing requests.
+import { WorkersTooltip, WorkerTooltip, ContainerVolumeTooltip, NetworkVolumeTooltip, HandlerFunctionTooltip, EndpointTooltip, ColdStartTooltip } from "/snippets/tooltips.jsx";
+
+This guide explains the different types of storage you can configure for your Serverless <WorkersTooltip /> so they can access and store data when processing requests.
## Storage types
### Container volume
-A worker's container disk holds temporary storage that exists only while a worker is running, and is completely lost when the worker is stopped or scaled down. It's created automatically when a worker launches and remains tightly coupled with the worker's lifecycle.
+A <WorkerTooltip />'s <ContainerVolumeTooltip /> holds temporary storage that exists only while a worker is running, and is completely lost when the worker is stopped or scaled down. It's created automatically when a worker launches and remains tightly coupled with the worker's lifecycle.
Container volumes provide fast read and write speeds since they are locally attached to workers. The cost of storage is included in the worker's running cost, making it an economical choice for temporary data.
-Any data saved by a worker's handler function will be stored in the container disk by default. To persist data beyond the current worker session, use a network volume or S3-compatible storage.
+Any data saved by a worker's <HandlerFunctionTooltip /> will be stored in the container disk by default. To persist data beyond the current worker session, use a network volume or S3-compatible storage.
### Network volume
-[Network volumes](/storage/network-volumes) provide persistent storage that can be attached to different workers and even shared between multiple workers. Network volumes are ideal for sharing datasets between workers, storing large models that need to be accessed by multiple workers, and preserving data that needs to outlive any individual worker.
+<NetworkVolumeTooltip />s provide persistent storage that can be attached to different workers and even shared between multiple workers. Network volumes are ideal for sharing datasets between workers, storing large models that need to be accessed by multiple workers, and preserving data that needs to outlive any individual worker.
To learn how to attach a network volume to your endpoint, see [Network volumes for Serverless](/storage/network-volumes#network-volumes-for-serverless).
@@ -55,7 +57,7 @@ Each worker has its own local directory and maintains its own data. This means t
### Caching and cold starts
-Serverless workers cache and load their Docker images locally on the container disk, even if a network volume is attached. While this local caching speeds up initial worker startup, loading large models into GPU memory can still significantly impact cold start times.
+Serverless workers cache and load their Docker images locally on the container disk, even if a network volume is attached. While this local caching speeds up initial worker startup, loading large models into GPU memory can still significantly impact <ColdStartTooltip /> times.
For guidance on optimizing storage to reduce cold start times, see [Endpoint configuration](/serverless/endpoints/endpoint-configurations#reducing-worker-startup-times).
diff --git a/serverless/vllm/get-started.mdx b/serverless/vllm/get-started.mdx
index af05ca4c..19a98a84 100644
--- a/serverless/vllm/get-started.mdx
+++ b/serverless/vllm/get-started.mdx
@@ -147,7 +147,9 @@ If you encounter issues with your deployment:
## Next steps
+
Congratulations! You've successfully deployed a vLLM worker on Runpod Serverless. You now have a powerful, scalable LLM inference API that's compatible with both the OpenAI client and Runpod's native API.
+
Next you can try:
diff --git a/serverless/workers/create-dockerfile.mdx b/serverless/workers/create-dockerfile.mdx
index ff3ab197..698cb816 100644
--- a/serverless/workers/create-dockerfile.mdx
+++ b/serverless/workers/create-dockerfile.mdx
@@ -3,22 +3,29 @@ title: "Create a Dockerfile"
description: "Package your handler function for deployment."
---
-A Dockerfile defines the build process for a Docker image containing your handler function and all its dependencies. This page explains how to organize your project files and create a Dockerfile for your Serverless worker.
+import { HandlerFunctionTooltip, CUDATooltip } from "/snippets/tooltips.jsx";
+
+A Dockerfile defines the build process for a Docker image containing your <HandlerFunctionTooltip /> and all its dependencies. This page explains how to organize your project files and create a Dockerfile for your Serverless worker.
## Project organization
Organize your project files in a clear directory structure:
-```
-project_directory
-├── Dockerfile # Instructions for building the Docker image
-├── src
-│ └── handler.py # Your handler function
-└── builder
- └── requirements.txt # Dependencies required by your handler
-```
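For reference, the layout described in the sections below looks like this:

```
project_directory
├── Dockerfile           # Instructions for building the Docker image
├── src
│   └── handler.py       # Your handler function
└── builder
    └── requirements.txt # Dependencies required by your handler
```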
+
+`Dockerfile` contains the instructions for building your worker image.
+
+`src/handler.py` is your <HandlerFunctionTooltip />.
-Your `requirements.txt` file should list all Python packages your handler needs:
+`builder/requirements.txt` lists the Python dependencies required by your handler. For example:
```txt title="requirements.txt"
# Example requirements.txt
@@ -76,9 +83,9 @@ Include more system tools and libraries but are larger:
FROM python:3.11.1
```
-### CUDA images
+### <CUDATooltip /> images
-Required if you need CUDA libraries for GPU-accelerated workloads:
+Required if you need <CUDATooltip /> libraries for GPU-accelerated workloads:
```dockerfile
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
diff --git a/serverless/workers/handler-functions.mdx b/serverless/workers/handler-functions.mdx
index 61795671..052d9653 100644
--- a/serverless/workers/handler-functions.mdx
+++ b/serverless/workers/handler-functions.mdx
@@ -3,12 +3,12 @@ title: "Overview"
description: "Write custom handler functions to process incoming requests to your queue-based endpoints."
---
+import { JobTooltip, RequestsTooltip, WorkersTooltip, QueueBasedEndpointsTooltip, LoadBalancingEndpointTooltip } from "/snippets/tooltips.jsx";
-Handler functions form the core of your Runpod Serverless applications. They define how your workers process [incoming requests](/serverless/endpoints/send-requests) and return results. This section covers everything you need to know about creating effective handler functions.
-
+Handler functions form the core of your Runpod Serverless applications. They define how your workers process <RequestsTooltip /> and return results. This section covers everything you need to know about creating effective handler functions.
-Handler functions are only required for **queue-based endpoints**. If you're building a [load balancing endpoint](/serverless/load-balancing/overview), you can define your own custom API endpoints using any HTTP framework of your choice (like FastAPI or Flask).
+Handler functions are only required for <QueueBasedEndpointsTooltip />. If you're building a <LoadBalancingEndpointTooltip />, you can define your own custom API endpoints using any HTTP framework of your choice (like FastAPI or Flask).
## Understanding job input
@@ -24,7 +24,7 @@ Before writing a handler function, make sure you understand the structure of the
}
```
-`id` is a unique identifier for the job randomly generated by Runpod, while `input` contains data sent by the client for your handler function to process.
+`id` is a unique identifier for the <JobTooltip />, randomly generated by Runpod, while `input` contains data sent by the client for your handler function to process.
To learn how to structure requests to your endpoint, see [Send API requests](/serverless/endpoints/send-requests).
diff --git a/serverless/workers/overview.mdx b/serverless/workers/overview.mdx
index 2ef0f9fa..afd38dc3 100644
--- a/serverless/workers/overview.mdx
+++ b/serverless/workers/overview.mdx
@@ -3,6 +3,8 @@ title: "Overview"
description: "Package your handler function for deployment."
---
+import { ServerlessEnvironmentVariablesTooltip, MachineTooltip } from "/snippets/tooltips.jsx";
+
Workers are the containerized environments that run your code on Runpod Serverless. After creating and testing your [handler function](/serverless/workers/handler-functions), you need to package it into a Docker image and deploy it to an endpoint.
This page provides an overview of the worker deployment process.
@@ -71,7 +73,7 @@ Workers move through different states as they handle requests and respond to cha
* **Initializing**: The worker starts up while the system downloads and prepares the Docker image. The container starts and loads your code.
* **Idle**: The worker is ready but not processing requests. No charges apply while idle.
* **Running**: The worker actively processes requests. Billing occurs per second.
-* **Throttled**: The worker is ready but temporarily unable to run due to host machine resource constraints.
+* **Throttled**: The worker is ready but temporarily unable to run due to host <MachineTooltip /> resource constraints.
* **Outdated**: The system marks the worker for replacement after endpoint updates. It continues processing current jobs during rolling updates (10% of max workers at a time).
* **Unhealthy**: The worker has crashed due to Docker image issues, incorrect start commands, or machine problems. The system automatically retries with exponential backoff for up to 7 days.
diff --git a/snippets/tooltips.jsx b/snippets/tooltips.jsx
new file mode 100644
index 00000000..6b45ab71
--- /dev/null
+++ b/snippets/tooltips.jsx
@@ -0,0 +1,228 @@
+// PODS
+
+export const TemplateTooltip = () => {
+ return (
+ template
+ );
+};
+
+export const TemplatesTooltip = () => {
+ return (
+ templates
+ );
+};
+
+export const PodTooltip = () => {
+ return (
+ Pod
+ );
+};
+
+export const PodsTooltip = () => {
+ return (
+ Pods
+ );
+};
+
+export const GlobalNetworkingTooltip = () => {
+ return (
+ global networking
+ );
+};
+
+// SERVERLESS
+
+export const ServerlessTooltip = () => {
+ return (
+ Serverless
+ );
+};
+
+export const ColdStartTooltip = () => {
+ return (
+ cold start
+ );
+};
+
+export const HandlerFunctionTooltip = () => {
+ return (
+ handler function
+ );
+};
+
+export const RequestTooltip = () => {
+ return (
+ request
+ );
+};
+
+export const RequestsTooltip = () => {
+ return (
+ requests
+ );
+};
+
+export const JobTooltip = () => {
+ return (
+ job
+ );
+};
+
+export const JobsTooltip = () => {
+ return (
+ jobs
+ );
+};
+
+
+export const WorkerTooltip = () => {
+ return (
+ worker
+ );
+};
+
+export const WorkersTooltip = () => {
+ return (
+ workers
+ );
+};
+
+export const EndpointTooltip = () => {
+ return (
+ endpoint
+ );
+};
+
+export const QueueBasedEndpointTooltip = () => {
+ return (
+ queue-based endpoint
+ );
+};
+
+export const QueueBasedEndpointsTooltip = () => {
+ return (
+ queue-based endpoints
+ );
+};
+
+export const LoadBalancingEndpointTooltip = () => {
+ return (
+ load balancing endpoint
+ );
+};
+
+export const LoadBalancingEndpointsTooltip = () => {
+ return (
+ load balancing endpoints
+ );
+};
+
+export const VLLMTooltip = () => {
+ return (
+ vLLM
+ );
+};
+
+export const PyTorchTooltip = () => {
+ return (
+ PyTorch
+ );
+};
+
+export const CUDATooltip = () => {
+ return (
+ CUDA
+ );
+};
+
+export const CachedModelsTooltip = () => {
+ return (
+ cached models
+ );
+};
+
+// STORAGE
+
+export const NetworkVolumeTooltip = () => {
+ return (
+ network volume
+ );
+};
+
+
+export const VolumeDiskTooltip = () => {
+ return (
+ volume disk
+ );
+};
+
+export const ContainerDiskTooltip = () => {
+ return (
+ container disk
+ );
+};
+
+// PRODUCTS
+
+export const RunpodHubTooltip = () => {
+ return (
+ Runpod Hub
+ );
+};
+
+export const PublicEndpointTooltip = () => {
+ return (
+ Public Endpoint
+ );
+};
+
+export const InstantClusterTooltip = () => {
+ return (
+ Instant Cluster
+ );
+};
+
+export const RunpodCLITooltip = () => {
+ return (
+ Runpod CLI
+ );
+};
+
+// CONCEPTS
+
+export const ContainerTooltip = () => {
+ return (
+ container
+ );
+};
+
+export const DataCenterTooltip = () => {
+ return (
+ data center
+ );
+};
+
+export const MachineTooltip = () => {
+ return (
+ machine
+ );
+};
+
+export const MachinesTooltip = () => {
+ return (
+ machines
+ );
+};
+
+export const PodEnvironmentVariablesTooltip = () => {
+ return (
+ environment variables
+ );
+};
+
+export const ServerlessEnvironmentVariablesTooltip = () => {
+ return (
+ environment variables
+ );
+};
\ No newline at end of file
diff --git a/storage/network-volumes.mdx b/storage/network-volumes.mdx
index 2e795082..399b8fcc 100644
--- a/storage/network-volumes.mdx
+++ b/storage/network-volumes.mdx
@@ -3,7 +3,9 @@ title: "Network volumes"
description: "Persistent, portable storage for your AI workloads."
---
-Network volumes offer persistent storage that exists independently of your compute resources. Your data is retained even when your Pods are terminated or your Serverless workers are scaled to zero. You can use them to share data and maintain datasets across multiple machines and [Runpod products](/overview).
+import { PodsTooltip, ServerlessTooltip, WorkersTooltip, WorkerTooltip, EndpointTooltip, HandlerFunctionTooltip, ColdStartTooltip, PodTooltip, InstantClusterTooltip } from "/snippets/tooltips.jsx";
+
+Network volumes offer persistent storage that exists independently of your compute resources. Your data is retained even when your <PodsTooltip /> are terminated or your <ServerlessTooltip /> <WorkersTooltip /> are scaled to zero. You can use them to share data and maintain datasets across multiple machines and [Runpod products](/overview).
Network volumes are backed by high-performance NVMe SSDs connected via high-speed networks. Transfer speeds typically range from 200-400 MB/s, with peak speeds up to 10 GB/s depending on location and network conditions.
@@ -14,7 +16,7 @@ Consider using network volumes when you need:
- **Persistent data that outlives compute resources**: Your data remains accessible even after Pods are terminated or Serverless workers stop.
- **Shareable storage**: Share data across multiple Pods or Serverless endpoints by attaching the same network volume.
- **Portable storage**: Move your working environment and data between different compute resources.
-- **Efficient data management**: Store frequently used models or large datasets to avoid re-downloading them for each new Pod or worker, saving time, bandwidth, and reducing cold start times.
+- **Efficient data management**: Store frequently used models or large datasets to avoid re-downloading them for each new Pod or <WorkerTooltip />, saving time and bandwidth and reducing <ColdStartTooltip /> times.
## Pricing
@@ -94,7 +96,7 @@ To enable workers on an endpoint to use network volumes:
4. Click **Network Volumes** and select one or more network volumes you want to attach to the endpoint.
5. Configure any other fields as needed, then select **Save Endpoint**.
-Data from the attached network volume(s) will be accessible to workers from the `/runpod-volume` directory. Use this path to read and write shared data in your [handler function](/serverless/workers/handler-functions).
+Data from the attached network volume(s) will be accessible to workers from the `/runpod-volume` directory. Use this path to read and write shared data in your <HandlerFunctionTooltip />.
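For example, a handler can read and write files under this mount point like any other directory. A minimal sketch (the file name and input field are illustrative):

```python
import os
import runpod

VOLUME_PATH = "/runpod-volume"  # network volume mount point inside Serverless workers

def handler(job):
    # Append the incoming text to a shared file stored on the network volume.
    text = job["input"].get("text", "")
    with open(os.path.join(VOLUME_PATH, "shared-log.txt"), "a") as f:
        f.write(text + "\n")
    return {"saved": True}

runpod.serverless.start({"handler": handler})
```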
@@ -122,7 +124,7 @@ Data **does not sync** automatically between multiple network volumes even if th
## Network volumes for Pods
-When attached to a Pod, a network volume replaces the Pod's default volume disk and is typically mounted at `/workspace`.
+When attached to a <PodTooltip />, a network volume replaces the Pod's default volume disk and is typically mounted at `/workspace`.
Network volumes are only available for Pods in the Secure Cloud. For more information, see [Pod types](/pods/overview#pod-types).
@@ -150,7 +152,7 @@ You can attach a network volume to multiple Pods, allowing them to share data se
## Network volumes for Instant Clusters
-Network volumes for Instant Clusters work the same way as they do for Pods. They must be attached during cluster creation, and by default are mounted at `/workspace` within each node in the cluster.
+Network volumes for <InstantClusterTooltip />s work the same way as they do for Pods. They must be attached during cluster creation, and by default are mounted at `/workspace` within each node in the cluster.
### Attach to an Instant Cluster
diff --git a/tutorials/migrations/cog/overview.mdx b/tutorials/migrations/cog/overview.mdx
index 33495112..e92596c0 100644
--- a/tutorials/migrations/cog/overview.mdx
+++ b/tutorials/migrations/cog/overview.mdx
@@ -4,21 +4,23 @@ sidebar_label: Cog
description: Migrate your Cog model to Runpod
---
+import { ServerlessTooltip, EndpointTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
+
To get started with Runpod:
* [Create a Runpod account](/get-started/manage-accounts)
* [Add funds](/references/billing-information)
-* [Use the Runpod SDK](/serverless/overview) to build and connect with your Serverless Endpoints
+* [Use the Runpod SDK](/serverless/overview) to build and connect with your <ServerlessTooltip /> <EndpointTooltip />s
-In this tutorial, you'll go through the process of migrating a model deployed via replicate.com or utilizing the Cog framework to a Runpod serverless worker.
+In this tutorial, you'll go through the process of migrating a model deployed on replicate.com, or one built with the Cog framework, to a Runpod Serverless <WorkerTooltip />.
This guide assumes you are operating within a Linux terminal environment and have Docker installed on your system.
-This method might occur a delay when working with Runpod Serverless Endpoints. This delay is due to the FastAPI server that is used to run the Cog model.
+This method might introduce a delay when working with Runpod Serverless endpoints. This delay is due to the FastAPI server that is used to run the Cog model.
-To eliminate this delay, consider using [Runpod Handler](/serverless/workers/handler-functions) functions in a future iteration.
+To eliminate this delay, consider using [Runpod handler functions](/serverless/workers/handler-functions) in a future iteration.
@@ -74,7 +76,7 @@ With your Docker image built and pushed, you're one step closer to deploying you
## Create and Deploy a Serverless Endpoint
-Now that your Docker image is ready, it's time to create and deploy a serverless endpoint on Runpod. This step will enable you to send requests to your new endpoint and use your Cog model in a serverless environment.
+Now that your Docker image is ready, it's time to create and deploy a Serverless <EndpointTooltip /> on Runpod. This step will enable you to send requests to your new endpoint and use your Cog model in a Serverless environment.
To create and deploy a serverless endpoint on Runpod:
@@ -88,7 +90,7 @@ To create and deploy a serverless endpoint on Runpod:
ii. Select a GPU.
- iii. Configure the number of Workers.
+   iii. Configure the number of <WorkerTooltip />s.
iv. (optional) Select **FlashBoot**.
@@ -104,13 +106,17 @@ To create and deploy a serverless endpoint on Runpod:
Now, let's send a request to your [Endpoint](/serverless/endpoints/overview).
-Once your endpoint is set up and deployed, you'll be able to start receiving requests and utilize your Cog model in a serverless context.
+Once your endpoint is set up and deployed, you'll be able to start receiving requests and utilize your Cog model in a Serverless context.
## Conclusion
-Congratulations, you have successfully migrated your Cog model from Replicate to Runpod and set up a serverless endpoint. As you continue to develop your models and applications, consider exploring additional features and capabilities offered by Runpod to further enhance your projects.
+
+Congratulations! You've successfully migrated your Cog model from Replicate to Runpod and set up a Serverless endpoint.
+
+
+As you continue to develop your models and applications, consider exploring additional features and capabilities offered by Runpod to further enhance your projects.
Here are some resources to help you continue your journey:
-* [Learn more about Runpod serverless workers](/serverless/overview)
+* [Learn more about Runpod Serverless workers](/serverless/overview)
* [Explore additional Runpod tutorials and examples](/tutorials/introduction/overview)
diff --git a/tutorials/migrations/openai/overview.mdx b/tutorials/migrations/openai/overview.mdx
index 152a99d3..838ccaba 100644
--- a/tutorials/migrations/openai/overview.mdx
+++ b/tutorials/migrations/openai/overview.mdx
@@ -4,18 +4,20 @@ sidebar_label: OpenAI
description: Migrate your OpenAI model to Runpod
---
+import { ServerlessTooltip, EndpointTooltip, WorkerTooltip, VLLMTooltip } from "/snippets/tooltips.jsx";
+
To get started with Runpod:
* [Create a Runpod account](/get-started/manage-accounts)
* [Add funds](/references/billing-information)
-* [Use the Runpod SDK](/serverless/overview) to build and connect with your Serverless Endpoints
+* [Use the Runpod SDK](/serverless/overview) to build and connect with your <ServerlessTooltip /> <EndpointTooltip />s
-This tutorial guides you through the steps necessary to modify your OpenAI Codebase for use with a deployed vLLM Worker on Runpod. You will learn to adjust your code to be compatible with OpenAI's API, specifically for utilizing Chat Completions, Completions, and Models routes. By the end of this guide, you will have successfully updated your codebase, enabling you to leverage the capabilities of OpenAI's API on Runpod.
+This tutorial guides you through the steps necessary to modify your OpenAI codebase for use with a deployed <VLLMTooltip /> <WorkerTooltip /> on Runpod. You will learn to adjust your code to be compatible with OpenAI's API, specifically for utilizing the Chat Completions, Completions, and Models routes. By the end of this guide, you will have successfully updated your codebase, enabling you to leverage the capabilities of OpenAI's API on Runpod.
To update your codebase, you need to replace the following:
* Your OpenAI API Key with your Runpod API Key
-* Your OpenAI Serverless Endpoint URL with your Runpod Serverless Endpoint URL
+* Your OpenAI Serverless endpoint URL with your Runpod Serverless endpoint URL
* Your OpenAI model with your custom LLM model deployed on Runpod
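Put together, the change usually comes down to the client configuration. A minimal Python sketch (the base URL format and model name here are assumptions; check your endpoint's details in the Runpod console):

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],  # your Runpod API key instead of an OpenAI key
    # Assumed OpenAI-compatible base URL exposed by a Runpod vLLM worker.
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder: the model deployed on your endpoint
    messages=[{"role": "user", "content": "Hello from Runpod!"}],
)
print(response.choices[0].message.content)
```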
@@ -60,10 +62,12 @@ const chatCompletion = await openai.chat.completions.create({
-Congratulations on successfully modifying your OpenAI Codebase for use with your deployed vLLM Worker on Runpod! This tutorial has equipped you with the knowledge to update your code for compatibility with OpenAI's API and to utilize the full spectrum of features available on the Runpod platform.
+
+Congratulations! You've successfully modified your OpenAI codebase for use with your deployed vLLM worker on Runpod. You now know how to update your code for compatibility with OpenAI's API and utilize the full spectrum of features available on the Runpod platform.
+
## Next Steps
* [Explore more tutorials on Runpod](/tutorials/introduction/overview)
* [Learn more about OpenAI's API](https://platform.openai.com/docs/)
-* [Deploy your own vLLM Worker on Runpod](https://www.console.runpod.io/serverless)
+* [Deploy your own vLLM worker on Runpod](https://www.console.runpod.io/serverless)
diff --git a/tutorials/pods/build-docker-images.mdx b/tutorials/pods/build-docker-images.mdx
index 49dc2cd5..7e152c1e 100644
--- a/tutorials/pods/build-docker-images.mdx
+++ b/tutorials/pods/build-docker-images.mdx
@@ -5,13 +5,15 @@ description: "Build and push Docker images from inside a Runpod Pod using Bazel.
tag: "NEW"
---
+import { TemplateTooltip, PodTooltip, NetworkVolumeTooltip } from "/snippets/tooltips.jsx";
+
Runpod Pods use custom Docker images, so you can't directly build Docker containers or use Docker Compose on a GPU Pod. However, you can use [Bazel](https://bazel.build) to build and push Docker images from inside a Pod, effectively creating a "Docker in Docker" workflow.
## What you'll learn
In this tutorial, you'll learn how to:
-- Deploy a Pod for building Docker images.
+- Deploy a <PodTooltip /> for building Docker images.
- Install Bazel and Docker dependencies.
- Configure a Bazel project with [rules_oci](https://github.com/bazel-contrib/rules_oci).
- Build and push a Docker image to Docker Hub.
@@ -28,8 +30,8 @@ Before starting, you'll need:
1. Navigate to [Pods](https://www.console.runpod.io/pods) and select **+ Deploy**.
2. Choose **GPU** or **CPU** based on your needs.
3. Select an instance type (for example, **A40**).
-4. (optional) Attach a [network volume](/storage/network-volumes) for larger image builds.
-5. Select a template (for example, **Runpod Pytorch**).
+4. (optional) Attach a <NetworkVolumeTooltip /> for larger image builds.
+5. Select a <TemplateTooltip /> (for example, **Runpod Pytorch**).
6. Select **Deploy On-Demand**.
Wait for the Pod to start, then connect via the web terminal:
diff --git a/tutorials/pods/comfyui.mdx b/tutorials/pods/comfyui.mdx
index b362fb74..e259aedd 100644
--- a/tutorials/pods/comfyui.mdx
+++ b/tutorials/pods/comfyui.mdx
@@ -4,11 +4,13 @@ sidebarTitle: "Generate images with ComfyUI"
description: "Deploy ComfyUI on Runpod to create AI-generated images."
---
+import { TemplateTooltip, PodTooltip, NetworkVolumeTooltip, RunpodCLITooltip } from "/snippets/tooltips.jsx";
+
This tutorial walks you through how to configure ComfyUI on a [GPU Pod](/pods/overview) and use it to generate images with text-to-image models.
[ComfyUI](https://www.comfy.org/) is a node-based graphical interface for creating AI image generation workflows. Instead of writing code, you connect different components visually to build custom image generation pipelines. This approach provides flexibility to experiment with various models and techniques while maintaining an intuitive interface.
-This tutorial uses the [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo) model and a matching template, but you can adapt these instructions for any model/template combination you want to use.
+This tutorial uses the [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo) model and a matching <TemplateTooltip />, but you can adapt these instructions for any model/template combination you want to use.
When you're just getting started with ComfyUI, it's important to use a workflow that was created for the specific model you intend to use. You usually can't just switch the "Load Checkpoint" node from one model to another and expect optimal performance or results.
@@ -20,7 +22,7 @@ For example, if you load a workflow created for the Flux Dev model and try to us
In this tutorial, you'll learn how to:
-- Deploy a Pod with ComfyUI pre-installed.
+- Deploy a <PodTooltip /> with ComfyUI pre-installed.
- Connect to the ComfyUI web interface.
- Browse pre-configured workflow templates.
- Install new models to your Pod.
@@ -167,7 +169,9 @@ Your workflow is now ready! Follow these steps to generate an image:
+
Congratulations! You've just generated your first image with ComfyUI on Runpod.
+
## Troubleshooting
@@ -190,7 +194,7 @@ Use the template browser from [Step 3](#step-3%3A-load-a-workflow-template) to t
You can also browse the web for a preconfigured workflow and import it by clicking **Workflow** in the top right corner of the ComfyUI interface, selecting **Open**, then selecting the workflow file you want to import.
-Don't forget to install any missing models using the model manager. If you need a model that isn't available in the model manager, you can download it from the web to your local machine, then use the [Runpod CLI](/runpodctl/overview) to transfer the model files directly into your Pod's `/workspace/madapps/ComfyUI/models` directory.
+Don't forget to install any missing models using the model manager. If you need a model that isn't available in the model manager, you can download it from the web to your local machine, then use the <RunpodCLITooltip /> to transfer the model files directly into your Pod's `/workspace/madapps/ComfyUI/models` directory.
### Create custom workflows
@@ -206,4 +210,4 @@ While working with ComfyUI, you can monitor your usage by checking GPU/disk util
Stop your Pod when you're finished to avoid unnecessary charges.
-It's also a good practice to download any custom workflows to your local machine before stopping the Pod. For persistent storage of models and outputs across sessions, consider using a [network volume](/storage/network-volumes).
\ No newline at end of file
+It's also a good practice to download any custom workflows to your local machine before stopping the Pod. For persistent storage of models and outputs across sessions, consider using a <NetworkVolumeTooltip />.
\ No newline at end of file
diff --git a/tutorials/pods/run-ollama.mdx b/tutorials/pods/run-ollama.mdx
index 8d3a6861..d706a374 100644
--- a/tutorials/pods/run-ollama.mdx
+++ b/tutorials/pods/run-ollama.mdx
@@ -5,13 +5,15 @@ description: "Install and run Ollama on a Pod with HTTP API access."
tag: "NEW"
---
-This tutorial shows you how to set up [Ollama](https://ollama.com), a platform for running large language models, on a Runpod GPU Pod. By the end, you'll have Ollama running with HTTP API access for external requests.
+import { TemplateTooltip, PodTooltip, PodEnvironmentVariablesTooltip } from "/snippets/tooltips.jsx";
+
+This tutorial shows you how to set up [Ollama](https://ollama.com), a platform for running large language models, on a Runpod GPU <PodTooltip />. By the end, you'll have Ollama running with HTTP API access for external requests.
## What you'll learn
In this tutorial, you'll learn how to:
-- Deploy a Pod with the PyTorch template.
+- Deploy a Pod with the PyTorch <TemplateTooltip />.
- Install and configure Ollama for external access.
- Run AI models and interact via the HTTP API.
@@ -26,7 +28,7 @@ In this tutorial, you'll learn how to:
3. Select the latest **PyTorch** template.
4. Under **Pod Template**, select **Edit**:
- Under **Expose HTTP Ports (Max 10)**, add port `11434`.
- - Under **Environment Variables**, add an environment variable with key `OLLAMA_HOST` and value `0.0.0.0`.
+   - Under **<PodEnvironmentVariablesTooltip />**, add a variable with key `OLLAMA_HOST` and value `0.0.0.0`.
5. Click **Set Overrides** and then **Deploy On-Demand**.
## Step 2: Install Ollama
@@ -83,7 +85,9 @@ curl -X POST https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/generate -d '{
}'
```
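If you prefer Python over `cURL`, the same request can be sent with the `requests` library. A minimal sketch (replace `OLLAMA_POD_ID` with your Pod's ID and use a model you've already pulled):

```python
import requests

POD_ID = "OLLAMA_POD_ID"  # placeholder: your Pod ID

# Call Ollama's generate API through the Runpod HTTP proxy on port 11434.
response = requests.post(
    f"https://{POD_ID}-11434.proxy.runpod.net/api/generate",
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
)
print(response.json()["response"])
```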
+
Congratulations! You've set up Ollama on a Runpod Pod and made HTTP API requests to it.
+
For more API options, see the [Ollama API documentation](https://github.com/ollama/ollama/blob/main/docs/api.md).
diff --git a/tutorials/pods/run-your-first.mdx b/tutorials/pods/run-your-first.mdx
index 49ee8b69..52d8d98a 100644
--- a/tutorials/pods/run-your-first.mdx
+++ b/tutorials/pods/run-your-first.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Run LLMs with JupyterLab"
description: "Learn how to run inference on the SmolLM3 model in JupyterLab using the transformers library."
---
-This tutorial shows how to deploy a Pod and use JupyterLab to generate text with the SmolLM3 model using the Python `transformers` library.
+import { TemplateTooltip, PodTooltip, NetworkVolumeTooltip } from "/snippets/tooltips.jsx";
+
+This tutorial shows how to deploy a <PodTooltip /> and use JupyterLab to generate text with the SmolLM3 model using the Python `transformers` library.
[SmolLM3](https://huggingface.co/docs/transformers/en/model_doc/smollm3) is a family of small language models developed by Hugging Face that provides strong performance while being efficient enough to run on modest hardware.
The 3B parameter model we'll use in this tutorial requires only 24 GB of VRAM, making it accessible for experimentation and development.
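As a preview of where this tutorial ends up, text generation with `transformers` comes down to a few lines. A minimal sketch (the model ID `HuggingFaceTB/SmolLM3-3B` is an assumption; check the model card for the exact name):

```python
from transformers import pipeline

# Load the model onto available GPUs; device_map="auto" requires the accelerate library.
generator = pipeline(
    "text-generation",
    model="HuggingFaceTB/SmolLM3-3B",  # assumed model ID
    device_map="auto",
)

output = generator("Explain what a GPU Pod is in one sentence.", max_new_tokens=64)
print(output[0]["generated_text"])
```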
@@ -13,7 +15,7 @@ The 3B parameter model we'll use in this tutorial requires only 24 GB of VRAM, m
In this tutorial, you'll learn how to:
-- Deploy a Pod with the PyTorch template.
+- Deploy a Pod with the PyTorch <TemplateTooltip />.
- Access the web terminal and JupyterLab services.
- Install the transformers and accelerate libraries.
- Use SmolLM3 for text generation in a Python notebook.
@@ -180,4 +182,4 @@ Now that you have SmolLM3 running, you can explore more advanced use cases:
- **Integration with applications**: Use SmolLM3 as part of larger applications by integrating it with web frameworks or APIs.
- **Model comparison**: Try other models in the SmolLM3 family or compare with other small language models to find the best fit for your use case.
-- **Persistent storage**: If you plan to work with SmolLM3 regularly, consider using a [network volume](/storage/network-volumes) to persist your models and notebooks across Pod sessions.
\ No newline at end of file
+- **Persistent storage**: If you plan to work with SmolLM3 regularly, consider using a <NetworkVolumeTooltip /> to persist your models and notebooks across Pod sessions.
\ No newline at end of file
diff --git a/tutorials/sdks/python/101/aggregate.mdx b/tutorials/sdks/python/101/aggregate.mdx
index 25fe33b2..9c693cbc 100644
--- a/tutorials/sdks/python/101/aggregate.mdx
+++ b/tutorials/sdks/python/101/aggregate.mdx
@@ -3,7 +3,9 @@ title: "Aggregating outputs in Runpod serverless functions"
sidebarTitle: "Aggregating outputs"
---
-This tutorial will guide you through using the `return_aggregate_stream` feature in Runpod to simplify result handling in your serverless functions. Using `return_aggregate_stream` allows you to automatically collect and aggregate all yielded results from a generator handler into a single response. This simplifies result handling, making it easier to manage and return a consolidated set of results from asynchronous tasks, such as concurrent sentiment analysis or object detection, without needing additional code to collect and format the results manually.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
+
+This tutorial will guide you through using the `return_aggregate_stream` feature in Runpod to simplify result handling in your <ServerlessTooltip /> functions. Using `return_aggregate_stream` allows you to automatically collect and aggregate all yielded results from a generator <HandlerFunctionTooltip /> into a single response. This simplifies result handling, making it easier to manage and return a consolidated set of results from asynchronous tasks, such as concurrent sentiment analysis or object detection, without needing additional code to collect and format the results manually.
We'll create a multi-purpose analyzer that can perform sentiment analysis on text and object detection in images, demonstrating how to aggregate outputs efficiently.
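At its core, the pattern is a generator handler started with `return_aggregate_stream` enabled. A minimal sketch (the handler here simply echoes each input item to show how yielded values are collected):

```python
import runpod

def handler(job):
    # Yield one result per input item; with return_aggregate_stream enabled,
    # the SDK gathers every yielded value into a single list for the response.
    for item in job["input"].get("items", []):
        yield {"item": item, "processed": True}

runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```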
@@ -47,7 +49,7 @@ These functions:
1. Simulate sentiment analysis, returning a random sentiment and score
2. Simulate object detection, returning a list of detected objects with confidence scores
-### Create the main Handler Function
+### Create the main <HandlerFunctionTooltip />
Now, let's create the main handler function that processes jobs and yields results:
@@ -81,9 +83,9 @@ This handler:
3. Yields results incrementally
4. Returns the complete list of results
-### Set up the Serverless Function starter
+### Set up the Serverless function starter
-Create a function to start the serverless handler with proper configuration:
+Create a function to start the Serverless handler with proper configuration:
```python
def start_handler():
@@ -221,13 +223,13 @@ INFO | Local testing complete, exiting.
This output demonstrates:
-1. The serverless worker starting and processing the job
+1. The Serverless <WorkerTooltip /> starting and processing the job
2. The handler generating results for each input item
3. The aggregation of results into a single list
## Conclusion
-You've now created a serverless function using Runpod's Python SDK that demonstrates efficient output aggregation for both local testing and production environments. This approach simplifies result handling and ensures consistent behavior across different execution contexts.
+You've now created a Serverless function using Runpod's Python SDK that demonstrates efficient output aggregation for both local testing and production environments. This approach simplifies result handling and ensures consistent behavior across different execution contexts.
To further enhance this application, consider:
@@ -235,4 +237,4 @@ To further enhance this application, consider:
* Adding error handling and logging for each processing step
* Exploring Runpod's advanced features for handling larger datasets or parallel processing
-Runpod's serverless library, with features like `return_aggregate_stream`, provides a powerful foundation for building scalable, efficient applications that can process and aggregate data seamlessly.
+Runpod's Serverless library, with features like `return_aggregate_stream`, provides a powerful foundation for building scalable, efficient applications that can process and aggregate data seamlessly.
diff --git a/tutorials/sdks/python/101/async.mdx b/tutorials/sdks/python/101/async.mdx
index c95b1393..e5fafe44 100644
--- a/tutorials/sdks/python/101/async.mdx
+++ b/tutorials/sdks/python/101/async.mdx
@@ -3,11 +3,13 @@ title: "Building an async generator handler for weather data simulation"
sidebarTitle: "Async generator"
---
-This tutorial will guide you through creating a serverless function using Runpod's Python SDK that simulates fetching weather data for multiple cities concurrently.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
-Use asynchronous functions to handle multiple concurrent operations efficiently, especially when dealing with tasks that involve waiting for external resources, such as network requests or I/O operations. Asynchronous programming allows your code to perform other tasks while waiting, rather than blocking the entire program. This is particularly useful in a serverless environment where you want to maximize resource utilization and minimize response times.
+This tutorial will guide you through creating a <ServerlessTooltip /> function using Runpod's Python SDK that simulates fetching weather data for multiple cities concurrently.
-We'll use an async generator handler to stream results incrementally, demonstrating how to manage multiple concurrent operations efficiently in a serverless environment.
+Use asynchronous functions to handle multiple concurrent operations efficiently, especially when dealing with tasks that involve waiting for external resources, such as network requests or I/O operations. Asynchronous programming allows your code to perform other tasks while waiting, rather than blocking the entire program. This is particularly useful in a Serverless environment where you want to maximize resource utilization and minimize response times.
+
+We'll use an async generator <HandlerFunctionTooltip /> to stream results incrementally, demonstrating how to manage multiple concurrent operations efficiently in a Serverless environment.
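The core idea is an `async` generator handler that yields each result as soon as it's ready. A minimal sketch (the city list and sleep are stand-ins for real network calls):

```python
import asyncio
import runpod

async def handler(job):
    # Simulate fetching weather data for each city and stream results as they complete.
    for city in job["input"].get("cities", ["London", "Tokyo"]):
        await asyncio.sleep(0.1)  # stand-in for a real API request
        yield {"city": city, "temperature_c": 21}

runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```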
## Setting up your Serverless Function
@@ -104,7 +106,7 @@ if __name__ == "__main__":
})
```
-This block allows for both local testing and deployment as a Runpod serverless function.
+This block allows for both local testing and deployment as a Runpod Serverless function.
## Complete code example
@@ -213,7 +215,7 @@ This output demonstrates:
## Conclusion
-You've now created a serverless function using Runpod's Python SDK that simulates concurrent weather data fetching for multiple cities. This example showcases how to handle multiple asynchronous operations and stream results incrementally in a serverless environment.
+You've now created a Serverless function using Runpod's Python SDK that simulates concurrent weather data fetching for multiple cities. This example showcases how to handle multiple asynchronous operations and stream results incrementally in a Serverless environment.
To further enhance this application, consider:
@@ -221,4 +223,4 @@ To further enhance this application, consider:
* Adding error handling for network failures or API limits
* Exploring Runpod's documentation for advanced features like scaling for high-concurrency scenarios
-Runpod's serverless library provides a powerful foundation for building scalable, efficient applications that can process and stream data concurrently in real-time without the need to manage infrastructure.
+Runpod's Serverless library provides a powerful foundation for building scalable, efficient applications that can process and stream data concurrently in real-time without the need to manage infrastructure.
diff --git a/tutorials/sdks/python/101/error.mdx b/tutorials/sdks/python/101/error.mdx
index 96d79f36..20ec54ee 100644
--- a/tutorials/sdks/python/101/error.mdx
+++ b/tutorials/sdks/python/101/error.mdx
@@ -3,11 +3,13 @@ title: "Implementing error handling and logging in Runpod serverless functions"
sidebarTitle: "Error handling"
---
-This tutorial will guide you through implementing effective error handling and logging in your Runpod serverless functions.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
-Proper error handling ensures that your serverless functions can handle unexpected situations gracefully. This prevents crashes and ensures that your application can continue running smoothly, even if some parts encounter issues.
+This tutorial will guide you through implementing effective error handling and logging in your Runpod <ServerlessTooltip /> functions.
-We'll create a simulated image classification model to demonstrate these crucial practices, ensuring your serverless deployments are robust and maintainable.
+Proper error handling ensures that your Serverless functions can handle unexpected situations gracefully. This prevents crashes and ensures that your application can continue running smoothly, even if some parts encounter issues.
+
+We'll create a simulated image classification model to demonstrate these crucial practices, ensuring your Serverless deployments are robust and maintainable.
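The basic shape is a handler that wraps its work in `try`/`except` and reports failures in the response instead of crashing. A minimal sketch using the SDK's logger (this assumes `RunPodLogger` from the `runpod` package; Python's standard `logging` module works too):

```python
import runpod
from runpod import RunPodLogger

log = RunPodLogger()

def handler(job):
    try:
        image_url = job["input"]["image_url"]  # raises KeyError if the field is missing
        log.info(f"Classifying image: {image_url}")
        # ... run the (simulated) classification here ...
        return {"label": "cat", "confidence": 0.98}
    except KeyError as err:
        log.error(f"Missing required input field: {err}")
        return {"error": f"Missing required input field: {err}"}

runpod.serverless.start({"handler": handler})
```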
## Setting up your Serverless Function
@@ -59,7 +61,7 @@ These functions:
2. Preprocess images, with debug logging
3. Classify images, returning random results for demonstration
-### Create the Main Handler Function
+### Create the main <HandlerFunctionTooltip />
Now, let's create the main handler function with error handling and logging:
@@ -127,9 +129,9 @@ This handler:
3. Simulates image classification with progress logging
4. Returns results or an error message based on the execution
-### Start the Serverless Function
+### Start the Serverless function
-Finally, start the Runpod serverless function:
+Finally, start the Runpod Serverless function:
```python
runpod.serverless.start({"handler": handler})
@@ -283,7 +285,7 @@ This output demonstrates:
## Conclusion
-You've now created a serverless function using Runpod's Python SDK that demonstrates effective error handling and logging practices. This approach ensures that your serverless functions are robust, maintainable, and easier to debug.
+You've now created a Serverless function using Runpod's Python SDK that demonstrates effective error handling and logging practices. This approach ensures that your Serverless functions are robust, maintainable, and easier to debug.
To further enhance this application, consider:
@@ -291,4 +293,4 @@ To further enhance this application, consider:
* Adding more detailed logging for each step of the process
* Exploring Runpod's advanced logging features and integrations
-Runpod's serverless library provides a powerful foundation for building reliable, scalable applications with comprehensive error management and logging capabilities.
+Runpod's Serverless library provides a powerful foundation for building reliable, scalable applications with comprehensive error management and logging capabilities.
diff --git a/tutorials/sdks/python/101/generator.mdx b/tutorials/sdks/python/101/generator.mdx
index 1b2deb6e..dae61319 100644
--- a/tutorials/sdks/python/101/generator.mdx
+++ b/tutorials/sdks/python/101/generator.mdx
@@ -3,9 +3,11 @@ title: "Building a streaming handler for text to speech simulation"
sidebarTitle: "Streaming handler"
---
-This tutorial will guide you through creating a serverless function using Runpod's Python SDK that simulates a text-to-speech (TTS) process. We'll use a streaming handler to stream results incrementally, demonstrating how to handle long-running tasks efficiently in a serverless environment.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
-A streaming handler in the Runpod's Python SDK is a special type of function that allows you to iterate over a sequence of values lazily. Instead of returning a single value and exiting, a streaming handler yields multiple values, one at a time, pausing the function's state between each yield. This is particularly useful for handling large data streams or long-running tasks, as it allows the function to produce and return results incrementally, rather than waiting until the entire process is complete.
+This tutorial will guide you through creating a <ServerlessTooltip /> function using Runpod's Python SDK that simulates a text-to-speech (TTS) process. We'll use a streaming handler to stream results incrementally, demonstrating how to handle long-running tasks efficiently in a Serverless environment.
+
+A streaming <HandlerFunctionTooltip /> in Runpod's Python SDK is a special type of function that allows you to iterate over a sequence of values lazily. Instead of returning a single value and exiting, a streaming handler yields multiple values, one at a time, pausing the function's state between each yield. This is particularly useful for handling large data streams or long-running tasks, as it allows the function to produce and return results incrementally, rather than waiting until the entire process is complete.
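Concretely, a streaming handler is just a generator function. A minimal sketch (chunking the input text word by word stands in for real TTS output):

```python
import runpod

def streaming_handler(job):
    text = job["input"].get("text", "")
    # Yield one simulated "audio chunk" per word to stream incremental TTS output.
    for index, word in enumerate(text.split()):
        yield {"chunk_index": index, "audio_chunk": f"<audio for '{word}'>"}

runpod.serverless.start({"handler": streaming_handler, "return_aggregate_stream": True})
```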
## Setting up your Serverless Function
@@ -84,7 +86,7 @@ if __name__ == "__main__":
runpod.serverless.start({"handler": streaming_handler, "return_aggregate_stream": True})
```
-This block allows for both local testing and deployment as a Runpod serverless function.
+This block allows for both local testing and deployment as a Runpod Serverless function.
## Complete code example
@@ -177,7 +179,7 @@ This output demonstrates:
## Conclusion
-You've now created a serverless function using Runpod's Python SDK that simulates a streaming text-to-speech process. This example showcases how to handle long-running tasks and stream results incrementally in a serverless environment.
+You've now created a Serverless function using Runpod's Python SDK that simulates a streaming text-to-speech process. This example showcases how to handle long-running tasks and stream results incrementally in a Serverless environment.
To further enhance this application, consider:
@@ -185,4 +187,4 @@ To further enhance this application, consider:
* Adding error handling for various input types
* Exploring Runpod's documentation for advanced features like GPU acceleration for audio processing
-Runpod's serverless library provides a powerful foundation for building scalable, efficient applications that can process and stream data in real-time without the need to manage infrastructure.
+Runpod's Serverless library provides a powerful foundation for building scalable, efficient applications that can process and stream data in real-time without the need to manage infrastructure.
diff --git a/tutorials/sdks/python/101/hello.mdx b/tutorials/sdks/python/101/hello.mdx
index 10f16a1c..766f93ca 100644
--- a/tutorials/sdks/python/101/hello.mdx
+++ b/tutorials/sdks/python/101/hello.mdx
@@ -3,7 +3,9 @@ title: "Create a basic Serverless function"
sidebarTitle: "Create a basic Serverless function"
---
-Runpod's serverless library enables you to create and deploy scalable functions without managing infrastructure. This tutorial will walk you through creating a simple serverless function that determines whether a number is even.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
+
+Runpod's <ServerlessTooltip /> library enables you to create and deploy scalable functions without managing infrastructure. This tutorial will walk you through creating a simple Serverless function that determines whether a number is even.
## Creating a Basic Serverless Function
@@ -41,7 +43,7 @@ This function:
3. Returns an error message if it's not an integer
4. Determines if the number is even and returns the result
-### Start the Serverless function
+### Start the Serverless function
Wrap your function with `runpod.serverless.start()`:
@@ -49,7 +51,7 @@ Wrap your function with `runpod.serverless.start()`:
runpod.serverless.start({"handler": is_even})
```
-This line initializes the serverless function with your specified handler.
+This line initializes the Serverless function with your specified <HandlerFunctionTooltip />.
## Complete code example
@@ -103,7 +105,7 @@ This output indicates that:
## Conclusion
-You've now created a basic serverless function using Runpod's Python SDK. This approach allows for efficient, scalable deployment of functions without the need to manage infrastructure.
+You've now created a basic Serverless function using Runpod's Python SDK. This approach allows for efficient, scalable deployment of functions without the need to manage infrastructure.
To further explore Runpod's serverless capabilities, consider:
@@ -111,4 +113,4 @@ To further explore Runpod's serverless capabilities, consider:
* Implementing error handling and input validation
* Exploring Runpod's documentation for advanced features and best practices
-Runpod's serverless library provides a powerful tool for a wide range of applications, from simple utilities to complex data processing tasks.
+Runpod's Serverless library provides a powerful tool for a wide range of applications, from simple utilities to complex data processing tasks.
diff --git a/tutorials/sdks/python/101/local-server-testing.mdx b/tutorials/sdks/python/101/local-server-testing.mdx
index 58950361..fcf0ac1c 100644
--- a/tutorials/sdks/python/101/local-server-testing.mdx
+++ b/tutorials/sdks/python/101/local-server-testing.mdx
@@ -3,7 +3,9 @@ title: "Creating and testing a Runpod serverless function with local server"
sidebarTitle: "Local server testing"
---
-This tutorial will guide you through creating a basic serverless function using Runpod's Python SDK. We'll build a function that reverses a given string, demonstrating the simplicity and flexibility of Runpod's serverless architecture.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
+
+This tutorial will guide you through creating a basic <ServerlessTooltip /> function using Runpod's Python SDK. We'll build a function that reverses a given string, demonstrating the simplicity and flexibility of Runpod's Serverless architecture.
## Setting up your Serverless Function
@@ -28,9 +30,9 @@ def reverse_string(s):
This function uses Python's slicing feature to efficiently reverse the input string.
-### Create the Handler Function
+### Create the <HandlerFunctionTooltip />
-The handler function is the core of our serverless application:
+The handler function is the core of our Serverless application:
```python
def handler(job):
@@ -57,15 +59,15 @@ This handler:
4. Reverses the string using our utility function
5. Prepares and returns the output
-### Start the Serverless Function
+### Start the Serverless function
-Finally, start the Runpod serverless worker:
+Finally, start the Runpod Serverless worker:
```python
runpod.serverless.start({"handler": handler})
```
-This line registers our handler function with Runpod's serverless infrastructure.
+This line registers our handler function with Runpod's Serverless infrastructure.
## Complete code example
@@ -100,7 +102,7 @@ runpod.serverless.start({"handler": handler})
## Testing Your Serverless Function
-Runpod provides multiple ways to test your serverless function locally before deployment. We'll explore two methods: using command-line arguments and running a local test server.
+Runpod provides multiple ways to test your Serverless function locally before deployment. We'll explore two methods: using command-line arguments and running a local test server.
### Method 1: Command-line Testing
@@ -125,11 +127,11 @@ INFO | Job result: {'output': {'original_text': 'Hello, Runpod!', 'reversed_te
INFO | Local testing complete, exiting.
```
-This output shows the serverless worker starting, processing the job, and returning the result.
+This output shows the Serverless <WorkerTooltip /> starting, processing the job, and returning the result.
### Method 2: Local Test Server
-For more comprehensive testing, especially when you want to simulate HTTP requests to your serverless function, you can launch a local test server. This server provides an endpoint that you can send requests to, mimicking the behavior of a deployed serverless function.
+For more comprehensive testing, especially when you want to simulate HTTP requests to your Serverless function, you can launch a local test server. This server provides an endpoint that you can send requests to, mimicking the behavior of a deployed Serverless function.
To start the local test server, use the `--rp_serve_api` flag:
@@ -167,11 +169,11 @@ DEBUG | local_test | run_job return: {'output': {'original_text': 'Hello, Run
INFO | Job local_test completed successfully.
```
-This output provides detailed information about how your function processes the request, which can be invaluable for debugging and optimizing your serverless function.
+This output provides detailed information about how your function processes the request, which can be invaluable for debugging and optimizing your Serverless function.
## Conclusion
-You've now created a basic serverless function using Runpod's Python SDK that reverses input strings and learned how to test it using both command-line arguments and a local test server. This example demonstrates how easy it is to deploy and validate simple text processing tasks as serverless functions.
+You've now created a basic Serverless function using Runpod's Python SDK that reverses input strings and learned how to test it using both command-line arguments and a local test server. This example demonstrates how easy it is to deploy and validate simple text processing tasks as Serverless functions.
To further explore Runpod's serverless capabilities, consider:
@@ -181,4 +183,4 @@ To further explore Runpod's serverless capabilities, consider:
* Using the local server to integrate your function with other parts of your application during development
* Exploring Runpod's documentation for advanced features like concurrent processing or GPU acceleration
-Runpod's serverless library provides a powerful foundation for building scalable, efficient text processing applications without the need to manage infrastructure.
+Runpod's Serverless library provides a powerful foundation for building scalable, efficient text processing applications without the need to manage infrastructure.
diff --git a/tutorials/sdks/python/102/huggingface-models.mdx b/tutorials/sdks/python/102/huggingface-models.mdx
index 0e206c78..e1141369 100644
--- a/tutorials/sdks/python/102/huggingface-models.mdx
+++ b/tutorials/sdks/python/102/huggingface-models.mdx
@@ -3,7 +3,9 @@ title: "Using Hugging Face models with Runpod"
sidebarTitle: "Hugging Face models"
---
-Artificial Intelligence (AI) has revolutionized how applications analyze and interact with data. One powerful aspect of AI is sentiment analysis, which allows machines to interpret and categorize emotions expressed in text. In this tutorial, you will learn how to integrate pre-trained Hugging Face models into your Runpod Serverless applications to perform sentiment analysis. By the end of this guide, you will have a fully functional AI-powered sentiment analysis function running in a serverless environment.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
+
+Artificial Intelligence (AI) has revolutionized how applications analyze and interact with data. One powerful aspect of AI is sentiment analysis, which allows machines to interpret and categorize emotions expressed in text. In this tutorial, you will learn how to integrate pre-trained Hugging Face models into your Runpod <ServerlessTooltip /> applications to perform sentiment analysis. By the end of this guide, you will have a fully functional AI-powered sentiment analysis function running in a Serverless environment.
### Install Required Libraries
@@ -26,11 +28,11 @@ import runpod
from transformers import pipeline
```
-These imports bring in the `runpod` SDK for serverless functions and the `pipeline` method from `transformers`, which allows us to use pre-trained models.
+These imports bring in the `runpod` SDK for Serverless functions and the `pipeline` method from `transformers`, which allows us to use pre-trained models.
### Load the Model
-Loading the model in a function ensures that the model is only loaded once when the worker starts, optimizing the performance of our application. Add the following code to your `sentiment_analysis.py` file:
+Loading the model in a function ensures that the model is only loaded once when the <WorkerTooltip /> starts, optimizing the performance of our application. Add the following code to your `sentiment_analysis.py` file:
```python sentiment_analysis.py
def load_model():
@@ -41,7 +43,7 @@ def load_model():
In this function, we use the `pipeline` method from `transformers` to load a pre-trained sentiment analysis model. The `distilbert-base-uncased-finetuned-sst-2-english` model is a distilled version of BERT fine-tuned for sentiment analysis tasks.
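+
+If you want to see what the pipeline returns on its own, a quick standalone check (run outside the worker; the exact score will vary between runs) looks like this:
+
+```python
+from transformers import pipeline
+
+# Load the same model used by load_model() and classify a sample sentence.
+classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
+print(classifier("I love building on Runpod!"))
+# Example output: [{'label': 'POSITIVE', 'score': 0.9998}]
+```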
-### Define the Handler Function
+### Define the <HandlerFunctionTooltip />
We will now define the handler function that will process incoming events and use the model for sentiment analysis. Add the following code to your script:
@@ -74,15 +76,15 @@ This function performs the following steps:
4. Uses the loaded model to perform sentiment analysis.
5. Returns the sentiment label and score as a dictionary.
-### Start the Serverless Worker
+### Start the Serverless <WorkerTooltip />
-To run our sentiment analysis function as a serverless worker, we need to start the worker using Runpod's SDK. Add the following line at the end of your `sentiment_analysis.py` file:
+To run our sentiment analysis function as a Serverless <WorkerTooltip />, we need to start the worker using Runpod's SDK. Add the following line at the end of your `sentiment_analysis.py` file:
```python sentiment_analysis.py
runpod.serverless.start({"handler": sentiment_analysis_handler})
```
-This command starts the serverless worker and specifies `sentiment_analysis_handler` as the handler function for incoming requests.
+This command starts the Serverless worker and specifies `sentiment_analysis_handler` as the <HandlerFunctionTooltip /> for incoming requests.
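+
+You can exercise the handler locally by passing a test payload on the command line. The `--test_input` flag is the same one used in the earlier local-testing tutorial; the `text` key below assumes the handler reads `job["input"]["text"]`:
+
+```sh
+python sentiment_analysis.py --test_input '{"input": {"text": "Runpod makes Serverless easy!"}}'
+```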
### Complete Code
@@ -156,9 +158,9 @@ INFO | Local testing complete, exiting.
## Conclusion
-In this tutorial, you learned how to integrate a pre-trained Hugging Face model into a Runpod serverless function to perform sentiment analysis on text input.
+In this tutorial, you learned how to integrate a pre-trained Hugging Face model into a Runpod Serverless function to perform sentiment analysis on text input.
-This powerful combination enables you to create advanced AI applications in a serverless environment.
+This powerful combination enables you to create advanced AI applications in a Serverless environment.
You can extend this concept to use more complex models or perform different types of inference tasks as needed.
diff --git a/tutorials/sdks/python/102/stable-diffusion-text-to-image.mdx b/tutorials/sdks/python/102/stable-diffusion-text-to-image.mdx
index 7500e050..eb9fb209 100644
--- a/tutorials/sdks/python/102/stable-diffusion-text-to-image.mdx
+++ b/tutorials/sdks/python/102/stable-diffusion-text-to-image.mdx
@@ -3,9 +3,11 @@ title: "Text To Image Generation with Stable Diffusion on Runpod"
sidebarTitle: "Stable Diffusion text to image"
---
-Text-to-image generation using advanced AI models offers a unique way to bring textual descriptions to life as images. Stable Diffusion is a powerful model capable of generating high-quality images from text inputs, and Runpod is a serverless computing platform that can manage resource-intensive tasks effectively. This tutorial will guide you through setting up a serverless application that utilizes Stable Diffusion for generating images from text prompts on Runpod.
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
-By the end of this guide, you will have a fully functional text-to-image generation system deployed on a Runpod serverless environment.
+Text-to-image generation using advanced AI models offers a unique way to bring textual descriptions to life as images. Stable Diffusion is a powerful model capable of generating high-quality images from text inputs, and Runpod is a <ServerlessTooltip /> computing platform that can manage resource-intensive tasks effectively. This tutorial will guide you through setting up a Serverless application that utilizes Stable Diffusion for generating images from text prompts on Runpod.
+
+By the end of this guide, you will have a fully functional text-to-image generation system deployed on a Runpod Serverless environment.
## Prerequisites
@@ -29,7 +31,7 @@ import base64
Here’s a breakdown of the imports:
-* `runpod`: The SDK used to interact with Runpod's serverless environment.
+* `runpod`: The SDK used to interact with Runpod's Serverless environment.
* `torch`: PyTorch library, necessary for running deep learning models and ensuring they utilize the GPU.
* `diffusers`: Provides methods to work with diffusion models like Stable Diffusion.
* `BytesIO` and `base64`: Used to handle image data conversions.
@@ -46,7 +48,7 @@ This assertion checks whether a compatible NVIDIA GPU is available for PyTorch t
## Load the Stable Diffusion Model
-We'll load the Stable Diffusion model in a separate function. This ensures that the model is only loaded once when the worker process starts, which is more efficient.
+We'll load the Stable Diffusion model in a separate function. This ensures that the model is only loaded once when the <WorkerTooltip /> process starts, which is more efficient.
```python stable_diffusion.py
def load_model():
@@ -78,7 +80,7 @@ Explanation:
* `BytesIO`: Creates an in-memory binary stream to which the image is saved.
* `base64.b64encode`: Encodes the binary data to a base64 format, which is then decoded to a UTF-8 string.
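+
+On the client side, the base64 string can be turned back into an image file. As a minimal sketch, assuming `image_base64` holds the string returned by the handler:
+
+```python
+import base64
+
+# Decode the base64 payload and write it out as a PNG file.
+with open("generated_image.png", "wb") as f:
+    f.write(base64.b64decode(image_base64))
+```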
-## Define the Handler Function
+## Define the <HandlerFunctionTooltip />
The handler function will be responsible for managing image generation requests. It includes loading the model (if not already loaded), validating inputs, generating images, and converting them to base64 strings.
@@ -118,15 +120,15 @@ Key steps in the function:
* Uses the `model` to generate an image.
* Converts the image to base64 and prepares the response.
-## Start the Serverless Worker
+## Start the Serverless <WorkerTooltip />
-Now, we'll start the serverless worker using the Runpod SDK.
+Now, we'll start the Serverless worker using the Runpod SDK.
```python stable_diffusion.py
runpod.serverless.start({"handler": stable_diffusion_handler})
```
-This command starts the serverless worker and specifies the `stable_diffusion_handler` function to handle incoming requests.
+This command starts the Serverless worker and specifies the `stable_diffusion_handler` function to handle incoming requests.
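+
+As with the other examples, you can try the handler locally with a test payload. The `prompt` key below assumes the handler reads `job["input"]["prompt"]`, and a CUDA-capable GPU is required because of the assertion earlier in the script:
+
+```sh
+python stable_diffusion.py --test_input '{"input": {"prompt": "a watercolor painting of a lighthouse at dawn"}}'
+```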
## Complete Code
@@ -205,7 +207,7 @@ Note: Local testing may not work optimally without a suitable GPU. If issues ari
## Important Notes:
1. This example requires significant computational resources, particularly GPU memory. Ensure your Runpod configuration has sufficient GPU capabilities.
-2. The model is loaded only once when the worker starts, optimizing performance.
+2. The model is loaded only once when the <WorkerTooltip /> starts, optimizing performance.
3. We've used Stable Diffusion v1.5; you can replace it with other versions or models as required.
4. The handler includes error handling for missing input and exceptions during processing.
5. Ensure necessary dependencies (like `torch`, `diffusers`) are included in your environment or requirements file when deploying.
@@ -213,4 +215,4 @@ Note: Local testing may not work optimally without a suitable GPU. If issues ari
### Conclusion
-In this tutorial, you learned how to use the Runpod serverless platform with Stable Diffusion to create a text-to-image generation system. This project showcases the potential for deploying resource-intensive AI models in a serverless architecture using the Runpod Python SDK. You now have the skills to create and deploy sophisticated AI applications on Runpod. What will you create next?
+In this tutorial, you learned how to use the Runpod Serverless platform with Stable Diffusion to create a text-to-image generation system. This project showcases the potential for deploying resource-intensive AI models in a Serverless architecture using the Runpod Python SDK. You now have the skills to create and deploy sophisticated AI applications on Runpod. What will you create next?
diff --git a/tutorials/sdks/python/get-started/hello-world.mdx b/tutorials/sdks/python/get-started/hello-world.mdx
index 50b8f950..823c6ec4 100644
--- a/tutorials/sdks/python/get-started/hello-world.mdx
+++ b/tutorials/sdks/python/get-started/hello-world.mdx
@@ -2,9 +2,11 @@
title: "Hello World with Runpod"
---
-Let's dive into creating your first Runpod Serverless application. We're going to build a "Hello, World!" program that greets users with a custom message. Don't worry about sending requests just yet - we'll cover that in the next tutorial, [running locally](/tutorials/sdks/python/get-started/running-locally).
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip } from "/snippets/tooltips.jsx";
-This exercise will introduce you to the key parts of a Runpod application, giving you a solid foundation in serverless functions. By the end, you'll have your very own Runpod serverless function up and running locally.
+Let's dive into creating your first Runpod <ServerlessTooltip /> application. We're going to build a "Hello, World!" program that greets users with a custom message. Don't worry about sending requests just yet - we'll cover that in the next tutorial, [running locally](/tutorials/sdks/python/get-started/running-locally).
+
+This exercise will introduce you to the key parts of a Runpod application, giving you a solid foundation in Serverless functions. By the end, you'll have your very own Runpod Serverless function up and running locally.
### Creating Your First Serverless Function
@@ -35,18 +37,18 @@ Inside the handler, we grab the input data from the job. We're expecting a 'name
Then we create and return our greeting message, using the name we got from the input.
-Finally, we call `runpod.serverless.start()`, telling it to use our `handler` function. This kicks off the serverless worker and gets it ready to handle incoming jobs.
+Finally, we call `runpod.serverless.start()`, telling it to use our `handler` function. This kicks off the Serverless <WorkerTooltip /> and gets it ready to handle incoming jobs.
-And there you have it! You've just created your first Runpod serverless function. It takes in a request with a name and returns a personalized greeting.
+And there you have it! You've just created your first Runpod Serverless function. It takes in a request with a name and returns a personalized greeting.
### Key Takeaways
-* Runpod functions are built around a handler that processes incoming jobs.
+* Runpod functions are built around a <HandlerFunctionTooltip /> that processes incoming jobs.
* You can easily access input data from the job parameter.
-* The `runpod.serverless.start()` function gets your serverless worker up and running.
+* The `runpod.serverless.start()` function gets your Serverless <WorkerTooltip /> up and running.
## Next steps
-You've now got a basic `Hello, World!` Runpod serverless function up and running. You've learned how to handle input and output in a serverless environment and how to start your application.
+You've now got a basic `Hello, World!` Runpod Serverless function up and running. You've learned how to handle input and output in a Serverless environment and how to start your application.
-These are the building blocks for creating more complex serverless applications with Runpod. As you get more comfortable with these concepts, you'll be able to create even more powerful and flexible serverless functions.
+These are the building blocks for creating more complex Serverless applications with Runpod. As you get more comfortable with these concepts, you'll be able to create even more powerful and flexible Serverless functions.
diff --git a/tutorials/sdks/python/get-started/introduction.mdx b/tutorials/sdks/python/get-started/introduction.mdx
index b367fda9..0d73689e 100644
--- a/tutorials/sdks/python/get-started/introduction.mdx
+++ b/tutorials/sdks/python/get-started/introduction.mdx
@@ -3,7 +3,9 @@ title: "Introduction to the Runpod Python SDK"
sidebarTitle: "Introduction"
---
-Welcome to the world of Serverless AI development with the [Runpod Python SDK](https://github.com/runpod/runpod-python).
+import { ServerlessTooltip, WorkerTooltip, HandlerFunctionTooltip, EndpointTooltip } from "/snippets/tooltips.jsx";
+
+Welcome to the world of <ServerlessTooltip /> AI development with the [Runpod Python SDK](https://github.com/runpod/runpod-python).
The Runpod Python SDK helps you develop Serverless AI applications so that you can build and deploy scalable AI solutions efficiently.
@@ -21,13 +23,13 @@ To follow along with this guide, you should have:
The [Runpod Python SDK](https://github.com/runpod/runpod-python) is a toolkit designed to facilitate the creation and deployment of Serverless applications on the Runpod platform.
-It is optimized for AI and machine learning workloads, simplifying the development of scalable, cloud-based AI applications. The SDK allows you to define handler functions, conduct local testing, and utilize GPU support.
+It is optimized for AI and machine learning workloads, simplifying the development of scalable, cloud-based AI applications. The SDK allows you to define <HandlerFunctionTooltip />s, conduct local testing, and utilize GPU support.
Acting as a bridge between your Python code and Runpod's cloud infrastructure, the SDK enables you to execute complex AI tasks without managing underlying hardware.
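+
+To give you a feel for the shape of a Runpod application before you build one properly, here is a minimal sketch of a handler (the same pattern the Hello World tutorial walks through step by step):
+
+```python
+import runpod
+
+def handler(job):
+    # job["input"] contains the JSON payload sent with the request.
+    name = job["input"].get("name", "World")
+    return {"greeting": f"Hello, {name}!"}
+
+# Register the handler and start the Serverless worker.
+runpod.serverless.start({"handler": handler})
+```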
To start using the Runpod Python SDK, see the [prerequisites](/tutorials/sdks/python/get-started/prerequisites) section, or if you're already set up, proceed to the [Hello World](/tutorials/sdks/python/get-started/hello-world) tutorial, where we will guide you through creating, deploying, and running your first Serverless AI application.
-You can also see a library of complete Runpod samples in the [Worker library](https://github.com/runpod-workers) on GitHub. These samples are complete Python libraries for common use cases.
+You can also see a library of complete Runpod samples in the [<WorkerTooltip /> library](https://github.com/runpod-workers) on GitHub. These samples are complete Python libraries for common use cases.
## Learn more
diff --git a/tutorials/sdks/python/get-started/prerequisites.mdx b/tutorials/sdks/python/get-started/prerequisites.mdx
index 3fa9b7ff..15f91c9e 100644
--- a/tutorials/sdks/python/get-started/prerequisites.mdx
+++ b/tutorials/sdks/python/get-started/prerequisites.mdx
@@ -2,11 +2,13 @@
title: "Prerequisites"
---
-Setting up a proper development environment is fundamental to effectively building serverless AI applications using Runpod. This guide will take you through each necessary step to prepare your system for Runpod development, ensuring you have the correct tools and configurations.
+import { ServerlessTooltip } from "/snippets/tooltips.jsx";
+
+Setting up a proper development environment is fundamental to effectively building <ServerlessTooltip /> AI applications using Runpod. This guide will take you through each necessary step to prepare your system for Runpod development, ensuring you have the correct tools and configurations.
In this guide, you will learn how to install the Runpod library.
-When you're finished, you'll have a fully prepared environment to begin developing your serverless AI applications with Runpod.
+When you're finished, you'll have a fully prepared environment to begin developing your Serverless AI applications with Runpod.
## Prerequisites
@@ -71,7 +73,7 @@ You have now set up and activated a virtual environment for your project. The ne
## Install the Runpod Library
-With the virtual environment activated, you need to install the Runpod Python SDK. This library provides the tools necessary to develop serverless applications on the Runpod platform.
+With the virtual environment activated, you need to install the Runpod Python SDK. This library provides the tools necessary to develop Serverless applications on the Runpod platform.
To install the Runpod library, execute:
@@ -99,6 +101,6 @@ For example:
You have now successfully set up your development environment. Your system is equipped with Python, a virtual environment, and the Runpod library.
-You will use the Runpod Python library for writing your serverless application.
+You will use the Runpod Python library for writing your Serverless application.
Next, we'll proceed with creating a [Hello World application with Runpod](/tutorials/sdks/python/get-started/hello-world).
diff --git a/tutorials/sdks/python/get-started/running-locally.mdx b/tutorials/sdks/python/get-started/running-locally.mdx
index eee05617..3a81e682 100644
--- a/tutorials/sdks/python/get-started/running-locally.mdx
+++ b/tutorials/sdks/python/get-started/running-locally.mdx
@@ -3,21 +3,23 @@ title: "Running code locally"
sidebarTitle: "Running locally"
---
-Before deploying your serverless functions to the cloud, it's crucial to test them locally. In the previous lesson, [Hello World with Runpod](/tutorials/sdks/python/get-started/hello-world), you created a Python file called `hello_world.py`.
+import { ServerlessTooltip, WorkerTooltip } from "/snippets/tooltips.jsx";
-In this guide, you'll learn how to run your Runpod serverless applications on your local machine using the Runpod Python SDK.
+Before deploying your <ServerlessTooltip /> functions to the cloud, it's crucial to test them locally. In the previous lesson, [Hello World with Runpod](/tutorials/sdks/python/get-started/hello-world), you created a Python file called `hello_world.py`.
+
+In this guide, you'll learn how to run your Runpod Serverless applications on your local machine using the Runpod Python SDK.
## Understanding Runpod's Local Testing Environment
When you run your code locally using the Runpod Python SDK, here's what happens behind the scenes:
-* FastAPI Server: The SDK spins up a FastAPI server on your local machine. This server simulates the Runpod serverless environment.
+* FastAPI Server: The SDK spins up a FastAPI server on your local machine. This server simulates the Runpod Serverless environment.
* Request Handling: The FastAPI server receives and processes requests just like the cloud version would, allowing you to test your function's input handling and output generation.
-* Environment Simulation: The local setup mimics key aspects of the Runpod serverless environment, helping ensure your code will behave similarly when deployed.
+* Environment Simulation: The local setup mimics key aspects of the Runpod Serverless environment, helping ensure your code will behave similarly when deployed.
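+
+For example, assuming the default configuration, you can start the local test server directly from your handler script and the SDK will print the address it is listening on:
+
+```sh
+python hello_world.py --rp_serve_api
+```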
## Running Your Code Locally
-Let's walk through how to run your serverless functions locally using the Runpod Python SDK.
+Let's walk through how to run your Serverless functions locally using the Runpod Python SDK.
**Options for Passing Information to Your API**
@@ -42,7 +44,7 @@ Both methods allow you to simulate how your function would receive data in the a
}
```
-2. Run the serverless function:
+2. Run the Serverless function:
Execute your `hello_world.py` script with the `--rp_serve_api` flag:
@@ -80,7 +82,7 @@ INFO | Local testing complete, exiting.
This output provides valuable information:
-* Confirmation that the Serverless Worker started successfully
+* Confirmation that the Serverless <WorkerTooltip /> started successfully
* Details about the input data being used
* Step-by-step execution of your function
* The final output and job status
@@ -90,8 +92,8 @@ By analyzing this output, you can verify that your function is behaving as expec
### Key Takeaways
* Local testing with the Runpod Python SDK allows you to simulate the cloud environment on your machine.
-* The SDK creates a FastAPI server to mock the serverless function execution.
+* The SDK creates a FastAPI server to mock the Serverless function execution.
* You can provide input data via a JSON file or inline JSON in the command line.
* Local testing accelerates development, reduces costs, and helps catch issues early.
-Next, we'll explore the structure of Runpod handlers in more depth, enabling you to create more sophisticated serverless functions.
+Next, we'll explore the structure of Runpod handlers in more depth, enabling you to create more sophisticated Serverless functions.
diff --git a/tutorials/serverless/comfyui.mdx b/tutorials/serverless/comfyui.mdx
index d84ef6bd..7caff0ce 100644
--- a/tutorials/serverless/comfyui.mdx
+++ b/tutorials/serverless/comfyui.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Deploy ComfyUI on Serverless"
description: "Learn how to deploy a Serverless endpoint running ComfyUI from the Runpod Hub and use it to generate images with FLUX Dev."
---
-In this tutorial, you will learn how to deploy a Serverless endpoint running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) on Runpod, submit image generation jobs using workflow JSON, monitor their progress, and decode the resulting images.
+import { ServerlessTooltip, EndpointTooltip, RunpodHubTooltip, WorkerTooltip, PodTooltip } from "/snippets/tooltips.jsx";
+
+In this tutorial, you will learn how to deploy a <ServerlessTooltip /> <EndpointTooltip /> running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) on Runpod, submit image generation jobs using workflow JSON, monitor their progress, and decode the resulting images.
[Runpod's Serverless platform](/serverless/overview) allows you to run AI/ML models in the cloud without managing infrastructure, automatically scaling resources as needed. ComfyUI is a powerful node-based interface for Stable Diffusion that provides fine-grained control over the image generation process through customizable workflows.
@@ -12,7 +14,7 @@ In this tutorial, you will learn how to deploy a Serverless endpoint running [Co
In this tutorial you'll learn:
-- How to deploy a ComfyUI Serverless endpoint using the [Runpod Hub](/hub/overview).
+- How to deploy a ComfyUI Serverless endpoint using the <RunpodHubTooltip />.
- How to structure ComfyUI workflow JSON for API requests.
- How to submit jobs, monitor their progress, and retrieve results.
- How to generate images using the FLUX.1-dev-fp8 model.
@@ -45,7 +47,7 @@ If you want to use a different model, you can also [deploy the endpoint](https:/
Replace `<version>` with the latest release version from GitHub Releases.
-If you need a model that's not listed here, or have your own LoRA, or need custom nodes, you can use this [customization guide](https://github.com/runpod-workers/worker-comfyui/blob/main/docs/customization.md) to create your own custom worker.
+If you need a model that's not listed here, or have your own LoRA, or need custom nodes, you can use this [customization guide](https://github.com/runpod-workers/worker-comfyui/blob/main/docs/customization.md) to create your own custom <WorkerTooltip />.
1. Navigate to the [ComfyUI Hub listing](https://console.runpod.io/hub/runpod-workers/worker-comfyui) in the Runpod web interface.
@@ -351,7 +353,9 @@ ComfyUI image successfully saved as 'comfyui_generated_image.png'
Image path: /Users/path/to/your/project/comfyui_generated_image.png
```
+
Congratulations! You've successfully used Runpod's Serverless platform to generate an AI image using ComfyUI with the FLUX.1-dev-fp8 model. You now understand the complete workflow for submitting ComfyUI jobs, monitoring their progress, and retrieving results.
+
## Understanding ComfyUI workflows
@@ -361,7 +365,7 @@ ComfyUI workflows are JSON structures that define the image generation pipeline
- **Class type**: The operation this node performs.
- **Meta information**: Human-readable titles and descriptions.
-You can create custom workflows by modifying node parameters or [opening the ComfyUI interface in a Pod](/tutorials/pods/comfyui) and exporting the workflow to JSON.
+You can create custom workflows by modifying node parameters or [opening the ComfyUI interface in a <PodTooltip />](/tutorials/pods/comfyui) and exporting the workflow to JSON.
To learn more about creating your own ComfyUI workflows, see the [ComfyUI documentation](https://docs.comfy.org/development/core-concepts/workflow).
diff --git a/tutorials/serverless/generate-sdxl-turbo.mdx b/tutorials/serverless/generate-sdxl-turbo.mdx
index df1f2349..fdfef95b 100644
--- a/tutorials/serverless/generate-sdxl-turbo.mdx
+++ b/tutorials/serverless/generate-sdxl-turbo.mdx
@@ -5,9 +5,11 @@ description: "Deploy an image generation endpoint from the Hub and integrate it
tag: "NEW"
---
-In this tutorial, you'll deploy a pre-built SDXL Turbo worker from the Runpod Hub and integrate it into a web application. You'll build a simple frontend that sends prompts to your endpoint and displays the generated images.
+import { ServerlessTooltip, EndpointTooltip, RunpodHubTooltip, WorkerTooltip, ColdStartTooltip } from "/snippets/tooltips.jsx";
-By the end, you'll know how to deploy Serverless endpoints from the Hub and integrate them into your applications using standard HTTP requests.
+In this tutorial, you'll deploy a pre-built SDXL Turbo <WorkerTooltip /> from the <RunpodHubTooltip /> and integrate it into a web application. You'll build a simple frontend that sends prompts to your endpoint and displays the generated images.
+
+By the end, you'll know how to deploy <ServerlessTooltip /> endpoints from the Hub and integrate them into your applications using standard HTTP requests.
## What you'll learn
@@ -305,7 +307,7 @@ Then open `http://localhost:8000` in your browser.
Enter a prompt and click **Generate Image** to see your AI-generated image.
-The first request may take longer (30-60 seconds) due to cold start as the endpoint loads the model into GPU memory. Subsequent requests complete in just a few seconds.
+The first request may take longer (30-60 seconds) due to <ColdStartTooltip /> as the endpoint loads the model into GPU memory. Subsequent requests complete in just a few seconds.
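+
+If you want to call the endpoint directly instead of going through the web page, the same request can be made from Python. The endpoint ID and API key below are placeholders, and the exact shape of the worker's output may differ, so inspect the response before parsing it:
+
+```python
+import requests
+
+ENDPOINT_ID = "<your-endpoint-id>"
+API_KEY = "<your-runpod-api-key>"
+
+# Submit a synchronous job to the SDXL Turbo endpoint.
+response = requests.post(
+    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
+    headers={"Authorization": f"Bearer {API_KEY}"},
+    json={"input": {"prompt": "a cozy cabin in a snowy forest"}},
+)
+
+# Inspect the response structure before extracting the image data.
+print(response.json().keys())
+```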
## Next steps
diff --git a/tutorials/serverless/model-caching-text.mdx b/tutorials/serverless/model-caching-text.mdx
index 33f3fc8c..62364cfb 100644
--- a/tutorials/serverless/model-caching-text.mdx
+++ b/tutorials/serverless/model-caching-text.mdx
@@ -5,17 +5,19 @@ description: "Learn how to create a custom Serverless endpoint that uses model c
tag: "NEW"
---
+import { ServerlessTooltip, EndpointTooltip, WorkerTooltip, ColdStartTooltip, HandlerFunctionTooltip, CachedModelsTooltip } from "/snippets/tooltips.jsx";
+
You can download the finished code for this tutorial [on GitHub](https://github.com/runpod-workers/model-store-cache-example).
-This tutorial demonstrates how to build a custom Serverless worker that leverages Runpod's [cached model](/serverless/endpoints/model-caching) feature to serve the Phi-3 language model. You'll learn how to create a handler function that locates and loads cached models in offline mode, which can significantly reduce costs and cold start times.
+This tutorial demonstrates how to build a custom <ServerlessTooltip /> <WorkerTooltip /> that leverages Runpod's <CachedModelsTooltip /> feature to serve the Phi-3 language model. You'll learn how to create a handler function that locates and loads cached models in offline mode, which can significantly reduce costs and cold start times.
## What you'll learn
-- How to configure a Serverless endpoint with a cached model.
+- How to configure a Serverless <EndpointTooltip /> with a cached model.
- How to programmatically locate a cached model in your handler function.
-- How to create a custom handler function for text generation.
+- How to create a custom <HandlerFunctionTooltip /> for text generation.
- How to integrate the Phi-3 model with the Hugging Face Transformers library.
## Requirements
@@ -196,15 +198,24 @@ def resolve_snapshot_path(model_id: str) -> str:
snapshots_dir = os.path.join(model_root, "snapshots")
```
-Cached models use a specific directory structure. A model like `microsoft/Phi-3-mini-4k-instruct` gets stored at:
-
-```
-/runpod-volume/huggingface-cache/hub/models--microsoft--Phi-3-mini-4k-instruct/
-├── refs/
-│ └── main # Contains the commit hash of the "main" branch
-└── snapshots/
- └── abc123def.../ # Actual model files, named by commit hash
-```
+Cached models use a specific directory structure. A model like `microsoft/Phi-3-mini-4k-instruct` gets stored at `/runpod-volume/huggingface-cache/hub/`. For example:
+
+- `models--microsoft--Phi-3-mini-4k-instruct/refs/main` contains the commit hash that the "main" branch points to.
+- `models--microsoft--Phi-3-mini-4k-instruct/snapshots/<commit-hash>/` contains the actual model files, named by commit hash.
+
The `resolve_snapshot_path()` function navigates this structure to find the actual model files. It first tries to read the `refs/main` file, which contains the commit hash that the "main" branch points to. This is the most reliable method because it matches exactly what Hugging Face would load if you called `from_pretrained()` with network access.
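+
+Conceptually, the lookup boils down to two steps. The sketch below uses the Phi-3 paths from the example above and is an illustration only, not a replacement for `resolve_snapshot_path()`:
+
+```python
+import os
+
+model_root = "/runpod-volume/huggingface-cache/hub/models--microsoft--Phi-3-mini-4k-instruct"
+
+# 1. refs/main stores the commit hash that the "main" branch points to.
+with open(os.path.join(model_root, "refs", "main")) as f:
+    commit_hash = f.read().strip()
+
+# 2. The snapshot directory named after that hash holds the actual model files.
+snapshot_path = os.path.join(model_root, "snapshots", commit_hash)
+```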
@@ -426,7 +437,9 @@ Expected response:
}
```
+
Congratulations! You've successfully deployed a Serverless endpoint that uses model caching to serve Phi-3.
+
## Benefits of using cached models
diff --git a/tutorials/serverless/run-gemma-7b.mdx b/tutorials/serverless/run-gemma-7b.mdx
index dd775caa..aa7af9f7 100644
--- a/tutorials/serverless/run-gemma-7b.mdx
+++ b/tutorials/serverless/run-gemma-7b.mdx
@@ -5,14 +5,16 @@ description: "Deploy a Serverless endpoint with Google's Gemma 3 model using vLL
tag: "NEW"
---
-This tutorial walks you through deploying a Serverless endpoint with Google's Gemma 3 model using the vLLM worker. You'll deploy the `gemma-3-1b-it` instruction-tuned variant, a lightweight model that runs efficiently on a variety of GPUs.
+import { ServerlessTooltip, EndpointTooltip, VLLMTooltip, WorkerTooltip, ColdStartTooltip } from "/snippets/tooltips.jsx";
+
+This tutorial walks you through deploying a <ServerlessTooltip /> <EndpointTooltip /> with Google's Gemma 3 model using the vLLM worker. You'll deploy the `gemma-3-1b-it` instruction-tuned variant, a lightweight model that runs efficiently on a variety of GPUs.
By the end, you'll have a fully functional Serverless endpoint that can respond to chat-style prompts through the [OpenAI-compatible API](/serverless/vllm/openai-compatibility).
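+
+As a preview of what that looks like from the client side, a chat request can be sent with the standard OpenAI Python client. The endpoint ID is a placeholder, and the base URL shape and model name follow the OpenAI compatibility guide and the Hugging Face model ID:
+
+```python
+from openai import OpenAI
+
+# Point the OpenAI client at your Runpod endpoint's OpenAI-compatible base URL.
+client = OpenAI(
+    api_key="<your-runpod-api-key>",
+    base_url="https://api.runpod.ai/v2/<endpoint-id>/openai/v1",
+)
+
+response = client.chat.completions.create(
+    model="google/gemma-3-1b-it",
+    messages=[{"role": "user", "content": "Say hello in one sentence."}],
+)
+print(response.choices[0].message.content)
+```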
## What you'll learn
- How to accept Google's terms for gated models on Hugging Face.
-- How to deploy a vLLM worker from the Runpod Hub.
+- How to deploy a <VLLMTooltip /> worker from the Runpod Hub.
- How to interact with your endpoint using the OpenAI-compatible API.
- How to build a simple command-line chatbot.
@@ -193,7 +195,7 @@ python gemma_chat.py
```
-The first request may take longer (30-60 seconds) due to cold start as the endpoint loads the model into GPU memory. Subsequent requests complete in just a few seconds.
+The first request may take longer (30-60 seconds) due to <ColdStartTooltip /> as the endpoint loads the model into GPU memory. Subsequent requests complete in just a few seconds.
You can now have a conversation with Gemma 3:
@@ -222,4 +224,4 @@ You've successfully deployed Gemma 3 on Runpod Serverless and built a chatbot to
- [Configure your endpoint](/serverless/endpoints/endpoint-configurations) to optimize performance and cost.
- [Learn about vLLM environment variables](/serverless/vllm/environment-variables) to customize model behavior.
- [Explore OpenAI compatibility](/serverless/vllm/openai-compatibility) for features like streaming and function calling.
-- [Build a custom worker](/serverless/workers/custom-worker) for more specialized use cases.
+- Build a custom <WorkerTooltip /> for more specialized use cases.
diff --git a/tutorials/serverless/run-ollama-inference.mdx b/tutorials/serverless/run-ollama-inference.mdx
index 97c01376..f580cfa5 100644
--- a/tutorials/serverless/run-ollama-inference.mdx
+++ b/tutorials/serverless/run-ollama-inference.mdx
@@ -5,12 +5,14 @@ description: "Learn how to run an Ollama server on Serverless CPU workers."
tag: "NEW"
---
-Run an Ollama server on Serverless CPU workers for LLM inference. This tutorial focuses on CPU compute, but you can also select a GPU for faster performance.
+import { ServerlessTooltip, EndpointTooltip, NetworkVolumeTooltip, WorkersTooltip, ColdStartTooltip } from "/snippets/tooltips.jsx";
+
+Run an Ollama server on <ServerlessTooltip /> CPU <WorkersTooltip /> for LLM inference. This tutorial focuses on CPU compute, but you can also select a GPU for faster performance.
## What you'll learn
-- Deploy an Ollama container as a Serverless endpoint.
-- Configure a network volume to cache models and reduce cold start times.
+- Deploy an Ollama container as a Serverless <EndpointTooltip />.
+- Configure a <NetworkVolumeTooltip /> to cache models and reduce <ColdStartTooltip /> times.
- Send inference requests to your Ollama endpoint.
## Requirements
diff --git a/tutorials/serverless/run-your-first.mdx b/tutorials/serverless/run-your-first.mdx
index 2afb30d4..134fa285 100644
--- a/tutorials/serverless/run-your-first.mdx
+++ b/tutorials/serverless/run-your-first.mdx
@@ -4,7 +4,9 @@ sidebarTitle: "Generate images with SDXL"
description: "Learn how to deploy a Serverless endpoint running SDXL from the Runpod Hub and use it to generate images."
---
-In this tutorial, you will learn how to deploy a Serverless endpoint running [Stable Diffusion XL](https://stablediffusionxl.com/) (SDXL) on Runpod, submit image generation jobs, monitor their progress, and decode the resulting images.
+import { ServerlessTooltip, EndpointTooltip, RunpodHubTooltip, WorkerTooltip } from "/snippets/tooltips.jsx";
+
+In this tutorial, you will learn how to deploy a <ServerlessTooltip /> <EndpointTooltip /> running [Stable Diffusion XL](https://stablediffusionxl.com/) (SDXL) on Runpod, submit image generation jobs, monitor their progress, and decode the resulting images.
[Runpod's Serverless platform](/serverless/overview) allows you to run AI/ML models in the cloud without managing infrastructure, automatically scaling resources as needed. SDXL is a powerful AI model that generates high-quality images from text prompts.
@@ -12,7 +14,7 @@ In this tutorial, you will learn how to deploy a Serverless endpoint running [St
In this tutorial you'll learn:
-- How to deploy a Serverless endpoint using the [Runpod Hub](/hub/overview).
+- How to deploy a Serverless endpoint using the <RunpodHubTooltip />.
- How to submit jobs, monitor their progress, and retrieve results.
- How to generate an image using SDXL.
- How to decode the base64 output to retrieve the image.
@@ -203,7 +205,9 @@ Image successfully saved as 'generated_image.png'
Image path: /Users/path/to/your/project/generated_image.png
```
+
Congratulations! You've successfully used Runpod's Serverless platform to generate an AI image using SDXL. You now understand the complete workflow of submitting asynchronous jobs, monitoring their progress, and retrieving results.
+
## Next steps
@@ -212,4 +216,4 @@ Now that you've learned how to generate images with Serverless, consider explori
- Learn how to create [synchronous requests](/serverless/endpoints/operations) using the `/runsync` endpoint for faster responses.
- Explore [endpoint configurations](/serverless/endpoints/endpoint-configurations) to optimize performance and cost.
- Discover how to [send requests](/serverless/endpoints/send-requests) with advanced parameters and webhook notifications.
-- Try deploying your own [custom worker](/serverless/workers/custom-worker) for specialized AI models.
+- Try deploying your own [custom worker](/serverless/quickstart) for specialized AI models.