diff --git a/serverless/endpoints/endpoint-configurations.mdx b/serverless/endpoints/endpoint-configurations.mdx
index acd0517b..96b3b38e 100644
--- a/serverless/endpoints/endpoint-configurations.mdx
+++ b/serverless/endpoints/endpoint-configurations.mdx
@@ -62,11 +62,32 @@ The idle timeout determines how long a worker remains active after completing a

 ### Execution timeout

-The execution timeout acts as a failsafe to prevent runaway jobs from consuming infinite resources. It specifies the maximum duration a single job is allowed to run before being forcibly terminated. We strongly recommend keeping this enabled. The default is 600 seconds (10 minutes), and it can be extended up to 7 days.
+The execution timeout specifies the maximum duration a single job is allowed to run while actively being processed by a worker. When exceeded, the job is marked as failed and the worker is stopped. We strongly recommend keeping this enabled to prevent runaway jobs from consuming infinite resources. The default is 600 seconds (10 minutes). The minimum is 5 seconds and the maximum is 7 days.
+
+You can configure the execution timeout in the **Advanced** section of your endpoint settings. You can also override this setting on a per-request basis using the `executionTimeout` field in the [job policy](/serverless/endpoints/send-requests#execution-policies).

 ### Job TTL (time-to-live)

-This setting defines how long a job request remains valid in the queue before expiring. If a worker does not pick up the job within this window, the system discards it. The default is 24 hours, and it can be extended up to 7 days.
+The TTL defines the total lifespan of a job in the system. Once the TTL expires, the job's data is deleted from the system regardless of its current state, whether it is queued, actively running, or completed. The default is 24 hours. The minimum is 10 seconds and the maximum is 7 days.
+
+The TTL timer starts when the job is submitted, not when execution begins.
This means if a job sits in the queue waiting for an available worker, that time counts against the TTL. For example, if you set a TTL of 1 hour and the job waits in the queue for 45 minutes, only 15 minutes remain for actual execution.
+
+
+TTL is a hard limit on the job's existence. If the TTL expires while a job is actively running on a worker, the job is immediately removed from the system and subsequent status checks return a 404. This applies even if the job would have completed successfully given more time. Always set TTL to comfortably cover both expected queue time and execution time.
+
+
+You can override this on a per-request basis using the `ttl` field in the [job policy](/serverless/endpoints/send-requests#execution-policies).
+
+### Result retention
+
+After a job completes, the system retains the results for a limited time. This retention period is separate from the Job TTL and cannot be extended:
+
+| Request type | Result retention | Notes |
+|--------------|------------------|-------|
+| Asynchronous (`/run`) | 30 minutes | Retrieve results via `/status/{job_id}` |
+| Synchronous (`/runsync`) | 1 minute | Results returned in the response; also available via `/status/{job_id}` |
+
+Once the retention period expires, the job data is permanently deleted.

 ## Performance features

diff --git a/serverless/endpoints/send-requests.mdx b/serverless/endpoints/send-requests.mdx
index 105e29b7..8c5f3a3e 100644
--- a/serverless/endpoints/send-requests.mdx
+++ b/serverless/endpoints/send-requests.mdx
@@ -148,16 +148,16 @@ Synchronous jobs wait for completion and return the complete result in a single

 `/runsync` requests have a maximum payload size of 20 MB.

-Results are available for 1 minute by default, but you can append `?wait=x` to the request URL to extend this up to 5 minutes, where `x` is the number of milliseconds to store the results, from 1000 (1 second) to 300000 (5 minutes).
+Results are retained for 1 minute after completion.
-For example, `?wait=120000` will keep your results available for 2 minutes:
+By default, the request waits up to 90 seconds for the job to complete. You can adjust this by appending `?wait=x` to the request URL, where `x` is the number of milliseconds to wait (between 1000 and 300000). For example, `?wait=120000` waits up to 2 minutes for completion:

 ```sh
 https://api.runpod.ai/v2/$ENDPOINT_ID/runsync?wait=120000
 ```

-`?wait` is only available for `cURL` and standard HTTP request libraries.
+The `?wait` parameter controls how long the request waits for job completion, not how long results are retained. Result retention is fixed at 1 minute for sync requests.

@@ -1053,14 +1053,63 @@ Policy options:

 | Option | Description | Default | Constraints |
 | ------------------ | ------------------------------------------- | ------------------- | ------------------------------ |
-| `executionTimeout` | Maximum job runtime in milliseconds | 600000 (10 minutes) | Must be > 5000 ms, max 1 week |
+| `executionTimeout` | Maximum time, in milliseconds, a job can run while being processed by a worker | 600000 (10 minutes) | Min 5 seconds, max 7 days |
 | `lowPriority` | When true, job won't trigger worker scaling | false | - |
-| `ttl` | Maximum job lifetime in milliseconds | 86400000 (24 hours) | Must be ≥ 10000 ms, max 1 week |
+| `ttl` | Total lifespan of the job in milliseconds; once expired, the job is deleted regardless of state | 86400000 (24 hours) | Min 10 seconds, max 7 days |

 Setting `executionTimeout` in a request overrides the default endpoint setting for that specific job only.

+#### Understanding TTL vs execution timeout
+
+The `ttl` and `executionTimeout` settings serve different purposes:
+
+- **`ttl`**: Total lifespan of the job in the system. The timer starts when the job is submitted and covers queue time, execution time, and everything in between. When TTL expires, the job is deleted regardless of its current state.
+- **`executionTimeout`**: Maximum time the job can actively run once a worker picks it up. Only enforced during execution.
+
+
+TTL is a hard limit on the job's existence. If TTL expires while a job is actively running on a worker, the job is immediately removed and subsequent status checks return a 404, even if the job would have completed successfully. The `executionTimeout` does not extend or override the TTL.
+
+
+**Example 1 (queue expiry)**: You set `executionTimeout` to 2 hours and `ttl` to 1 hour. If the job waits in the queue for 1 hour, it expires before a worker ever picks it up. The execution timeout never comes into play.
+
+**Example 2 (mid-execution expiry)**: You set `executionTimeout` to 7 days and `ttl` to 7 days. If the job waits in the queue for 1 day, it only has 6 days of TTL remaining for execution. If the job needs the full 7 days to run, it will be deleted on day 7 while still in progress.
+
+#### Long-running jobs
+
+For jobs that need to run longer than the default TTL (24 hours):
+
+1. Set `executionTimeout` to your desired maximum runtime.
+2. Set `ttl` to cover **both expected queue time and execution time**. Since TTL is a hard limit on the job's total lifespan, it must be long enough for the job to finish before being deleted.
+
+```json
+{
+  "input": { "prompt": "Long running task" },
+  "policy": {
+    "executionTimeout": 172800000,
+    "ttl": 259200000
+  }
+}
+```
+
+In this example, the execution timeout allows up to 48 hours of active runtime, while the TTL gives the job 72 hours of total lifespan. The extra 24 hours of TTL headroom accounts for potential queue wait time.
+
+
+Both `ttl` and `executionTimeout` have a maximum of 7 days. If your job may queue for an extended period, the effective execution window is reduced: a job with a 7-day TTL that queues for 2 days only has 5 days of TTL remaining for execution, even if `executionTimeout` is also set to 7 days.
+
+
+#### Result retention after completion
+
+After a job completes, results are retained for a fixed period that is separate from the `ttl` setting:
+
+| Request type | Retention period |
+|--------------|------------------|
+| `/run` (async) | 30 minutes |
+| `/runsync` (sync) | 1 minute |
+
+These retention periods are fixed and cannot be extended. Once the retention period expires, the job data is permanently deleted.
+
 ### S3-compatible storage integration

 Configure S3-compatible storage for endpoints working with large files. This configuration is passed directly to your worker but not included in responses.
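
The TTL arithmetic that the added `send-requests.mdx` text describes (TTL must cover queue time plus execution time, and queue time reduces the execution window) can be sketched in Python. This is a minimal illustration and not part of the patch: the helper names `build_policy` and `remaining_execution_window_ms` are hypothetical, the limits are copied from the policy table in the diff, and treating those limits as inclusive bounds is an assumption.

```python
# Limits copied from the policy table in the diff; all values are in milliseconds.
MS_PER_SECOND = 1000
MS_PER_HOUR = 60 * 60 * MS_PER_SECOND
MS_PER_DAY = 24 * MS_PER_HOUR

EXECUTION_TIMEOUT_MIN_MS = 5 * MS_PER_SECOND   # "Min 5 seconds"
TTL_MIN_MS = 10 * MS_PER_SECOND                # "Min 10 seconds"
POLICY_MAX_MS = 7 * MS_PER_DAY                 # "max 7 days" for both fields


def build_policy(execution_timeout_ms: int, expected_queue_ms: int) -> dict:
    """Hypothetical helper: size ttl to cover expected queue time plus execution time."""
    # Assumption: the documented minimum and maximum are inclusive.
    if not EXECUTION_TIMEOUT_MIN_MS <= execution_timeout_ms <= POLICY_MAX_MS:
        raise ValueError("executionTimeout must be between 5 seconds and 7 days")
    ttl_ms = execution_timeout_ms + expected_queue_ms
    if not TTL_MIN_MS <= ttl_ms <= POLICY_MAX_MS:
        raise ValueError("ttl must be between 10 seconds and 7 days")
    return {"executionTimeout": execution_timeout_ms, "ttl": ttl_ms}


def remaining_execution_window_ms(ttl_ms: int, queue_wait_ms: int) -> int:
    """TTL left for execution after queueing, since the TTL timer starts at submission."""
    return max(0, ttl_ms - queue_wait_ms)


# 48 hours of runtime plus 24 hours of queue headroom, matching the JSON example in the diff.
policy = build_policy(48 * MS_PER_HOUR, 24 * MS_PER_HOUR)
```

Under these assumptions, `policy` carries the same `executionTimeout` and `ttl` values as the JSON request body in the diff, and `remaining_execution_window_ms(MS_PER_HOUR, 45 * 60 * MS_PER_SECOND)` reproduces the 1-hour TTL example, where 15 minutes remain after a 45-minute queue wait.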