Move ReduceOperation function calls generation to SDK and support adaptive replay mode in durable ID computation algo #514

Merged
eabatalov merged 2 commits into main from eugene/move-reduce-gen-to-sdk
Feb 5, 2026

@eabatalov eabatalov commented Feb 4, 2026

Two changes:

Move ReduceOperation function calls generation to SDK

The Server currently generates function call chains for ReduceOp. We move this to the SDK because the SDK owns durable ID generation for function calls. The Reduce operations themselves get executed exactly the same way after this change. The only difference is that executor update messages now carry a bit more data per reduced item: an SDK-generated function call message that wraps each reduced item.
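A minimal sketch of what generating such a chain could look like on the SDK side. The `FunctionCall` class, `build_reduce_calls`, and `evaluate` are hypothetical names used only for illustration; they are not the SDK's actual messages or API. Each reduced item is wrapped in one function call whose accumulator is either the initial value or the previous call in the chain.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass(frozen=True)
class FunctionCall:
    """Hypothetical stand-in for an SDK-generated function call message."""
    function_name: str
    accumulator: Any  # initial value, or the previous FunctionCall in the chain
    item: Any         # the reduced item this call wraps


def build_reduce_calls(function_name: str, items: Sequence[Any], initial: Any) -> List[FunctionCall]:
    """Wrap every reduced item in a function call chained to the previous one."""
    calls: List[FunctionCall] = []
    accumulator: Any = initial
    for item in items:
        call = FunctionCall(function_name, accumulator, item)
        calls.append(call)
        accumulator = call  # the next call consumes this call's output
    return calls


def evaluate(call: FunctionCall, fn: Callable[[Any, Any], Any]) -> Any:
    """Resolve a chain by running fn from the innermost call outwards."""
    acc = call.accumulator
    if isinstance(acc, FunctionCall):
        acc = evaluate(acc, fn)
    return fn(acc, call.item)


calls = build_reduce_calls("sum", [1, 2, 3], 0)
total = evaluate(calls[-1], lambda a, b: a + b)
```

Evaluating the last call of the chain gives the same result as `functools.reduce` over the same items (here `total` is 6), which matches the claim that the operation itself runs the same way as before.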

We will remove ReduceOp from the protos and the ReduceOp function call generation code from the Server once all our users have migrated to this SDK version.

Change Durable ID computation algo to support Adaptive replay

Include the ID of the previous child function call in the ID of the next child function call.
This ensures that if a function changes its execution path during
a request replay, then every function call from the start
of the changed path onward gets re-executed. Without this, an unchanged
child function call that comes after a changed child function call could get
the same durable ID as before. The later child function call would then
not be re-executed, which is a bug: everything after the point where a replayed
function's execution path changes must be re-executed. This is especially important because we don't
hash user-supplied parameters in function calls. If the user passes a value obtained
from a different function call into the unchanged function call and we don't re-execute it,
we break the application.
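A minimal sketch of the chaining idea, assuming a SHA-256 hash over the parent ID, the previous child's ID, and the function name. The exact fields and hash used by the SDK are not stated here, so `durable_id` and `durable_ids` are hypothetical. As described above, call parameters are deliberately left out of the hash.

```python
import hashlib
from typing import List, Optional


def durable_id(parent_id: str, previous_child_id: Optional[str], function_name: str) -> str:
    """Hash the parent ID, the previous child's ID, and the function name.

    Assumption: these three fields are illustrative; parameters are not hashed.
    """
    h = hashlib.sha256()
    for part in (parent_id, previous_child_id or "", function_name):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent fields cannot run together
    return h.hexdigest()


def durable_ids(parent_id: str, function_names: List[str]) -> List[str]:
    """Compute IDs for a sequence of child calls, chaining each to the previous one."""
    ids: List[str] = []
    previous: Optional[str] = None
    for name in function_names:
        previous = durable_id(parent_id, previous, name)
        ids.append(previous)
    return ids


original = durable_ids("request-1", ["load", "transform", "store"])
replayed = durable_ids("request-1", ["load", "filter", "store"])
```

In this sketch the `load` call keeps its ID across the replay, while `store` gets a new ID even though it is unchanged itself, because the call before it changed. A new ID means the call is re-executed rather than served from the previous run.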

Also scope awaitable sequence numbers to each awaitable tree separately. The sequence number can now
restart from 0 for every awaitable tree because durable awaitable IDs now include the previous awaitable ID.
A sequence number shared across awaitable trees is no longer needed.
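A minimal sketch of why restarting the counter is safe, assuming the ID chained into each tree is the root ID of the previous awaitable tree and that IDs are SHA-256 hashes. `awaitable_id` and `tree_ids` are hypothetical helpers, not the SDK's API.

```python
import hashlib
from typing import List


def awaitable_id(previous_root_id: str, sequence_number: int, function_name: str) -> str:
    """Hash the previous tree's root ID with a per-tree sequence number (illustrative fields)."""
    payload = "\x00".join((previous_root_id, str(sequence_number), function_name))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def tree_ids(previous_root_id: str, function_names: List[str]) -> List[str]:
    """IDs for one awaitable tree; function_names[0] is the root and the counter starts at 0."""
    return [
        awaitable_id(previous_root_id, seq, name)
        for seq, name in enumerate(function_names)
    ]


first_tree = tree_ids("", ["root", "leaf"])
# The second tree reuses sequence numbers 0 and 1 but chains in the first tree's root ID.
second_tree = tree_ids(first_tree[0], ["root", "leaf"])
```

Both trees use the same function names and the same sequence numbers, yet all four IDs differ because the second tree's IDs depend on the first tree's root ID.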

The Server currently does this. We move this to the SDK because it owns
generating durable IDs for function calls. The Reduce operations
themselves get executed exactly the same way after this change.
The only difference is that executor update messages now carry a bit more
data per reduced item: an SDK-generated function call message
that wraps each reduced item.
@eabatalov eabatalov force-pushed the eugene/move-reduce-gen-to-sdk branch from 68c7e54 to 4458b36 February 4, 2026 19:39
Include the ID of the previous awaitable tree root in durable IDs.
This ensures that if a function changes its execution path during
a request replay, then every function call from the start
of the changed path onward gets re-executed. This is important because
we don't include function parameter hashes in durable IDs, so we
have to re-execute everything starting from the changed point
to ensure correctness.

Also scope awaitable sequence numbers to each awaitable tree
separately. The sequence number can restart from 0 for every
awaitable tree because durable awaitable IDs now include the
previous awaitable ID.
@eabatalov eabatalov changed the title from "Move ReduceOperation function calls generation to SDK" to "Move ReduceOperation function calls generation to SDK and support adaptive replay mode in durable ID computation algo" Feb 4, 2026
@eabatalov eabatalov merged commit 78d7394 into main Feb 5, 2026
6 checks passed
@eabatalov eabatalov deleted the eugene/move-reduce-gen-to-sdk branch February 5, 2026 11:39