Skip to content

Conversation

@icecrasher321
Copy link
Collaborator

Summary

Locally workflow aborts work. But in Prod/Staging envs the abort signal does not actually propagate correctly. Need to use an executor set / redis to make this work. This PR does it using the standard pattern of redis with in memory fallback we have.

Type of Change

  • Bug fix

Testing

Tested both paths manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@icecrasher321 icecrasher321 requested a review from Sg312 December 24, 2025 09:40
@vercel
Copy link

vercel bot commented Dec 24, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Review Updated (UTC)
docs Skipped Skipped Dec 24, 2025 10:07am

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 24, 2025

Greptile Summary

  • Implements distributed workflow cancellation system using Redis with in-memory fallback to fix cancellation failures in production/staging environments
  • Removes direct executor instance tracking through onExecutorCreated callbacks and replaces with external cancellation API endpoint
  • Adds async cancellation checks in execution engine to query distributed cancellation state at critical execution points

Important Files Changed

Filename Overview
apps/sim/lib/execution/cancellation.ts New module implementing Redis-backed cancellation storage with TTL and in-memory fallback
apps/sim/executor/execution/engine.ts Added async cancellation checks that query distributed state instead of relying on local context
apps/sim/app/api/workflows/[id]/execute/cancel/route.ts New API endpoint for requesting workflow cancellation via external system
apps/sim/executor/execution/executor.ts Added setter to isCancelled property to enable external cancellation mechanisms
apps/sim/app/api/workflows/[id]/execute/route.ts Refactored to remove direct executor tracking and integrate with new cancellation system

Confidence score: 4/5

  • This PR addresses a critical production bug with a well-architected solution that follows established codebase patterns
  • Score reflects good implementation but potential for race conditions and the type safety issue in the cancel route handler
  • Pay close attention to apps/sim/lib/execution/cancellation.ts for Redis fallback logic and apps/sim/app/api/workflows/[id]/execute/cancel/route.ts for the type safety issue

Sequence Diagram

sequenceDiagram
    participant User
    participant "Route Handler" as Route
    participant "Auth System" as Auth
    participant "Preprocessing" as Preprocess
    participant "Execution Core" as Core
    participant "DAG Executor" as DAG
    participant "Execution Engine" as Engine
    participant "Cancellation Service" as Cancel
    participant "Logging Session" as Log

    User->>Route: "POST /api/workflows/[id]/execute"
    Route->>Auth: "checkHybridAuth(req)"
    Auth-->>Route: "auth result"
    Route->>Route: "parse and validate request body"
    Route->>Preprocess: "preprocessExecution()"
    Preprocess-->>Route: "workflow record & actor user"
    Route->>Core: "executeWorkflowCore()"
    Core->>Log: "loggingSession.safeStart()"
    Core->>Core: "load workflow state"
    Core->>DAG: "new Executor()"
    Core->>DAG: "execute(workflowId, triggerBlockId)"
    DAG->>Engine: "run(triggerBlockId)"
    
    loop "While has work"
        Engine->>Cancel: "isCancellationRequested(executionId)"
        Cancel-->>Engine: "cancellation status"
        alt "not cancelled"
            Engine->>Engine: "processQueue()"
            Engine->>Engine: "executeNode()"
        else "cancelled"
            Engine->>Engine: "break execution loop"
        end
    end
    
    Engine-->>DAG: "execution result"
    DAG-->>Core: "execution result"
    Core->>Log: "safeComplete() or safeCompleteWithError()"
    Core-->>Route: "execution result"
    Route-->>User: "JSON response"
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +241 to +245
fetch(`/api/workflows/${workflowId}/execute/cancel`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ executionId }),
}).catch(() => {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: The catch block silently ignores cancellation API errors. Consider logging the error to help with debugging cancellation issues in production.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/sim/hooks/use-execution-stream.ts
Line: 241:245

Comment:
**style:** The catch block silently ignores cancellation API errors. Consider logging the error to help with debugging cancellation issues in production.

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

return NextResponse.json({ error: auth.error || 'Unauthorized' }, { status: 401 })
}

let body: any = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Avoid using any type. Use unknown instead for better type safety

Suggested change
let body: any = {}
let body: unknown = {}

Context Used: Context from dashboard - TypeScript conventions and type safety (source)

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/sim/app/api/workflows/[id]/execute/cancel/route.ts
Line: 21:21

Comment:
**style:** Avoid using `any` type. Use `unknown` instead for better type safety

```suggestion
  let body: unknown = {}
```

**Context Used:** Context from `dashboard` - TypeScript conventions and type safety ([source](https://app.greptile.com/review/custom-context?memory=b4f0be8d-a787-4d5a-9098-a66b1449df25))

How can I resolve this? If you propose a fix, please make it concise.


while (this.hasWork()) {
if (this.context.isCancelled && this.executing.size === 0) {
if ((await this.checkCancellation()) && this.executing.size === 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the best place to put this?

private async processQueue(): Promise<void> {
while (this.readyQueue.length > 0) {
if (this.context.isCancelled) {
if (await this.checkCancellation()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check needs to move into redis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants