Collaborator

@icecrasher321 icecrasher321 commented Dec 23, 2025

Summary

Makes the workflow execution state machine explicit. State is no longer derived during execution or when persisting logs. When an execution is cancelled, the block that is executing at that moment runs to completion: the execution keeps the running status until that block finishes, and only then transitions to cancelled.

Type of Change

  • Other: Code Improvement

Testing

Tested manually.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Contributor

greptile-apps bot commented Dec 23, 2025

Greptile Summary

Refactored workflow execution state management from derived states to an explicit state machine with a dedicated status column tracking execution lifecycle (running → completed/failed/cancelled/pending). The change eliminates ambiguity in determining execution state and properly handles cancellation scenarios.
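The lifecycle described above can be sketched as a small transition table. This is only an illustration: the names `ExecutionStatus`, `TRANSITIONS`, `canTransition`, and `transition` are hypothetical and do not come from the PR, and the edges out of pending (resume to running, or fail) are an assumption based on the human-in-the-loop description in this comment.

```typescript
// Status values are taken from the summary; all identifiers are hypothetical.
type ExecutionStatus = 'running' | 'pending' | 'completed' | 'failed' | 'cancelled'

// Assumed transition table. Terminal states have no outgoing edges.
const TRANSITIONS: Record<ExecutionStatus, readonly ExecutionStatus[]> = {
  running: ['completed', 'failed', 'cancelled', 'pending'],
  pending: ['running', 'failed'], // assumption: a paused run resumes or fails
  completed: [],
  failed: [],
  cancelled: [],
}

// True if the state machine allows moving from `from` to `to`.
function canTransition(from: ExecutionStatus, to: ExecutionStatus): boolean {
  return TRANSITIONS[from].includes(to)
}

// Returns the new status, or throws if the move is not allowed.
function transition(from: ExecutionStatus, to: ExecutionStatus): ExecutionStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`)
  }
  return to
}
```

With an explicit table like this, an attempt to move a cancelled execution back to running fails loudly instead of being inferred from other columns.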

Key Changes:

  • Added status column to workflow_execution_logs table to explicitly track execution state instead of deriving it from level and endedAt
  • Implemented completeWithCancellation() in LoggingSession to handle cancelled executions distinctly from failures
  • Added completed flag to LoggingSession to prevent duplicate completion calls (idempotency)
  • Modified ExecutionEngine to check isCancelled flag and return status='cancelled' when cancellation is detected
  • Enhanced cancellation flow to allow currently executing blocks to finish before transitioning to cancelled state
  • Fixed race condition in chat execution cancellation by tracking active execution ID via currentChatExecutionIdRef
  • Replaced AbortController with stream reader cancellation for better control of streaming cleanup
  • Updated human-in-the-loop manager to maintain status as pending while paused and running during resume
  • Improved error handling in background execution jobs with proper logging session completion

Migration:

  • Backfills existing logs: level='error' → status='failed', endedAt IS NOT NULL → status='completed', otherwise status='running'
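The backfill rule above amounts to a short precedence check. The sketch below restates it in TypeScript for illustration only; the actual migration is SQL, `LegacyLogRow` and `deriveLegacyStatus` are hypothetical names, and the assumption is that the error check takes precedence over the endedAt check, as the order of the rule suggests.

```typescript
type BackfilledStatus = 'failed' | 'completed' | 'running'

// Hypothetical shape of a pre-migration log row (only the two inputs used).
interface LegacyLogRow {
  level: string
  endedAt: Date | null
}

// Assumed precedence: an error level wins even if endedAt is set.
function deriveLegacyStatus(row: LegacyLogRow): BackfilledStatus {
  if (row.level === 'error') return 'failed'
  if (row.endedAt !== null) return 'completed'
  return 'running'
}
```

Note that the old data cannot produce cancelled or pending; those values only appear for executions logged after the migration.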

The refactoring improves clarity and correctness of execution state tracking throughout the system.

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations for testing edge cases
  • The state machine refactoring is well-designed, with a proper migration and idempotency guards. The explicit status tracking eliminates ambiguity, and the cancellation flow correctly allows in-flight blocks to complete. One concern is the completed flag in LoggingSession, which could cause issues if a session object is reused, though this appears unlikely given the usage patterns. The race condition fix in chat execution is solid. All changes maintain backward compatibility through the migration.
  • Pay close attention to apps/sim/lib/logs/execution/logging-session.ts to verify the completed flag behavior in all execution paths

Important Files Changed

Filename: Overview

  • packages/db/migrations/0132_dazzling_leech.sql: Added status column to track explicit execution state (running/completed/failed/cancelled), backfilled with correct values based on existing data
  • apps/sim/lib/logs/execution/logging-session.ts: Added explicit state machine with completeWithCancellation() method and completed flag to prevent duplicate completions
  • apps/sim/executor/execution/engine.ts: Added cancellation checks in execution loop, returns 'cancelled' status when isCancelled flag is set
  • apps/sim/lib/workflows/executor/execution-core.ts: Routes execution to appropriate logging completion method based on status (cancelled/paused/completed)
  • apps/sim/app/workspace/[workspaceId]/w/[workflowId]/hooks/use-workflow-execution.ts: Fixed race condition in chat execution cancellation by tracking active execution ID and preventing stale cleanup operations
  • apps/sim/lib/workflows/executor/human-in-the-loop-manager.ts: Updates execution log status to pending/running/failed as pause points are registered, resumed, or fail
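The stale-cleanup guard described for use-workflow-execution.ts can be illustrated without React. In this sketch only the name `currentChatExecutionIdRef` comes from the PR; the plain `Ref` object, `beginChatExecution`, and `cleanupChatExecution` are hypothetical stand-ins, and the real hook may structure this differently.

```typescript
// Minimal stand-in for a React ref so the sketch needs no React import.
interface Ref<T> {
  current: T
}

const currentChatExecutionIdRef: Ref<string | null> = { current: null }

// Marks `executionId` as the execution that currently owns the chat UI state.
function beginChatExecution(executionId: string): void {
  currentChatExecutionIdRef.current = executionId
}

// Runs cleanup only if the finishing execution is still the active one.
// Returns false when the call is stale and was ignored.
function cleanupChatExecution(executionId: string, resetState: () => void): boolean {
  if (currentChatExecutionIdRef.current !== executionId) {
    return false // a newer execution has started; leave its state alone
  }
  currentChatExecutionIdRef.current = null
  resetState()
  return true
}
```

Without the ID comparison, a cancelled execution that finishes late could reset the state of a run the user has already started.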

Sequence Diagram

sequenceDiagram
    participant User
    participant UI as UI Component
    participant Hook as useWorkflowExecution
    participant Core as execution-core.ts
    participant Engine as ExecutionEngine
    participant Session as LoggingSession
    participant DB as Database

    User->>UI: Clicks Run or Cancel
    
    alt Start Execution
        UI->>Hook: handleRunWorkflow()
        Hook->>Session: safeStart()
        Session->>DB: INSERT with status='running'
        Hook->>Core: executeWorkflowCore()
        Core->>Engine: execute()
        
        loop While hasWork()
            Engine->>Engine: Check isCancelled flag
            alt Not Cancelled
                Engine->>Engine: processQueue()
                Engine->>Engine: executeNodeAsync()
            else Cancelled and executing.size === 0
                Engine-->>Engine: Break loop
            end
        end
        
        Engine-->>Core: ExecutionResult with status
        
        alt Status: cancelled
            Core->>Session: safeCompleteWithCancellation()
            Session->>DB: UPDATE status='cancelled'
        else Status: paused
            Core->>Session: (Skip completion, keep running)
            Session->>DB: UPDATE status='pending'
        else Status: completed/failed
            Core->>Session: safeComplete/safeCompleteWithError()
            Session->>DB: UPDATE status='completed/failed'
        end
        
        Core-->>Hook: result
        Hook-->>UI: Update execution state
    else Cancel Execution
        UI->>Hook: handleCancelExecution()
        Hook->>Hook: Set context.isCancelled = true
        Hook->>Hook: Cancel stream reader
        Hook->>Hook: Reset execution state
        Note over Engine: Currently executing block<br/>continues to finish
        Engine->>Engine: Check isCancelled on next iteration
        Engine-->>Core: status='cancelled'
        Core->>Session: completeWithCancellation()
        Session->>DB: UPDATE status='cancelled'
    end
    
    alt Resume from Pause
        User->>UI: Provides input for paused execution
        UI->>Hook: Resume execution
        Hook->>DB: UPDATE status='running'
        Hook->>Core: executeWorkflowCore (resume mode)
        Core->>Engine: execute()
        Note over Engine: Continues from snapshot
        Engine-->>Core: ExecutionResult
        Core->>Session: Update parent execution log
        Session->>DB: UPDATE status based on result
    end
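
The cancellation behaviour in the diagram (the in-flight block finishes, and the flag is checked on the next iteration) can be sketched as a simplified sequential loop. This is not the ExecutionEngine code: `runBlocks`, `RunContext`, and `Block` are hypothetical, the real engine tracks a set of concurrently executing nodes, and reporting cancelled when the flag is set during the final block is an assumption.

```typescript
type RunStatus = 'completed' | 'failed' | 'cancelled'

// Shared context; the cancel handler flips isCancelled from outside the loop.
interface RunContext {
  isCancelled: boolean
}

type Block = () => Promise<void>

async function runBlocks(blocks: Block[], context: RunContext): Promise<RunStatus> {
  for (const block of blocks) {
    // Cancellation is observed only between blocks, never mid-block,
    // so a block that has already started always runs to completion.
    if (context.isCancelled) return 'cancelled'
    try {
      await block()
    } catch {
      return 'failed'
    }
  }
  // Assumption: a cancel that arrives during the last block still
  // yields 'cancelled' rather than 'completed'.
  return context.isCancelled ? 'cancelled' : 'completed'
}
```

The caller would then route the returned status to the matching completion method, so the log row stays running until the loop has actually stopped.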

@icecrasher321
Collaborator Author

@greptile

@icecrasher321 icecrasher321 merged commit 8c89507 into staging Dec 24, 2025
11 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/logging-state-machine branch December 24, 2025 02:54