Implement persistent shells for improved performance#960

Draft
GlassOfWhiskey wants to merge 1 commit into master from persistent-shell
GlassOfWhiskey (Member) commented Feb 20, 2026

This commit introduces a persistent shell architecture that significantly speeds up remote command execution in StreamFlow by reusing shell sessions instead of creating new processes for each command. Key features are:

  • Core Shell Architecture:

    • Add Shell abstract base class in core/deployment.py defining the interface for persistent shell sessions
    • Implement BaseShell in deployment/shell.py with command execution, output capture, and lifecycle management
    • Create specialized shell implementations for different connector types (Base, SSH, Docker, Kubernetes)
  • Connector Integration:

    • Add the get_shell() method to the Connector interface for obtaining persistent shell instances
    • Implement shell caching and reuse in BaseConnector with thread-safe access via locks
    • Update run() methods across connectors to automatically use persistent shells when possible
    • Add graceful fallback to direct execution if shell operations fail
  • Connector-Specific Implementations:

    • SubprocessShell: Local shell execution with asyncio subprocess pipes
    • SSHShell: Remote shell over SSH connections using asyncssh
    • KubernetesShell: Shell execution in pods via WebSocket streams
    • Docker/Singularity: Shell execution in containers via docker exec and singularity exec wrappers
  • Performance Optimizations:

    • Reuse shell sessions across multiple commands to reduce connection overhead
    • Implement buffered I/O with configurable buffer sizes
    • Use unique end markers to reliably detect command completion
  • Remote Path Improvements:

    • Optimize glob() implementation in RemoteStreamFlowPath to reduce command overhead
    • Fix shell quoting issues in rmtree() command
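
The session reuse and end-marker items above can be illustrated with a minimal sketch. It is not the code from this pull request: the class name `PersistentShell`, its methods, and the marker format are hypothetical, and it assumes a POSIX `sh` is available locally. It keeps one shell process alive, writes each command to its stdin, and reads stdout until a unique marker line carrying the exit status appears.

```python
import asyncio
import uuid


class PersistentShell:
    """Hypothetical sketch: one long-lived `sh` process reused for many commands."""

    def __init__(self):
        self._proc = None
        self._lock = asyncio.Lock()  # one command at a time per shell session

    async def start(self):
        self._proc = await asyncio.create_subprocess_exec(
            "sh",
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
        )

    async def run(self, command):
        """Run `command` in the existing session; return (output, exit_code)."""
        marker = f"__END_{uuid.uuid4().hex}__"
        # The subshell keeps `exit` from killing the session, and /dev/null
        # stops the command from consuming the script fed through stdin.
        # The printf starts a fresh line, then emits "<marker> <exit code>".
        script = f"( {command} ) </dev/null\nprintf '\\n%s %s\\n' {marker} $?\n"
        async with self._lock:
            self._proc.stdin.write(script.encode())
            await self._proc.stdin.drain()
            lines = []
            while True:
                line = (await self._proc.stdout.readline()).decode()
                if not line:
                    raise RuntimeError("shell exited unexpectedly")
                if line.startswith(marker):
                    code = int(line.split()[1])
                    break
                lines.append(line)
        # Drop the single newline that printf added before the marker.
        return "".join(lines)[:-1], code

    async def close(self):
        if self._proc is not None:
            self._proc.stdin.close()  # EOF on stdin makes sh exit
            await self._proc.wait()
            self._proc = None


async def demo():
    shell = PersistentShell()
    await shell.start()
    try:
        return [
            await shell.run("echo hello"),
            await shell.run("printf abc; exit 3"),
        ]
    finally:
        await shell.close()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Both commands in `demo()` run in the same `sh` process, so the process start-up cost is paid once. A random marker per command makes it unlikely that ordinary command output is mistaken for the completion signal.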

This commit also resolves a race condition caused by concurrent updates of file tokens in the CWLTokenProcessor class, and fixes race conditions in the stream close logic when transferring files through tar streams.
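
The lock-guarded shell caching listed under Connector Integration can be sketched in the same spirit. This is an assumption-laden illustration, not the `BaseConnector` code: `ShellCache`, `factory`, and the use of `asyncio.Lock` are hypothetical choices made for the example. The point is that concurrent callers asking for a shell on the same location should trigger only one session setup.

```python
import asyncio


class ShellCache:
    """Hypothetical sketch of per-location shell caching guarded by a lock."""

    def __init__(self, factory):
        self._factory = factory  # async callable: location -> shell object
        self._shells = {}
        self._lock = asyncio.Lock()

    async def get_shell(self, location):
        # Holding the lock across the lookup and the creation prevents two
        # concurrent callers from both opening a session for one location.
        async with self._lock:
            shell = self._shells.get(location)
            if shell is None:
                shell = await self._factory(location)
                self._shells[location] = shell
            return shell


async def demo():
    created = 0

    async def factory(location):
        nonlocal created
        await asyncio.sleep(0.01)  # simulate slow connection setup
        created += 1
        return object()

    cache = ShellCache(factory)
    shells = await asyncio.gather(*(cache.get_shell("node-1") for _ in range(5)))
    return created, all(s is shells[0] for s in shells)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A single global lock is the simplest option; a per-location lock would let sessions for different locations be created in parallel, at the cost of a little more bookkeeping.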

codecov bot commented Feb 20, 2026

❌ 2 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 1233 | 2 | 1231 | 8 |
View the top 2 failed test(s) by shortest run time
cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/conformance_tests.cwltest.yaml::conformance_tests::cwltest::yaml::filesarray_secondaryfiles
Stack Traces | 4.19s run time
CWL test execution failed. 
Returned non-zero but it should be zero
Test: job: 
  file:.../streamflow/streamflow/cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/tests/docker-array-secondaryfiles-job.json
output:
  bai_list:
    checksum: sha1$081fc0e57d6efa5f75eeb237aab1d04031132be6
    location: fai.list
    class: File
    size: 386
tool: 
  file:.../streamflow/streamflow/cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/tests/docker-array-secondaryfiles.cwl
id: 
  file:.../streamflow/streamflow/cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/conformance_tests.cwltest.yaml#filesarray_secondaryfiles
doc: Test required, optional and null secondaryFiles on array of files.
tags:
- docker
- inline_javascript
- command_line_tool
line: '1320'
cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/conformance_tests.cwltest.yaml::conformance_tests::cwltest::yaml::initial_workdir_secondary_files_expr
Stack Traces | 6.62s run time
CWL test execution failed. 
Returned non-zero but it should be zero
Test: job: 
  file:.../streamflow/streamflow/cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/tests/search-job.json
output:
  outfile:
    class: File
    checksum: sha1$e2dc9daaef945ac15f01c238ed2f1660f60909a0
    location: result.txt
    size: 142
  indexedfile:
    location: input.txt
    class: File
    checksum: sha1$327fc7aedf4f6b69a42a7c8b808dc5a7aff61376
    secondaryFiles:
    - location: input.txt.idx1
      class: File
      checksum: sha1$1f6fe811644355974cdd06d9eb695d6e859f3b44
      size: 1500
    - location: input.idx2
      class: File
      checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
      size: 0
    - location: input.txt.idx3
      class: File
      checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
      size: 0
    - location: input.txt.idx4
      class: File
      checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
      size: 0
    - location: input.txt.idx5
      class: File
      checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
      size: 0
    - location: input.idx6.txt
      class: File
      checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
      size: 0
    - checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
      class: File
      location: input.txt.idx7
      size: 0
    - checksum: sha1$47a013e660d408619d894b20806b1d5086aab03b
      class: File
      location: hello.txt
      size: 13
    - class: Directory
      listing:
      - basename: index
        checksum: sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709
        class: File
        location: index
        size: 0
      location: input.txt_idx8
    size: 1111
tool: 
  file:.../streamflow/streamflow/cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/tests/search.cwl#main
id: 
  file:.../streamflow/streamflow/cwl-v1.3-e61e6e2c37032ac6675b3ef8664f60c93bbc810a/conformance_tests.cwltest.yaml#initial_workdir_secondary_files_expr
doc: Test InitialWorkDirRequirement linking input files and capturing 
  secondaryFiles on input and output. Also tests the use of a variety of 
  parameter references and expressions in the secondaryFiles field.
tags:
- initial_work_dir
- inline_javascript
- workflow
line: '512'


@GlassOfWhiskey GlassOfWhiskey force-pushed the persistent-shell branch 2 times, most recently from ca1f57a to 5f1d8a6 Compare February 20, 2026 17:24
@GlassOfWhiskey GlassOfWhiskey force-pushed the persistent-shell branch 6 times, most recently from 2fe08b9 to d122de8 Compare February 20, 2026 19:52
@GlassOfWhiskey GlassOfWhiskey changed the title from "Added persistent shell" to "Implement persistent shells for improved performance" Feb 20, 2026
@GlassOfWhiskey GlassOfWhiskey force-pushed the persistent-shell branch 19 times, most recently from 5e098c3 to ffbd7e7 Compare February 21, 2026 08:43
@GlassOfWhiskey GlassOfWhiskey force-pushed the persistent-shell branch 3 times, most recently from f20ace3 to 8810c6b Compare February 21, 2026 09:01
@GlassOfWhiskey GlassOfWhiskey force-pushed the persistent-shell branch 19 times, most recently from d8301c5 to cdb8ab0 Compare February 21, 2026 15:10