A different way of executing GEOS from SWELL by Dooruk · Pull Request #643 · GEOS-ESM/swell

Dooruk · 2025-10-17T20:43:33Z

To address #625, I worked on a different way of executing GEOSgcm in SWELL. I had to make a lot of changes, but I will try to explain the essential part of this PR.

Let's consider gcm_run.j in 4 stages:

1. SLURM & node assignment
2. Preprocessing
3. Execution
4. Postprocessing

In the current implementation, SWELL handles stages 2 and 3 via Python and subprocess, and stage 1 is assumed to be set properly by the user, which caused trouble with the NCCS. For DA purposes, stage 4 (postprocessing) is explicitly handled by SWELL, but that is not the focus of this PR.

In this proposed implementation, the main difference is that we rely on gcm_run.j for stages 2 and 3 by conducting surgical edits via PrepCoupledGeosRundir at a few locations and running gcm_run.j directly from Cylc (which doesn't capture a failed exit status):

    [[RunGeos]]
        script = "{{experiment_path}}/forecast/gcm_run.j"
        platform = {{platform}}
        [[[job]]]
            shell = /bin/csh
        [[[directives]]]
        {%- for key, value in scheduling["RunGeos"]["directives"]["all"].items() %}
            --{{key}} = {{value}}
        {%- endfor %}
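
For readers less familiar with Jinja, the loop in the `[[[directives]]]` section can be approximated in Python as follows. This is only a rough sketch: `render_directives` is a hypothetical helper, and the directive names and values are made-up placeholders, not SWELL defaults.

```python
def render_directives(directives: dict, indent: str = " " * 12) -> list[str]:
    """Mimic the Jinja loop: emit one `--key = value` line per directive."""
    return [f"{indent}--{key} = {value}" for key, value in directives.items()]


# Placeholder values for illustration only (not taken from SWELL).
example_directives = {"account": "x0000", "nodes": 2, "time": "01:00:00"}
lines = render_directives(example_directives)
print("\n".join(lines))
```

Each entry of `scheduling["RunGeos"]["directives"]["all"]` therefore ends up as a single scheduler directive line in the rendered flow.cylc.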

I created the 3dfgat_coupled_cycle suite for testing; it should work by default if anyone has time to check it out.

Sorry for tagging a lot of people, but I would like to get some feedback on this, and I'm not sure my explanation here was sufficient. Otherwise, I can explain it during the next Swell Development Tag-up.

This PR also includes the idea of model-differentiated tasks (inspired by the Skylab workflow; @mranst already has a PR for it), so perhaps having tasks/model/ might be a better way of organizing the model-specific tasks, as he proposed.


Copilot

Pull Request Overview

A comprehensive refactoring of SWELL's GEOS execution approach, transitioning from Python subprocess-based execution to direct execution via Cylc with surgical modifications to GEOS configuration files.

Key changes include:

Replacing subprocess-based GEOS execution with direct gcm_run.j script execution from Cylc
Modifying forecast directory structure to be experiment-level rather than cycle-level
Adding new coupled GEOS tasks for improved DA workflow integration
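
To make "surgical edits" concrete, here is a minimal sketch of what one such edit could look like: replacing the value on a single csh `setenv` line and leaving everything else in the script untouched. The function name `edit_setenv`, the variable names, and the paths are assumptions for illustration; the PR does not state which lines of which files are actually edited.

```python
import re


def edit_setenv(script_text: str, variable: str, new_value: str) -> str:
    """Replace the value of one csh `setenv VARIABLE value` line only."""
    pattern = re.compile(
        rf"^([ \t]*setenv[ \t]+{re.escape(variable)}[ \t]+).*$", re.MULTILINE
    )
    # A callable replacement avoids backslash handling in the new value.
    new_text, count = pattern.subn(lambda m: m.group(1) + new_value, script_text)
    if count != 1:
        raise ValueError(f"expected exactly one 'setenv {variable}' line, found {count}")
    return new_text


# Hypothetical script fragment and paths, for illustration only.
sample = "#!/bin/csh -f\nsetenv EXPDIR /old/path\nsetenv HOMDIR /old/path\n"
edited = edit_setenv(sample, "EXPDIR", "/new/experiment/forecast")
print(edited)
```

Failing loudly when the target line is missing or duplicated is a deliberate choice in this sketch: a silent no-op edit would be hard to notice once the script is launched directly by Cylc.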

Reviewed Changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.


| File | Description |
| --- | --- |
| src/swell/utilities/slurm.py | Updates SLURM task handling for new RunGeos task |
| src/swell/utilities/geos.py | Enhanced RC file parsing, checkpoint handling, and error handling |
| src/swell/tasks/prep_coupled_geos_run_dir.py | New task for preparing coupled GEOS run directories with surgical edits |
| src/swell/tasks/base/task_base.py | Modified forecast directory structure to be experiment-level |
| src/swell/suites/3dfgat_coupled_cycle/ | New complete coupled cycling suite with direct GEOS execution |
| Various move/restart tasks | Updated for new directory structure and improved error handling |

Comments suppressed due to low confidence (1)

src/swell/tasks/move_erase_da_restart.py:1

Inconsistent destination directory - this line uses 'FORECAST' while other MOM restart operations use 'RESTART'. This appears to be a typo.

# (C) Copyright 2021- United States Government as represented by the Administrator of the


Copilot · 2025-10-17T20:45:57Z

src/swell/utilities/question_defaults.py

        ask_question: bool = True
-        prompt: str = ("Since this suite is non-cycling choose how "
-                       "many hours the workflow can run ahead?")
+        prompt: str = ("How many additional cycles can be run"


Missing trailing space before the closing quote in the prompt text.

Suggested change

-        prompt: str = ("How many additional cycles can be run"
+        prompt: str = ("How many additional cycles can be run "

Copilot · 2025-10-17T20:45:58Z

src/swell/utilities/geos.py

        number_of_states = int(isodate.parse_duration(window_length)
                               / isodate.parse_duration(background_frequency)) + 1
-        self.logger.info('Number of states: ', str(number_of_states-1))
+        self.logger.info('Number of states: ' + str(number_of_states-1))


[nitpick] String concatenation should use f-string formatting for consistency with modern Python practices.

Suggested change

-        self.logger.info('Number of states: ' + str(number_of_states-1))
+        self.logger.info(f'Number of states: {number_of_states-1}')
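
As a worked illustration of the calculation and the logging call, here is a small sketch using only the standard library. It assumes a 6-hour window with hourly backgrounds (made-up values) and uses `datetime.timedelta` in place of `isodate.parse_duration`. It also uses the standard `logging` module, where an extra positional argument without a matching `%s` placeholder triggers a formatting error; SWELL's own logger may behave differently.

```python
import logging
from datetime import timedelta


def number_of_states(window_length: timedelta, background_frequency: timedelta) -> int:
    """Number of background states in the window, counting both end points."""
    return int(window_length / background_frequency) + 1


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("states_example")

# Assumed values: 6-hour window, hourly backgrounds.
n = number_of_states(timedelta(hours=6), timedelta(hours=1))

# Lazy %-style formatting: the argument is only formatted if the record is emitted.
logger.info("Number of states: %s", n - 1)
```

With these inputs `n` is 7, since dividing the two durations gives 6.0 and one is added for the closing end point.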

Copilot · 2025-10-17T20:45:58Z

src/swell/tasks/prep_geos_run_dir.py


-        with open(rcfile, "w") as f:
-            yaml.dump(rcdict, f, default_flow_style=False, sort_keys=False)
+        self.geos.write_rc(rcdict, rcfile)


The method now uses self.geos.write_rc() instead of yaml.dump(), but this change may affect the output format. Ensure the new method maintains backward compatibility with existing RC file consumers.

Copilot · 2025-10-17T20:45:58Z

src/swell/tasks/prep_coupled_geos_run_dir.py

+        # be very model specific
+        req_files = ['AGCM.rc', 'CAP.rc', 'data_table', 'diag_table', 'fvcore_layout.rc',
+                     'gcm_emip.setup', 'gcm_run.j', 'HISTORY.rc', '__init__.py', 'input.nml',
+                     'logging.yaml' , 'ice_in', 'MOM_input', 'MOM_override']


Extra space before comma in list item.

Suggested change

-                     'logging.yaml' , 'ice_in', 'MOM_input', 'MOM_override']
+                     'logging.yaml', 'ice_in', 'MOM_input', 'MOM_override']
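
One plausible use of a list like `req_files` is to check a run directory for missing files before launching the model. The sketch below shows that idea with a hypothetical helper, `find_missing_files`, and a shortened file list; how the task in this PR actually consumes `req_files` is not shown in the diff.

```python
import os
import tempfile


def find_missing_files(directory: str, required: list[str]) -> list[str]:
    """Return the required file names that are not present in `directory`."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(directory, name))
    ]


# Shortened list, for illustration only.
req_files = ['AGCM.rc', 'CAP.rc', 'gcm_run.j']

with tempfile.TemporaryDirectory() as tmp:
    # Create only two of the three files to simulate an incomplete directory.
    for name in ('AGCM.rc', 'CAP.rc'):
        open(os.path.join(tmp, name), 'w').close()
    missing = find_missing_files(tmp, req_files)

print(missing)
```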

Copilot · 2025-10-17T20:45:58Z

src/swell/suites/forecast_geos/flow.cylc


    [[RunGeosExecutable]]
-        script = "swell task RunGeosExecutable $config -d $datetime"
+        script = "{{experiment_path}}/forecast/gcm_run.j"


[nitpick] Direct script execution bypasses error handling that would be provided by the SWELL task wrapper. Consider adding error checking or validation mechanisms.

Suggested change

-        script = "{{experiment_path}}/forecast/gcm_run.j"
+        script = "swell task RunGeosExecutable $config -d $datetime"
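
A middle ground between the two options would be a thin validation step that does not rely on the exit status alone. The following is only a sketch under assumptions: `run_and_validate` is a hypothetical helper, and the marker string `MODEL FAILED` is invented for the example, not something gcm_run.j is known to print.

```python
import subprocess
import sys


def run_and_validate(command: list[str], failure_markers: tuple[str, ...]) -> str:
    """Run a command; raise if it exits non-zero or prints a failure marker."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout + result.stderr
    if result.returncode != 0:
        raise RuntimeError(f"command exited with status {result.returncode}")
    for marker in failure_markers:
        if marker in output:
            raise RuntimeError(f"found failure marker {marker!r} in output")
    return output


# Stand-in for a job that exits with status 0 even though it failed.
fake_job = [sys.executable, "-c", "print('step 1 ok'); print('MODEL FAILED')"]
try:
    run_and_validate(fake_job, failure_markers=("MODEL FAILED",))
except RuntimeError as err:
    print(f"caught: {err}")
```

Scanning the output is a workaround for scripts whose exit status is unreliable; if the script itself is fixed to exit non-zero on failure, the return-code check alone would suffice.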

Copilot · 2025-10-17T20:45:59Z

src/swell/deployment/launch_experiment.py


            # Launch the job monitor
-            self.logger.critical('Launching the TUI, press \'q\' at any time to exit the TUI')
+            self.logger.critical('Press any key to launch the TUI. To exit TUI, press \'q\' at any time.')


Missing space after period in the sentence.

DavidSRussell · 2025-11-03T16:16:57Z

Looks like the default 3dfgat_coupled_cycle ran fine for me. The only obvious differences I see are the location of the forecast directory and the time stamp on cycle_done. Not sure how well I understand the pros and cons of the new approach; maybe we can go over it some time?

DavidSRussell · 2025-12-01T15:01:58Z

Just a reminder that we also talked about including time series of the mean observation error, which would be useful for tracking changes in the altimetry processing.

jeromebarre · 2025-12-03T16:55:39Z

Hi, it seems that some commits in this PR address changes that are not part of the actual feature proposed.
For example:
7caae0c

Would it be possible to do a cleanup (or create a new branch with just the correct cherry-picked commits) so it's easier to review?

Thanks a lot!

Dooruk · 2025-12-15T20:14:32Z

see #677

Dooruk added 15 commits October 3, 2025 11:00

write out exceptions

dbdd033

minor changes, make TUI message clearer

5ed5f02

changes in moving files between forecast cycles

1ca8ce5

take out the linking task

2cef8cf

introduce simplified coupled tasks assuming direct execution of GEOS via Cylc

ad200b2

add tier 2 to be updated later

53495ef

few changes for geosv12

4484ca3

forecast suite changes

8399270

revamp files, more changes underway

6927ce5

define new coupled model only suite

6a74507

Merge branch 'develop' into origin/feature/exec_geos_direct_develop

04ee7ff

a few fixes

b9112e7

Merge branch 'develop' into origin/feature/exec_geos_direct_develop

909e9af

minor fixes

1610280

Merge branch 'develop' into origin/feature/exec_geos_direct_develop

9e8663b

Dooruk requested review from DavidSRussell, Copilot, jeromebarre, mer-a-o, mranst, rtodling and vrx- and removed request for rtodling October 17, 2025 20:44

Copilot AI reviewed Oct 17, 2025


Dooruk changed the title ~~A proposed way of executing GEOS from SWELL~~ A different way of executing GEOS from SWELL Oct 17, 2025

mathomp4 mentioned this pull request Oct 27, 2025

gcm_run.j doesn't exit on fail GEOS-ESM/GEOSgcm_App#774

Open

Dooruk added 2 commits November 19, 2025 10:42

observation yaml changes

7caae0c

some more changes combining obs

74a5fd9

Dooruk mentioned this pull request Dec 8, 2025

Allow for model-differentiated tasks #626

Merged

Dooruk closed this Dec 15, 2025

Dooruk wants to merge 17 commits into develop from origin/feature/exec_geos_direct_develop


3 participants
