Skip to content

Updated workflow templating#656

Draft
mranst wants to merge 216 commits intodevelopfrom
se/mranst/workflow_templating_redux2
Draft

Updated workflow templating#656
mranst wants to merge 216 commits intodevelopfrom
se/mranst/workflow_templating_redux2

Conversation

@mranst
Copy link
Collaborator

@mranst mranst commented Nov 10, 2025

This is the most recent version of the concept of updating the workflow templating to use python. This approach should be more straightforward and easier to understand than the previous version, as it uses the same jinja templating in the graph section to make it easier to read (though it doesn't necessarily have to). This pr also merges the task_runtimes and task_questions objects used by the previous pr into one object for simplicity. I'm aiming for this approach to allow for greater independence of the tasks from the suite creation system, to allow generating configs for single tasks and having mock values for code tests for each task, but that's not ready to show off yet.

I still need to merge develop into this and update the docs, but I wanted to open this up as a draft

@mranst
Copy link
Collaborator Author

mranst commented Jan 20, 2026

Setup declarations have been put in the main task files. I went with method 3 as that seemed to work best. Method 4 seemed to register the inherited classes under each instance that TaskSetup was imported under, so it didn't work to centrally track them. Method 5 worked but there wasn't a great way to robustly handle the naming of the classes without initializing them. I moved the external imports (eva, r2d2, jedi_bundle) into the execute methods of the tasks that need them, but there may be a better way of handling this

Copy link
Collaborator

@shiklomanov-an shiklomanov-an left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done! I really like how this is looking. I think having the task parameters in the same files as the task definitions will be very useful.

I flagged only a few (hopefully minor) things.

  • Some of these are type annotation issues flagged by my type checker.
  • There's also what looks like a bug related to override as a file path vs. a dictionary.
  • Finally, there are a few documentation requests for implementation details and design decisions, to help future developers (since we're using slightly esoteric Python machinery to get this working).

(Re: Local import documentation --- if we're just using local imports for consistency across modules, that's totally acceptable and you can disregard my comments about those...but maybe add a line to the documentation somewhere about why we're using that general pattern, with reference to specific cases like eva and r2d2).

@Dooruk @rtodling -- Take a look the new task definition structure (any of the files under tasks) and let me know what you think. Don't worry too much about the underlying machinery; I'm more interested on what you think of this overall structure (compared to having all the task questions sit in a single giant YAML file / Python script like we had before; #314).

@Dooruk Dooruk mentioned this pull request Jan 29, 2026
@Dooruk
Copy link
Collaborator

Dooruk commented Feb 5, 2026

Wow, this is some impressive amount of change. Really appreciate the documentation, it would be quite confusing without it.

When I first started using SWELL, I was able to include config variables inside the task scripts without the task_questions.yaml and suite_questions.yaml interference. Explaining that concept to the new users has always been challenging. Overall, I'm happy with the vision and design decisions here. Previous method made it hard to introduce new developers, hope this will simplify the SWELL entry point.

Overall, I'm happy with these changes. In the previous workflow templating iteration flow.cylc handling was really confusing, and this approach makes it more straightforward.

Abandoning single task_questions and suite_questions is a welcome change. Users would also see QuestionDefaults class is coming from utilities so that multi-file ambiguity is inherently handled.

Since we have a limited number of seasoned users, I would be curious to hear from Yonggang and Maryam too, once this is ready for that stage.

Some minor questions/details/nitpicks:

  1. Is this ready for testing yet?

  2. How can workflow.py handle non \tasks specified scripts? Is this PR limiting us with python-only scripts such as:

  • (More of a general comment but going forward) Could this new method handle, say, bash scripts?
  • Or calling outside scripts directly, like so:
    [[RunGeos]]
        script = "{{experiment_path}}/GEOSgcm/forecast/gcm_run.j"
        platform = {{platform}}
        [[[job]]]
            shell = /bin/csh
  • Model specific tasks organized under \tasks\model, following Michael's PR.
  1. Documentation states that [scheduling] section of flow.cylc remains the same, that means that part is still meant to handle conditional tasks and loops since jinja templating parses it?

  2. What are "messaging parameters"?

  3. I'm going to sound very pedantic and annoying, but is_model switch might be better named model_specific or model_dep? Considering how widespread this switch is I can live without this change :) In a similar vein, time_limit -> task_time_limit.

@mranst
Copy link
Collaborator Author

mranst commented Feb 5, 2026

@Dooruk Thanks! To answer your questions:

  1. Yes, this should be ready for testing. It's been a week or two but this PR has passed tier1 and tier2 previously
  2. The runtime section generation is geared to swell tasks for convenience, but you can easily run any kind of script by overriding the TaskSetup object. Your example would look like this: (you can just put this in the workflow.py file and reference it there, it doesn't need to be it's own task)
class GcmRunJ(TaskSetup):
    def set_attributes:
        self.base_name = "RunGeos"
        self.script = "{{experiment_path}}/GEOSgcm/forecast/gcm_run.j"
        self.slurm = {}
        self.additional_sections.append(
             self.create_new_section('job', {'shell': 'csh'})
        )

class Workflow:
...

    def set_tasks(self) -> list:
        self.tasks.append(GcmRunJ)

For development purposes, there's also no reason you couldn't just add the section in plain text to the scheduling

# Task defaults
# -------------
''' # noqa
# --------------------------------------------------------------------------------------------------

^ right here, for instance.

  1. Basically nothing is different in the scheduling section, cylc still handles it in the exact same way once flow.cylc is generated.

In all workflows I maintained that the scheduling section is templated using jinja, like so:

def get_workflow_string(self):
workflow_str = self.default_header()
workflow_str += template_string_jinja2(logger=self.logger,
templated_string=template_str,
dictionary_of_templates=self.experiment_dict,
allow_unresolved=True)
for task in self.tasks:
workflow_str += task.runtime_string(self.experiment_dict,
self.slurm_external)
return workflow_str

So that section of file construction is still handled in the same jinja logic. But technically, you can set the set_workfow_string method in any way you want, so there's room to get more advanced here if we wish

  1. Messaging parameters are the events for which cylc will email the user if the suite is launched with the -m flag. The relevant ones are failed and submit-failed. The ones that are designed to fail normally only have submit-failed as a option to not annoy the user. Suites can have succeeded which might be relevant as well.

  2. Good call, I'll rename them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants