
Use ruamel exclusively #682

Merged
Dooruk merged 15 commits into develop from feature/ruamel_eva_r2d2
Jan 29, 2026

Conversation

Dooruk (Collaborator) commented Jan 14, 2026

This switches SWELL's handling of YAML files fully to ruamel. To accommodate the JEDI executables, EVA, and R2D2, I had to use safe mode for loading and rt mode for dumping.

  • Loading in safe mode ensures that loaded YAML data consists of standard Python dict and list objects, preventing a ConstructorError when passing data to EVA, which still uses PyYAML.
  • For compatibility with JEDI, all tasks that generate JEDI-executable configuration files (run_jedi_*.py) set default_flow_style=False, which creates "Block Style" output (one entry per line). JEDI's C++ parsers are sensitive to the "Flow Style" (inline, JSON-like) that ruamel.yaml occasionally uses by default.
  • rt mode preserves the key order.
  • In cases where YAML content needs to be manipulated as a string (e.g., for environment variable expansion in create_experiment.py and dictionary.py), I used io.StringIO to capture the YAML output before further processing.

Checks most of the boxes:

  • Suites are running
  • Preserves YAML key order (NOT for EVA)
  • EVA and R2D2 work

Does not resolve the anchor request: #558

Helpful tip: in experiment.yaml, the generate_yaml_and_exit key helps with testing different ruamel modes.

Some notes for any future user who stumbles upon this PR:

  • An exception is made for observing_system_records.py, which uses "Flow Style" output. It improves readability and doesn't break the code.
  • EVA does not work with rt mode.
  • For aliases, GPT suggested using yaml.representer.ignore_aliases = lambda *data: True, which writes out the aliased content explicitly rather than assigning id*** anchors. This might be an answer for "Alternative handling of anchor variables" (#558).
  • Another switch I tested for safe mode with EVA is yaml.sort_base_mapping_type_on_output = False. This preserves key order but only works for JSON-like outputs. I personally prefer preserving key order and don't mind JSON-like outputs as long as EVA doesn't complain. Update: it doesn't work for eva_observations for some reason.

@Dooruk Dooruk linked an issue Jan 14, 2026 that may be closed by this pull request
@Dooruk Dooruk added the core development design related issues and improvements label Jan 15, 2026
mranst (Collaborator) previously approved these changes Jan 15, 2026 and left a comment

Nice, thanks for working on this. Did you find dumping in safe mode vs round-trip has an effect on key order? My impression was that it shouldn't make a difference, but it shouldn't hurt anything either

Dooruk (Collaborator, Author) commented Jan 15, 2026

> Nice, thanks for working on this. Did you find dumping in safe mode vs round-trip has an effect on key order? My impression was that it shouldn't make a difference, but it shouldn't hurt anything either

Yeah, safe mode operates more like PyYAML and reorders keys alphabetically. With this PR, rt mode should help with anchors too going forward, perhaps?

@Dooruk Dooruk requested a review from metdyn January 15, 2026 20:46
@Dooruk Dooruk marked this pull request as ready for review January 23, 2026 18:36
Dooruk (Collaborator, Author) commented Jan 23, 2026

This is ready for a review.

Some notes:

  • EVA does not work with rt mode.
  • For aliases, GPT suggested using yaml.representer.ignore_aliases = lambda *data: True, which writes out the aliased content explicitly rather than assigning id*** anchors. This might be an option for "Alternative handling of anchor variables" (#558).
  • Another switch I tested for safe mode with EVA is yaml.sort_base_mapping_type_on_output = False. This preserves key order but only works for JSON-like outputs (the safe mode default). I personally prefer preserving key order and don't mind JSON-like outputs as long as EVA doesn't complain. Update: it doesn't work for eva_observations for some reason.
  • An exception is made for observing_system_records.py, which uses "Flow Style" output. It improves readability and doesn't break the code.
  • The same exception is made for experiment.yaml outputs, to preserve comments.

mranst (Collaborator) left a comment

Looks good, and I see tier tests pass

Dooruk (Collaborator, Author) commented Jan 26, 2026

> Looks good, and I see tier tests pass

Thanks, I will give it a few more days to see if there are any more reviews, then we can merge.

Dooruk (Collaborator, Author) commented Jan 29, 2026

@mranst this is good to go in but I want to make sure it is not impacting #656 or #666 much. If it does, it can wait.

mranst (Collaborator) commented Jan 29, 2026

> @mranst this is good to go in but I want to make sure it is not impacting #656 or #666 much. If it does, it can wait.

Impact should be minimal, you can go ahead and merge 👍

@Dooruk Dooruk merged commit 133f49a into develop Jan 29, 2026
2 checks passed
@Dooruk Dooruk deleted the feature/ruamel_eva_r2d2 branch January 29, 2026 20:00

Labels

core development design related issues and improvements ready for merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Switch to ruamel for YAML handling

2 participants