Switch to micromamba to fix CircleCI timeout #757

nate-kean · 2025-08-14T20:25:56Z

Description

Fixes two problems that were preventing the CircleCI runners from passing/were causing them to time out:

Problem: A mysterious configuration issue began occurring while trying to run mamba commands. This happened from no change on our end, and appears to be caused by an update to MiniForge mamba.
Solution: Switched to micromamba. This turns out to be the preferred conda/mamba flavor for automated runners because it is extremely simple and downloads and installs very fast. It also does not suffer from the problem that we were experiencing with MiniForge mamba. We continue to use only the conda-forge channel, as micromamba comes preconfigured with just it.
Problem: The "Create RAiDER environment" step was attempting to activate the RAiDER environment before it had been created, through the profile.d script.
Solution: Delayed adding the line to activate the RAiDER environment to after it is created, so the runner only starts to attempt to activate it in the steps after when it has been created.

Motivation and Context

Fixes the problem that has been occurring in our CI builds lately that has caused them all to fail or time out. After this is merged, we will want to re-run the CI tests for recent pull requests that were failing.

How Has This Been Tested?

The ci/circleci checks in this PR are now passing.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

I have added an explanation of what your changes do and why you'd like us to include them.
I have written new tests for your core changes, as applicable.
I have successfully ran tests with your changes locally.
My change requires a change to the documentation.
I have updated the documentation accordingly.

jlmaurer · 2025-08-19T15:43:49Z

@nate-kean thanks for this!
@jhkennedy @dbekaert do you see any issues with pinning major versions? The idea is to speed up package resolving and prevent breaking changes from coming into RAiDER without being able to address them first.

jhkennedy · 2025-08-19T18:38:14Z

do you see any issues with pinning major versions?

@jlmaurer yes, pinning major version guarantees RAiDER will break in the future, and is generally considered a bad practice, e.g., https://iscinumpy.dev/post/bound-version-constraints

And pinning the development environment here, but not the package dependencies, will mean you won't follow the typical user install path and miss dependency issues that users will hit.

Generally, if you want a known working environment, tools like conda-lock and pixi are designed to give you a locked environment you can use for that. You should still have an unlocked testing workflow so you hit user issues in CI though.

Unfotunately, there isn't a good way to automatically open PRs to update dependencies in conda/pixi lock files like dependabot does, which would be the right way to provide a locked environment and ensure you're seeing dependency issues as they appear, but there is a workaround for pixi:
https://pixi.sh/latest/integration/ci/updates_github_actions/

It's fine short term to add major upper-bounds to the dev environment, but you it should be a very short term. Ideally, in order of importance:

pyproject.toml and, if provided, environment.yml would list dependencies and only exclude known incompatibilities. With core conda dependencies, pixi is the best way to provide this
a lockfile that provides a known working environment
a CI tool that opens PRs to test new dependency and upadate the lock file
CI that regularly tests the minimum bounds of dependencies to make sure check the lower bounds and find new incompatibilities

nate-kean · 2025-08-20T22:01:35Z

Wow, you bring up a lot of good points I didn't consider. You have convinced me, I shouldn't have added these upper bounds where we don't need them. Thanks for your input, @jhkennedy, I will remove the upper bounds I added.

micromamba is optimized for simplicity, and evidently, the configuration error that was causing our builds to fail with mamba does not occur with micromamba. As an added bonus, micromamba downloads and installs about six seconds faster.

Slightly improves stability by not relying on the presence of the environment.yml file or for it to be in a specific format. conda has this functionality built in as part of the `env create` command.

This puts a lower and upper bound on all dependencies that follow the Semver versioning convention. All packages are now bound to their current major version. Exceptions include numpy where we support both v1 and v2, and any other package that already had a specific minor/patch version lower bound. Benefits: - Faster "Create RAiDER environment" step because the number of possibilities the conda/mamba solver has to consider is reduced - Lower RAM use for the same reason. This allows CircleCI to use the Docker/Medium resource class instead of Docker/Large more often. - Improved long-term stability for RAiDER: capping the major version means RAiDER won't upgrade to an untested dependency version that it turns out to be incompatible with. Capping the minor version keeps RAiDER from ever installing a package version so old that it is incompatible. Semver definition: https://semver.org/

This reverts commit 7435ff5. Reapply "Optimization and stability: Add version bounds" This reverts commit 1b8fa95. Remove new upper bounds, keep new lower bounds

jlmaurer · 2025-08-22T12:51:25Z

@nate-kean I think you can rebase on dev and hopefully that last test will pass. I think #759 and/or #760 updated the test_GUNW test.

nate-kean · 2025-08-22T15:05:36Z

Yes, I have done that. But now the runners are taking over an hour again. I'm starting to doubt whether my changes made any difference...

nate-kean · 2025-08-22T15:15:35Z

New upper bounds removed. I also had to make a couple fixes to tests to make all the tests pass. The tests in question even failed for me on dev, so I suspect it may have been upstream changes in dependencies which only started applying to me after I had reinstalled/updated my environment for this PR.

But now the CircleCI runners are taking over an hour to solve their environment again sometimes. It's making me wonder if these bouts of slowdowns are even being caused by us, especially since they have historically started after changes that shouldn't affect environment creation at all, and "solved" themselves after just trying again in a week. I might split these test fixes into their own pull request and close this one, because it doesn't appear to have really fixed the issue.

I am considering a new approach now: add a lock file (conda/conda-lock) to the project, and the dependency solutions for each Python version we test will be computed ahead of time and no longer have to be done by the CircleCI runners.

nate-kean · 2025-08-23T21:40:57Z

Closing; moved progress to #762

nate-kean added a commit to nate-kean/RAiDER that referenced this pull request Aug 14, 2025

Update CHANGELOG.md for dbekaert#757

d35a341

nate-kean added a commit to nate-kean/RAiDER that referenced this pull request Aug 14, 2025

Update CHANGELOG.md for dbekaert#757

3806194

nate-kean force-pushed the fix-circle-ci-timeout branch from d35a341 to 3806194 Compare August 14, 2025 20:42

nate-kean marked this pull request as ready for review August 14, 2025 21:24

nate-kean and others added 14 commits August 21, 2025 22:03

Fix: Do not attempt to activate env before its creation

11213b1

Switch from MiniForge mamba to micromamba

dd6d016

micromamba is optimized for simplicity, and evidently, the configuration error that was causing our builds to fail with mamba does not occur with micromamba. As an added bonus, micromamba downloads and installs about six seconds faster.

Add comment

35c7015

Nitpick: Use built-in method for pinning env to specific Python ver

3e59a5c

Slightly improves stability by not relying on the presence of the environment.yml file or for it to be in a specific format. conda has this functionality built in as part of the `env create` command.

Update CHANGELOG.md for dbekaert#757

79b79cd

Revert "Optimization and stability: Add version bounds"

326c28f

This reverts commit 7435ff5. Reapply "Optimization and stability: Add version bounds" This reverts commit 1b8fa95. Remove new upper bounds, keep new lower bounds

Fix test_temporal_interpolate.py: use xarray for MERRA2._fetch()

902c5c7

Fix test_GUNW_hyp3_metadata_update: fix invalid gunw_schema.json

4d269ee

add in get_tag_set to aws.py

007bd25

switch to accept only Path

5548801

fix path in test

41c4a41

fix patch side effect

ac1ef48

fix 135

2a30200

nate-kean force-pushed the fix-circle-ci-timeout branch from 1665a91 to 2a30200 Compare August 22, 2025 03:05

Merge branch 'dev' into fix-circle-ci-timeout

e3dfb7d

This was referenced Aug 22, 2025

Add Python 3.13 compatibility #761

Merged

Add lockfiles to RAiDER #762

Merged

nate-kean closed this Aug 23, 2025

nate-kean deleted the fix-circle-ci-timeout branch August 24, 2025 02:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch to micromamba to fix CircleCI timeout #757

Switch to micromamba to fix CircleCI timeout #757

Uh oh!

nate-kean commented Aug 14, 2025 •

edited

Loading

Uh oh!

jlmaurer commented Aug 19, 2025

Uh oh!

jhkennedy commented Aug 19, 2025 •

edited

Loading

Uh oh!

nate-kean commented Aug 20, 2025

Uh oh!

jlmaurer commented Aug 22, 2025

Uh oh!

nate-kean commented Aug 22, 2025

Uh oh!

nate-kean commented Aug 22, 2025 •

edited

Loading

Uh oh!

nate-kean commented Aug 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Switch to micromamba to fix CircleCI timeout #757

Switch to micromamba to fix CircleCI timeout #757

Uh oh!

Conversation

nate-kean commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

How Has This Been Tested?

Type of change

Checklist:

Uh oh!

jlmaurer commented Aug 19, 2025

Uh oh!

jhkennedy commented Aug 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nate-kean commented Aug 20, 2025

Uh oh!

jlmaurer commented Aug 22, 2025

Uh oh!

nate-kean commented Aug 22, 2025

Uh oh!

nate-kean commented Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nate-kean commented Aug 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

nate-kean commented Aug 14, 2025 •

edited

Loading

jhkennedy commented Aug 19, 2025 •

edited

Loading

nate-kean commented Aug 22, 2025 •

edited

Loading