
Conversation

@sgossage (Contributor) commented Dec 16, 2025

Testing a potential solution to reduce some of the RAM used during population synthesis.

Right now we load the entire data arrays for the initial and final values into RAM, allocating about 5 GB of memory in total. If we instead use lazy access (keeping a reference to the on-disk dataset rather than reading the whole array into memory), we can save some RAM: with lazy access we allocate only ~1.5 GB in total. I'm looking into further optimizations and still testing this one.

Also looking into the RAM usage of binary evolution, although @maxbriel has looked into this quite a bit already (see Issue #224).

@sgossage (Contributor Author) commented Dec 16, 2025

This is an example of the memory usage over time of a binary population run (BinaryPopulation.evolve() with 10 binaries) with the current main release (v2.2.2).

[figure: example_binpop_evolve10_nofix]

The initial spike is when we load our PSyGrids, and the gradual accumulation after that comes from binary evolution. The peak data consumption is about 6 GB.
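
For anyone wanting to reproduce a trace like this, one option is to sample the process memory while the population evolves. A minimal sketch, assuming psutil (the BinaryPopulation setup is omitted, and this is not necessarily how the plots here were made):

    import threading
    import time

    import psutil

    def sample_rss(samples, stop, interval=0.5):
        # Append the current process RSS (in GB) every `interval` seconds.
        proc = psutil.Process()
        while not stop.is_set():
            samples.append(proc.memory_info().rss / 1e9)
            time.sleep(interval)

    samples, stop = [], threading.Event()
    sampler = threading.Thread(target=sample_rss, args=(samples, stop))
    sampler.start()

    # pop = BinaryPopulation(...)   # placeholder: actual population setup not shown
    # pop.evolve()

    stop.set()
    sampler.join()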

@sgossage (Contributor Author) commented Dec 16, 2025

This is with lazy loading of the PSyGrids:

[figure: example_binpop_evolve10_fix]

The peak data consumption is about 3 GB, reducing total RAM usage by 50%; the RAM usage due to grid/step loading is reduced by ~70%. This does not seem to affect runtime, but I'm still checking. This brings RAM usage back down to roughly what was shown in, e.g., Issue #224.

The change made here is in PSyGrid, mainly rewriting:

        self.initial_values = hdf5['/grid/initial_values'][()]
        self.final_values = hdf5['/grid/final_values'][()]

as

        self.initial_values = hdf5['/grid/initial_values']
        self.final_values = hdf5['/grid/final_values']
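
For context, a minimal standalone sketch of the difference, assuming h5py (the file name is a placeholder). The key caveat of the lazy form is that the HDF5 file handle has to stay open for as long as the dataset reference is used, and data are only read from disk when the dataset is sliced:

    import h5py

    with h5py.File('psygrid.h5', 'r') as hdf5:   # placeholder file name
        # Eager: copies the full dataset into a NumPy array held in RAM.
        eager = hdf5['/grid/initial_values'][()]

        # Lazy: keeps a handle to the on-disk dataset; nothing is read yet.
        lazy = hdf5['/grid/initial_values']

        # Data are only pulled from disk when the dataset is sliced,
        # e.g. one row (one track's initial values) at a time.
        first_row = lazy[0]

    # Once the file is closed, `lazy` can no longer be read, so the grid
    # object has to keep its HDF5 file open for as long as it is used.

The trade-off is that RAM is only paid for the rows actually accessed, at the cost of extra disk reads during evolution.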

@sgossage sgossage added the enhancement New feature or request label Dec 16, 2025
@sgossage sgossage self-assigned this Dec 16, 2025
…hens run time due to I/O overhead, so not worth it yet
@sgossage sgossage force-pushed the sg_fix_grid_data_ram_usage branch from dc52455 to 53520fc Compare December 16, 2025 21:44
pre-commit-ci bot and others added 3 commits December 16, 2025 21:44
…nt lengthens run time due to I/O overhead, so not worth it yet"

This reverts commit 53520fc.
@sgossage (Contributor Author) commented Dec 17, 2025

Every time step_detached is entered, the root matrices are recalculated and any scalers that don't exist yet are trained. We could precalculate these objects so that this doesn't happen during binary evolution. This is not a huge issue, but something to consider.

Also, we create a TrackMatcher object for every instance of step_detached (we have 5: step_detached itself, step_dco, step_merged, etc.), plus another for step_CE. Instead, we could create a single TrackMatcher object and reuse it, as sketched below.
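
A rough sketch of the reuse pattern (class internals and constructor arguments here are placeholders, not the actual POSYDON API):

    # Hypothetical sketch of the "build once, share" pattern.
    class TrackMatcher:
        def __init__(self, matching_list):
            self.matching_list = matching_list   # expensive setup done once

    class DetachedStep:
        def __init__(self, matcher):
            self.matcher = matcher               # reuse, don't reconstruct

    shared_matcher = TrackMatcher(matching_list=['default'])
    steps = [DetachedStep(shared_matcher) for _ in range(5)]
    assert all(step.matcher is shared_matcher for step in steps)
    # step_CE uses a different matching list, so it may still need its own
    # matcher (or a matcher parameterized per step).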

I'm testing making TrackMatcher a simulation property, the idea being that it could be reused globally by the steps that need it. There's an issue with, e.g., step_CE using a different matching list than the others, which still needs to be sorted out. Here is a quick look at performance with the above implemented:

[figure: example_with_preloaded_training_and_root0calc]

We now get the large increase in RAM at the start, when we create the various objects the steps need. Then, during evolution, we get a gradual rise mostly from the binaries being added to RAM. We save a bit on runtime (mostly on the step-loading part now) by preloading and reusing these objects -- e.g., creating just one TrackMatcher and its associated GRIDInterpolator objects instead of doing that for every instance of step_detached.

For 1000 binaries:

[figure: example_with_preloaded_training_and_root0calc_1000bin]

and this is the default (current main, v2.2.2) behavior with 1000 binaries for comparison:

[figure: example_default_nofix_1000bin]

sgossage and others added 2 commits December 16, 2025 23:59
…kMatcher. Also trying a TrackMatcher that is accessible by all evolution steps. Not done correctly yet, just for testing.
@maxbriel (Collaborator) commented Dec 18, 2025

Great work here! Identifying the culprits is not easy.

I really like the TrackMatcher being limited to 1. This seems to me to be a major culprit for the higher memory usage, since every single-star model, after it's read in once, is stored in memory. This would increase the load significantly with one TrackMatcher.

For the initial and final values, I am a bit worried about the I/O component. For detached evolution, reading in a single star model from disk is the slowest part on the HPC facility. I am worried that switching to the memory reference will make this a larger issue.

I would pose more general questions:

  1. Do we care about speed or about memory usage? If we read in everything from disk, that will be our bottleneck; if we keep things in memory, then that will be.
  2. We should figure out whether we want this gradual read-in of the single-star models. While it speeds up processing over time, constant I/O is slower than one big I/O read. And it feels dangerous not knowing exactly how much RAM your population run will use.

@sgossage (Contributor Author) commented:

Thanks @maxbriel, I'll be doing some testing on HPC. At today's dev meeting we also discussed preloading the single-star grid tracks that are used during detached evolution before binary evolution starts, which I will also look into. This may help save on I/O overhead during the detached step.
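
A rough sketch of what preloading could look like, assuming h5py and a hypothetical layout where each single-star track is a dataset under one group (paths and names are placeholders):

    import h5py

    def preload_tracks(path, group='/grid'):
        # Read every dataset under `group` into memory up front so that the
        # detached step never touches the HDF5 file during binary evolution.
        tracks = {}
        with h5py.File(path, 'r') as f:
            for name, obj in f[group].items():
                if isinstance(obj, h5py.Dataset):
                    tracks[name] = obj[()]   # one big sequential read per dataset
        return tracks

    # cache = preload_tracks('single_star_grid.h5')   # placeholder file name

This trades an up-front RAM cost for fewer, larger reads, which is the trade-off raised above.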
