Skip to content

Comments

Optional quantities#604

Open
steven-murray wants to merge 6 commits intomainfrom
optional-quantities
Open

Optional quantities#604
steven-murray wants to merge 6 commits intomainfrom
optional-quantities

Conversation

@steven-murray
Copy link
Member

This PR adds the general ability for us to have "optional quantities" that can be output for the user, but otherwise are not required. For starters, I've added the three easiest of these (mean free path, kinetic_temp_neutral and kinetic_temperature).

This adds a new InputStruct, but since it is a simple one, old boxes and templates should all read in just fine.

@steven-murray steven-murray self-assigned this Feb 13, 2026
@steven-murray steven-murray added context: python wrapper Changes predominantly concerning the python wrapper context: C backend Changes occur predominantly in the C code type: performance: memory Performance improvements that reduce memory usage priority: medium Medium priority type: feature: ui New feature that adds functionality for the user labels Feb 13, 2026
@codecov
Copy link

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 80.76923% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.08%. Comparing base (ed15156) to head (57cb171).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/py21cmfast/wrapper/outputs.py 66.66% 0 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #604      +/-   ##
==========================================
- Coverage   88.13%   88.08%   -0.06%     
==========================================
  Files          32       32              
  Lines        4729     4749      +20     
  Branches      796      802       +6     
==========================================
+ Hits         4168     4183      +15     
  Misses        401      401              
- Partials      160      165       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

HII_D_PARA};

unsigned long long int idx;
if (optional_quantities_global->kinetic_temperature) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that the sole purpose of set_ionized_temperatures is to determine the kinetic_temperature, so it is best to put this condition when set_ionized_temperatures is called (from ComputeIonizedBox) rather than to place this condition in set_ionized_temperatures.

growth = get_growth_factor(inputs=inputs, redshift=redshift)
return get_delta_crit_nu(inputs.matter_options.cdict["HMF"], sigma, growth)
return get_delta_crit_nu(
inputs.matter_options.cdict["HMF"], float(sigma[0]), growth
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change was made?



@attrs.define(kw_only=True, frozen=True)
class OptionalQuantities(InputStruct):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really need to be a sub-class of InputStruct?

Copy link
Contributor

@jordanflitter jordanflitter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @steven-murray ! I made some minor comments/questions. My major comments/questions are the following:

  1. I understand the overall logic, but I wish, if it’s possible, to include the optional quantities in a more elegant way. For example, in the (very near) future, I’d like to include MANY optional quantities from SpinTemperatureBox.c (e.g. x_alpha, Lya flux, tau_GP, heating/cooling rates, essentially all the physical quantities that the code computes), as it will be very illuminating to see how all of these IGM’s properties evolve over time (with run_global_evolution). If we keep specifying manually each possible optional quantities, we will end up with a struct that contains A LOT of fields, where most of them will be discarded (unless we run_global_evolution). So what I have in my mind is that run_coeval will have a quantities argument (similarly as run_lightcone). Then, based on the user’s specified quantities + required quantities from each output struct, we will have a list of all the quantitates that we do want to keep. There’s a different question now of how to pass that list elegantly to the backend. One idea that I have from the top of my had is to send the backend a long string that contains all the desired fields, separated by a dash (or any other separator you have in mind), e.g. “neutral_fraction-brightness_temp”. Then the C code could easily understand if a given quantity should be stored as an output or not. Similarly, the wrapper will understand if it needs to allocate memory for that quantity. What I’m suggesting here might be an overkill for this preliminary PR, especially when considering its urgency, but maybe something between the current implementation and this idea could be done.
  2. I don’t understand why kinetic_temp_neutral from TsBox is an optional quantity. If USE_TS_FLUCT=True, that quantity is absolutely necessary for the evaluation of the spin temperature. Otherwise, if USE_TS_FLUCT=False, no TsBoxes are made, and the C code in IonisationBox.c already knows what to do when kinetic_temp_neutral is not available.

@steven-murray
Copy link
Member Author

Thanks @jordanflitter for your comments. I think it would be wise, given your comments, to back off on this PR until we have a better idea of how it should work.

In particular, the benefits of having the optional boxes be specified as an InputStruct are:

  • These parameters automatically get written into the file cache and hashes, so if someone runs run_coeval (for example) without mean_free_path and later wants to compute mean_free_path, a new box will be created automatically.
  • The inputs already get passed on through to the OutputStruct class which handles initialization of the Array attributes. This made it easy (in my implementation) to only optionally allocate these arrays.

OTOH, the benefits of having the boxes be specified directly to run_coeval and run_lightcone are:

  • This is much more familiar feeling to the user (it doesn't seem like specifying which quantities are output should affect the simulations themselves, or be encoded into the "definition" of the simulations.
  • It is also more similar to how we are already handling which quantities to convert into lightcones (within the Lightconer), and avoids there being two places to define "lists of quantities" that you want to "get out" of the simulation.

I think out of these considerations, one that is an absolute must is that users should not be able to get into the situation in which they run a simulation, and then later decide they actually want mean_free_path, and when they re-run the simulation it just silently ignores their request by reading in from the cache.

This situation is handled "for free" if we specify the optional quantities as InputStructs. To handle this situation if these were instead passed as lists of strings (or similar) to run_coeval/lightcone, I can think of the following solutions:

  1. Have a function (called within run_coeval) that checks the cache against the list of strings and simply errors if there exists a cached item of the same parameters but without the quantity desired. This error could just tell the user to use a different cache directory, or set regenerate=True. However this is suboptimal because it would require also regenerating ICs and PFs which shouldn't need to be regenerated. It's also a little obnoxious.
  2. We could secretly have the OptionalQuantities as an InputStruct, and just overwrite this silently when running run_coeval and run_lightcone etc. So the "real" inputs used aren't exactly what the user though they input, but we just never advertise the fact that these are in the InputParameters struct. This is probably better than (1) but we have to be mindful that the user should also be able to specify these optional quantities to single-field functions, which would then have to go and do their own updating of inputs as well, and this might lead to weird cache inconsistencies... maybe not though.
  3. Maybe a better option would be to separate all "optional" quantities out from their OutputStruct and big top-level C-functions entirely. For example, make mean_free_path its own small function that takes IonizedBox as an input. These could then be computed in a loop at the end of the run_coeval main loop, based on the user's input list of optional quantities to run_coeval. If the user ran the simulation again with the same formal InputParameters but changed the optional quantities, this loop would just re-run and update the data with the new quantities.

Actually, I think (3) is the clear winner.... what do you think @jordanflitter ?

@jordanflitter
Copy link
Contributor

Thanks for the very impressive comment ;)

So I totally agree that (2) is better than (1), for several reasons: obnoxious implementation of for the developers, telling the users to switch cache directory (pretty annoying).

My problem with (3) is that while currently it can work, let's say with fields like mean_free_path that could be post-processed, it won't be able to work on the one hand with fields that are not necessary for the output (for the evaluation of the brightness temperature ultimately), but are necessary for the evaluation of fields that do go to the output (e.g. x_alpha, J_alpha, and everything else that ComputeTsBox computes, they are all required for computing the spin temperature). So these fields cannot be post-processed, like mean_free_path, since they are indispensable part of the calculation.

I therefore vote for (2). In my opinion, I think it's perfectly fine that optional_quantities would be a "secret" field of inputs. After all, we already have a similar secret field with cosmo_tables.

After thinking about this a little bit more, maybe we can have a "ghost" inputs in run_coeval/lightcone (and in any of our single_field functions), that will take the user's inputs, and attrs.evolve the optional_quantities field according to the corresponding quantities argument that the user provides to the function (if quantities=None, no attrs.evolve will be made, and the ghost inputs will be the exact same as the user's inputs). Then, the hash table for the cache will include optional_quantitites (from the ghost inputs!), so if a new optional quantity had been added by the user, the C-function will be executed.

@steven-murray
Copy link
Member Author

@jordanflitter I am not sure I understand your reservations about (3).

I think the point of "optional" quantities is that they are optional -- anything indispensable for the rest of the calculations must be computed as normal. Only these end-results that don't contribute to anything else should be treated in this way.

There is I suppose the potential for there to be quantities that are required to compute some other optional quantity only. In this case, they are required if the user wants the downstream optional quantity, but otherwise could be considered their own optional quantity. Maybe this is what you were getting at. I'm not quite sure how to deal with this.

I also don't particularly like (2). It's better than (1), but it doesn't feel right to switch out the user's inputs from under them. Also, right now the only optional things are in the astro boxes, which means we only have to invalidate that cache (and can reuse the ICs and PFs even when the optional quantities are changed). However, in the future it's possible we'll also have density-field optional quantities (?) and then we'll have to think of some API that's a bit more modular.

The other reason I don't particularly like (2) is that re-running just to get a couple of extra optional quantities seems... wsateful. And also the whole new box, with all the same quantities (plus a couple of extra) will be re-written to disk, instead of just adding the new quantities to the existing file (or their own file).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

context: C backend Changes occur predominantly in the C code context: python wrapper Changes predominantly concerning the python wrapper priority: medium Medium priority type: feature: ui New feature that adds functionality for the user type: performance: memory Performance improvements that reduce memory usage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants