-
Notifications
You must be signed in to change notification settings - Fork 100
953 stochastic noise #2275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
953 stochastic noise #2275
Conversation
|
@maxwhitemet, could you merge/rebase master into your branch. The failing "CI Tests / Test-Coverage (pull_request)" should hopefully be addressed by the now merged #2273
There does appear to be a new issue potentially with the improver documentation building now though... (different issue) |
84d3c1b to
897e0cb
Compare
|
Ignore the test coverage failure. Fixed in #2282 |
gavinevans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maxwhitemet 👍
I've added some comments below.
envs/environment_b.yml
Outdated
| - geopandas | ||
| - lightgbm>=4.0.0 | ||
| - numba | ||
| - pysteps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to say, we shouldn't be including pysteps in this environment. If this has been done to avoid unit test failures, then we just need to handle the potential lack of availability of pysteps in the environment by skipping the unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
envs/environment_b.yml
Outdated
| - geopandas | ||
| - lightgbm>=4.0.0 | ||
| - numba | ||
| - pysteps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the docstring at the top of the file to record that the stochastic noise CLI isn't supported by this environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. Also added a couple full stops.
improver/cli/stochastic_noise.py
Outdated
| # | ||
| # This file is part of 'IMPROVER' and is released under the BSD 3-Clause license. | ||
| # See LICENSE in the root of the repository for full licensing details. | ||
| """CLI to add stochastic noise to a dependence template cube using SSFT.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to spell out SSFT here, as it's not a sufficiently well known concept i.e. Short-Space Fourier Transform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
improver/cli/stochastic_noise.py
Outdated
| threshold_units: str = "mm/hr", | ||
| ): | ||
| """ | ||
| Add stochastic noise to a dependence template cube object using SSFT. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please spell out SSFT: Short-Space Fourier Transform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
envs/conda-forge.yml
Outdated
| - scipy=1.15 | ||
| - sphinx<9.0.0 # https://github.com/tox-dev/sphinx-autodoc-typehints/issues/586 | ||
| - pandas | ||
| - pysteps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed, as optional dependencies, like pysteps are intended to be included in this environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
|
|
||
| num_workers = min(len(template.coord("realization").points), os.cpu_count() - 1) | ||
|
|
||
| results = compute(*delay_results, scheduler="threads", num_workers=num_workers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to note, I don't think that we've used dask like this in the improver codebase before, although I think this usage makes sense. This might be something to discuss.
| # Set noise to zero where dependence template data is below threshold | ||
| # to prevent spurious wet pixels | ||
| dry_mask = template.data < self._convert_threshold_units(template) | ||
| noise.data[dry_mask] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that this is different to the intended implementation with the intended implementation deliberately making the stochastic noise negative, so that it only has an impact for grid points where there is no precipitation in the templates. It would be good to discuss this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the change suggested, though made the scaling optional via a scale_subthreshold_noise parameter.
| [[3.5848932, 4.995262], [2.2589254, 6.5118866]], | ||
| [[3.859587, 5.289296], [2.5182567, 6.830268]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a slight misunderstanding here, as the intended outputs here should just be the inputs (as none of the grid points in the input is 0 mm/hr).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified the plugin and tests following our conversations on my misunderstanding.
| assert result.data.dtype == np.float32 | ||
|
|
||
| # Verify noise was added (output differs from input) | ||
| assert not np.allclose(result.data, simple_cube.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably you should check what the numbers are here, rather than what they are not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjusted.
|
|
||
| return output_cube | ||
|
|
||
| def process(self, dependence_template: Cube) -> Cube: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the acceptance tests have actually highlighted an issue where the units of the Cube being passed in, aren't being preserved, so although the metadata in the Cube says "m/s", the values are actually mm/hr. This should be corrected, so that the values put into the output cube are in the correct units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is now fixed. Thank you for pointing this out.
maxwhitemet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review @gavinevans. I have now made the requested changes.
envs/conda-forge.yml
Outdated
| - scipy=1.15 | ||
| - sphinx<9.0.0 # https://github.com/tox-dev/sphinx-autodoc-typehints/issues/586 | ||
| - pandas | ||
| - pysteps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
envs/environment_b.yml
Outdated
| - geopandas | ||
| - lightgbm>=4.0.0 | ||
| - numba | ||
| - pysteps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
envs/environment_b.yml
Outdated
| - geopandas | ||
| - lightgbm>=4.0.0 | ||
| - numba | ||
| - pysteps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. Also added a couple full stops.
improver/cli/stochastic_noise.py
Outdated
| # | ||
| # This file is part of 'IMPROVER' and is released under the BSD 3-Clause license. | ||
| # See LICENSE in the root of the repository for full licensing details. | ||
| """CLI to add stochastic noise to a dependence template cube using SSFT.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
improver/cli/stochastic_noise.py
Outdated
| threshold_units: str = "mm/hr", | ||
| ): | ||
| """ | ||
| Add stochastic noise to a dependence template cube object using SSFT. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
| delayed(self.do_fft)(realization_data.astype(np.float32)) | ||
| ) | ||
|
|
||
| num_workers = min(len(template.coord("realization").points), os.cpu_count() - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
|
|
||
| return output_cube | ||
|
|
||
| def process(self, dependence_template: Cube) -> Cube: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is now fixed. Thank you for pointing this out.
| # Set noise to zero where dependence template data is below threshold | ||
| # to prevent spurious wet pixels | ||
| dry_mask = template.data < self._convert_threshold_units(template) | ||
| noise.data[dry_mask] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the change suggested, though made the scaling optional via a scale_subthreshold_noise parameter.
| [[3.5848932, 4.995262], [2.2589254, 6.5118866]], | ||
| [[3.859587, 5.289296], [2.5182567, 6.830268]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified the plugin and tests following our conversations on my misunderstanding.
| assert result.data.dtype == np.float32 | ||
|
|
||
| # Verify noise was added (output differs from input) | ||
| assert not np.allclose(result.data, simple_cube.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjusted.

Addresses #953.
This PR adds the plugin, CLI, and tests for stochastic noise generation.
The acceptance tests show the impact, with data available here.

Testing: