
Conversation

Contributor @claire-simpson commented Dec 10, 2025

Description

Implements LULC-based option (scenario 2) in the Urban Mental Health model and unifies NDVI preprocessing across scenarios.

Adds workflow for LULC reclassification to NDVI:

  • build the LULC-to-NDVI mapping from the attribute table, or, if no ndvi column is present, calculate mean NDVI (from ndvi_base) by lucode
  • auto-masking: map excluded LULC classes to NODATA
  • write lulc_to_ndvi_map.csv to store the final mapping used
  • reclassify LULC to mean NDVI
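The reclassification step above can be sketched with plain numpy (the model itself would use a raster-based reclassify such as pygeoprocessing's; the class codes, NDVI values, and nodata value here are invented for illustration):

```python
import numpy

FLOAT32_NODATA = float(numpy.finfo(numpy.float32).min)  # assumed nodata value

# Hypothetical LULC-to-NDVI value map; excluded class 7 maps straight to nodata
value_map = {1: 0.2, 2: 0.5, 7: FLOAT32_NODATA}

lulc = numpy.array([[1, 2], [7, 2]])
# Reclassify each pixel through the mapping
ndvi = numpy.vectorize(value_map.get)(lulc).astype(numpy.float32)
```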

Other changes:

  • Refactor preprocessing so LULC and NDVI scenarios share a common align --> mask (and reclassify if lulc) --> buffer-mean --> delta_NDVI pipeline.
  • Raise a validation error for option 2 when the lulc_attr_csv has no ndvi column and base_ndvi is also not provided
  • Add unit tests

Fixes #2142

Checklist

  • Updated HISTORY.rst and link to any relevant issue (if these changes are user-facing)
  • Updated the user's guide (if needed)
  • Tested the Workbench UI (if relevant)

task_name="calculate delta ndvi" # change in nature exposure
buffer_alt_dependencies.append(mask_alt_ndvi_task)

kernel_task = task_graph.add_task(
Contributor Author (@claire-simpson):

I'd recommend hiding whitespace when reviewing this, as most (all?) of the changes between kernel_task and zonal stats are just unindenting

@claire-simpson marked this pull request as ready for review December 12, 2025 22:48
Member @emilyanndavis left a comment:

Looks good!

id="lulc_to_ndvi_csv",
path="intermediate/lulc_to_ndvi_map.csv",
about=gettext(
"Table giving mean NDVI by LULC codes, with excluded LULC "
Member:

Suggested change:
- "Table giving mean NDVI by LULC codes, with excluded LULC "
+ "Table giving mean NDVI by LULC code, with excluded LULC "

Comment on lines 1079 to 1088
unique_lucodes, inverse_indices = numpy.unique(
    masked_lulc.astype(numpy.int64), return_inverse=True)

sums = numpy.bincount(inverse_indices,
                      weights=masked_ndvi.astype(numpy.float32))
counts = numpy.bincount(inverse_indices)
means = sums / counts

mean_ndvi_by_lulc_dict = {lucode: mean for lucode, mean in zip(
    unique_lucodes, means)}
Member:

Wow, very clever solution! I've never used bincount before, so I had to go to the docs and puzzle over this for a bit before I could grasp exactly what's going on here. Not sure if it would be worth adding a comment to help explain it up front? On the one hand, a gloss might have helped me (though I probably would have looked it up in the docs as well); on the other hand, it already seems pretty clear for people who (unlike me) are fluent in numpy. Up to you. Just thought I'd mention it on the off-chance you were already considering adding a comment here.

Contributor Author (@claire-simpson):

Yes, great idea! It is definitely not completely intuitive so I'll add several comments
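For anyone else puzzling over it, the grouping trick can be illustrated on a tiny made-up array:

```python
import numpy

lulc = numpy.array([1, 2, 2, 5])
ndvi = numpy.array([0.1, 0.4, 0.6, 0.8])

# return_inverse maps each pixel to its index in the unique-code array:
# unique_lucodes = [1, 2, 5], inverse_indices = [0, 1, 1, 2]
unique_lucodes, inverse_indices = numpy.unique(lulc, return_inverse=True)

# bincount with weights sums NDVI per group; without weights it counts pixels
sums = numpy.bincount(inverse_indices, weights=ndvi)  # [0.1, 1.0, 0.8]
counts = numpy.bincount(inverse_indices)              # [1, 2, 1]
means = sums / counts                                 # [0.1, 0.5, 0.8]
```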

Member @dcdenu4 left a comment:

Hey @claire-simpson , I didn't make it all the way through but wanted to provide some initial feedback in a timely manner!

for input_raster, resample_method in raster_to_method_dict.items():
    if args[input_raster]:
        input_align_list.append(args[input_raster])
        output_align_list.append(file_registry[input_raster+'_aligned'])
Member:

We've generally adopted f-strings for string concatenation. I think it's technically "faster" and can be more readable than using the plus operator.
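For example (hypothetical key, mirroring the snippet above):

```python
input_raster = 'ndvi_base'  # hypothetical registry key for illustration

# plus-operator concatenation, as in the current code
key_plus = input_raster + '_aligned'

# f-string equivalent
key_f = f'{input_raster}_aligned'
```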

args=(args['lulc_attr_csv'],
      file_registry['lulc_to_ndvi_csv'],
      file_registry['lulc_base_aligned'],  # Note: won't get used if `ndvi` column in `lulc_attr_csv`
      file_registry['ndvi_base_aligned']  # TODO: will this get set to None if user doesn't input ndvi_base?
Member:

Is this still a TODO to work through for this PR or later?

Contributor Author (@claire-simpson):

Oops no, I just forgot to delete this. I determined that for args, if the user doesn't enter an optional input, it gets set to None, but the value of file_registry[key] is the file path regardless of whether it's actually used/created (so, not set to None).

for lu, ndvi, exclude in zip(codes, ndvi_means, excludes):
    if bool(exclude):
        value_map[lu] = FLOAT32_NODATA
    elif numpy.isfinite(lu):
Member:

What happens if this fails and there is a positive infinity or NaN value? Given the ModelSpec validation for this should we be worried about it here?

Contributor Author (@claire-simpson):

This is probably unnecessary. I think I added the isfinite check because I was finding that pandas was reading in an empty line at the end of the LULC CSV, but I couldn't figure out why at the time, and now I can't reproduce the behavior.
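If it ever resurfaces, a minimal reproduction and guard might look like this (the CSV content is invented; note that a trailing line containing only a comma, unlike a truly blank line, is not skipped by read_csv):

```python
import io
import pandas

# A trailing line of just "," parses as an all-NaN row
csv_text = "lucode,ndvi\n1,0.2\n2,0.5\n,\n"
df = pandas.read_csv(io.StringIO(csv_text))
assert len(df) == 3  # the all-NaN row is present

# Dropping all-NaN rows guards against it without a per-value isfinite check
df = df.dropna(how='all')
```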

elif base_ndvi_path:
    LOGGER.info("Using NDVI raster to calculate mean NDVI by LULC class.")
    lulc_dict = {lu: ex for lu, ex in zip(codes, excludes)
                 if numpy.isfinite(lu)}
Member:

This is used somewhere else too, but is it necessary to check numpy.isfinite here?

Contributor Author (@claire-simpson):

Same as above - I think I added the isfinite check because I was finding that pandas was reading in an empty line at the end of the lulc csv.

Dict containing the mean NDVI values for each LULC class
"""
# TODO: iterblocks?
Member:

Yep, we'll want to handle this a little differently to avoid loading the entire raster into memory with raster_to_numpy_array. A common way is using iterblocks and keeping a running average as you iterate. Habitat Quality does something like this where it counts the number of unique pixel values. But it could be altered to keep a running average too.

def _raster_pixel_count(raster_path_band):
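A sketch of the running-accumulation idea, with a plain iterable of in-memory (lulc, ndvi) block pairs standing in for what pygeoprocessing.iterblocks would yield (nodata handling omitted for brevity):

```python
import numpy

def mean_ndvi_by_lucode(blocks):
    """Accumulate running per-lucode sums/counts across raster blocks.

    `blocks` is any iterable of (lulc_block, ndvi_block) array pairs.
    """
    sums, counts = {}, {}
    for lulc_block, ndvi_block in blocks:
        unique_lucodes, inverse = numpy.unique(
            lulc_block, return_inverse=True)
        # Per-block grouped sums/counts via the bincount trick
        block_sums = numpy.bincount(
            inverse.ravel(), weights=ndvi_block.ravel())
        block_counts = numpy.bincount(inverse.ravel())
        # Fold this block into the running totals
        for lucode, block_sum, count in zip(
                unique_lucodes, block_sums, block_counts):
            sums[lucode] = sums.get(lucode, 0.0) + block_sum
            counts[lucode] = counts.get(lucode, 0) + count
    return {lucode: sums[lucode] / counts[lucode] for lucode in sums}
```

In the model this loop would be driven by pygeoprocessing.iterblocks over the aligned rasters, so only one block is ever in memory at a time.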

Contributor Author (@claire-simpson):

Ok thanks I've implemented this!

data_type=int,
units=None,
required="scenario=='lulc'",
# Allow lulc_alt for masking if using scenario 3
Member:

Are the scenarios numbered or would it be helpful to use the name descriptor instead?

Member @dcdenu4 left a comment:

Hey @claire-simpson ! I took another pass and had some comments and suggestions.

func=build_lulc_ndvi_table,
args=(args['lulc_attr_csv'],
      file_registry['lulc_to_ndvi_csv'],
      file_registry['lulc_base_aligned'], # Note: won't get used if `ndvi` column in `lulc_attr_csv``
Member:

Suggested change:
- file_registry['lulc_base_aligned'], # Note: won't get used if `ndvi` column in `lulc_attr_csv``
+ file_registry['lulc_base_aligned'], # Note: won't get used if `ndvi` column in `lulc_attr_csv`

args=(file_registry['lulc_base_aligned'],
      file_registry['lulc_to_ndvi_csv'],
      file_registry['ndvi_base_aligned_masked']),
# this outputs raster to use as new NDVI to use going forward
Member:

This comment's wording is a bit unclear, and it may be unnecessary?

file_registry['ndvi_base_aligned_masked'])["datatype"],
'target_nodata': pygeoprocessing.get_raster_info(
file_registry['ndvi_base_aligned_masked'])["nodata"][0]},
dependent_task_list=buffer_base_dependencies+[kernel_task],
Member:

Just a nitpick that this list concatenation has no spaces while line 794 has a space :)

None
"""
lulc_df = pandas.read_csv(lulc_attr_table)
Member:

I think there are some benefits to opening input tables using the MODEL_SPEC.get_input function. Here's an example from the carbon model.

carbon_pool_df = MODEL_SPEC.get_input(

Member:

This also might guard against the behavior you were seeing with trailing empty lines, etc!


create_lulc_ndvi_csv_task = task_graph.add_task(
    func=build_lulc_ndvi_table,
    args=(args['lulc_attr_csv'],
Member:

I commented in the function as well, but the MODEL_SPEC has some handy get_input methods that allow us to open CSV tables in a consistent manner. I gave an example from the carbon model below.

# return_inverse returns, for each element, its index into the unique array
# e.g., for lulc array [1, 2, 2, 5]: unique_lucodes = [1, 2, 5]
# and inverse_indices = [0, 1, 1, 2]
unique_lucodes, inverse_indices = numpy.unique(
Member:

numpy.unique also has a return_counts parameter that might be a little more straightforward than using bincount?
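For example, return_counts could replace the separate bincount for the counts (tiny made-up arrays):

```python
import numpy

lulc = numpy.array([1, 2, 2, 5, 2, 1])
ndvi = numpy.array([0.1, 0.4, 0.6, 0.9, 0.5, 0.3])

# counts comes straight from numpy.unique; bincount is still needed for sums
codes, inverse, counts = numpy.unique(
    lulc, return_inverse=True, return_counts=True)
sums = numpy.bincount(inverse, weights=ndvi)
means = sums / counts
```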

block_counts = numpy.bincount(inverse_indices)

# accumulate into global sums/counts
for lucode, sum, c in zip(unique_lucodes, block_sums, block_counts):
Member:

I think the bincount solution is pretty neat and not one that I would have thought of. For the sake of sharing ideas, since we ultimately need to iterate over the unique codes, another solution would be kind of our typical one:

for lucode in unique_lucodes:
    if lucode == nodata:
        continue    
    ndvi_sum = numpy.sum(ndvi[lulc==lucode])
    sum[lucode] = ndvi_sum
    etc...

Member:

One positive of this approach is that we might be able to reduce some of the above code that flattens the array and creates the nodata masks?

Member:

Although checking nodata blocks and continuing if there's no relevant data seems like a good thing to keep!

Contributor Author (@claire-simpson):

Using return_counts in np.unique is a good idea! However, I do think iterating over the unique codes would be slower because, for each lucode, you're scanning the whole block to build the lucode == nodata mask and compute the sum. So the tradeoff is the current approach being faster versus your proposed approach being more readable. What do you think?

target_nodata = FLOAT32_NODATA

# Create lucode: ndvi dict
lulc_df = pandas.read_csv(mean_ndvi_by_lulc_csv, index_col='lucode')
Member:

For non-model-spec input CSVs, there is a utils function for opening CSVs that does some sanitization. It could be helpful even though this CSV is curated by the model.

def read_csv_to_dataframe(path, **kwargs):

# raise error if user enters lulc_attr_csv without 'ndvi' column and also
# doesn't provide base_ndvi raster
if args['scenario'] == 'lulc' and args.get('lulc_attr_csv'):
    lulc_df = pandas.read_csv(args['lulc_attr_csv'])
Member:

Another use case where using the MODEL_SPEC helper functions to open this could be nice.

@claire-simpson requested a review from dcdenu4 January 9, 2026 22:08