Description
In the Usefulness feedback on the DQT, one of the checks reports that "106% of grants have Beneficiary location name found but no beneficiary location code"
Screenshot of the error returned by the tool:
Initial investigation from @mrshll1001
The main cause of this strange behaviour is that the spreadsheet contains a number of duplicate grant rows, which cause beneficiary location data to be double-counted (in some cases it might even be triple-counted).
In the Standard, Beneficiary Location is an array, presumably so that we can model cases where a grant benefits people in multiple locations. When you upload a spreadsheet to the DQT it converts it to the JSON format, and for duplicate rows it merges these together into a single grant. Normally this wouldn't cause any problems at all since the later rows overwrite the earlier ones, but since Beneficiary Location is an array the conversion adds these together to create an array of duplicate locations. So we have grants with two or more locations which all say "GB" and "England".
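A minimal sketch of the merge behaviour described above may help show how the figure can exceed 100%. This is not the DQT's actual conversion code: `merge_rows` is a hypothetical stand-in for the converter, the field names (`beneficiaryLocation`, `name`, `countryCode`, `geoCode`) are assumed for illustration, and the percentage is assumed to be computed as affected locations divided by the number of grants.

```python
def merge_rows(rows):
    """Hypothetical stand-in for the spreadsheet-to-JSON merge.

    Rows sharing an id become one grant. Scalar fields from later rows
    overwrite earlier ones; array fields are concatenated.
    """
    grants = {}
    for row in rows:
        grant = grants.setdefault(row["id"], {"id": row["id"], "beneficiaryLocation": []})
        for key, value in row.items():
            if key == "beneficiaryLocation":
                # Arrays are appended to, so duplicate rows duplicate locations.
                grant["beneficiaryLocation"].extend(value)
            else:
                grant[key] = value
    return list(grants.values())


# Illustrative data only: G1 appears twice, as in a sheet with duplicate rows.
rows = [
    {"id": "G1", "title": "Grant one",
     "beneficiaryLocation": [{"name": "England", "countryCode": "GB"}]},
    {"id": "G1", "title": "Grant one",
     "beneficiaryLocation": [{"name": "England", "countryCode": "GB"}]},
    {"id": "G2", "title": "Grant two",
     "beneficiaryLocation": [{"name": "England", "countryCode": "GB"}]},
]

grants = merge_rows(rows)

# Count every location that has a name but no code (assumed field: geoCode).
missing_code = sum(
    1
    for grant in grants
    for location in grant["beneficiaryLocation"]
    if location.get("name") and not location.get("geoCode")
)

# Dividing a per-location count by the number of grants can exceed 100%.
percent = 100 * missing_code / len(grants)
print(f"{percent:.0f}% of grants")
```

With two grants and three merged locations, the numerator is larger than the denominator, which is the same shape of result as the 106% reported by the tool.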
After discussion with @michaelwood, we agreed that this is technically a software bug: the check should count errors as a proportion of the total number of possible errors that could be found for a given number of grants. However, this is an edge case (it hasn't come up before), so we're not prioritising it for an urgent fix.
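One possible reading of "a proportion of the total number of possible errors" is to divide by the number of beneficiary locations rather than the number of grants. The sketch below illustrates that reading only; `percent_missing_code` is a hypothetical helper, the field names are assumed, and the eventual fix in the tool may take a different approach.

```python
def percent_missing_code(grants):
    """Percentage of beneficiary locations with a name but no code.

    The denominator is the total number of locations (the number of
    places the error could occur), so the result cannot exceed 100.
    """
    total = 0
    missing = 0
    for grant in grants:
        for location in grant.get("beneficiaryLocation", []):
            total += 1
            # Assumed field names: "name" and "geoCode".
            if location.get("name") and not location.get("geoCode"):
                missing += 1
    if total == 0:
        return 0.0  # no locations, so no possible errors
    return 100 * missing / total


# Illustrative data only: G1 carries a duplicated location after merging.
example_grants = [
    {"id": "G1", "beneficiaryLocation": [
        {"name": "England", "countryCode": "GB"},
        {"name": "England", "countryCode": "GB"},
    ]},
    {"id": "G2", "beneficiaryLocation": [
        {"name": "England", "countryCode": "GB", "geoCode": "E92000001"},
    ]},
]

result = percent_missing_code(example_grants)
print(f"{result:.1f}% of beneficiary locations")
```

Duplicated locations still inflate both counts under this approach, so de-duplicating rows before conversion would remain a separate concern.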