
106% of grants are missing a useful feature when there are duplicate grants #164

@mariongalley


In the Usefulness feedback on the DQT, one of the checks reports that "106% of grants have Beneficiary location name found but no beneficiary location code".

[Image: screenshot of the error returned by the tool]

Initial investigation from @mrshll1001

The main cause of this strange behaviour is that the spreadsheet contains a number of duplicate grant rows, which cause double-counting of beneficiary location data (in some cases there might even be triple-counting).

In the Standard, Beneficiary Location is an array, presumably so that we can model cases where a grant benefits people in multiple locations. When you upload a spreadsheet to the DQT, it converts it to the JSON format and merges duplicate rows into a single grant. Normally this wouldn't cause any problems, since the later rows overwrite the earlier ones. But because Beneficiary Location is an array, the conversion appends the values from each row instead, creating an array of duplicate locations. So we end up with grants that have two or more locations which all say "GB" and "England". Each of those entries is then counted by the check, which is how the reported figure can exceed 100% of grants.
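A minimal sketch of how this could push the figure above 100%, assuming a simplified merge in which scalar fields are overwritten and array fields are concatenated. The function names, the sample rows and the check logic are hypothetical illustrations, not taken from the DQT or its converter. The field names `beneficiaryLocation`, `name` and `geoCode` are an assumed reading of the Standard's JSON format.

```python
def merge_rows(rows):
    """Merge rows sharing an id: scalars are overwritten, lists are concatenated.

    This is an assumed simplification of the spreadsheet-to-JSON conversion.
    """
    grants = {}
    for row in rows:
        grant = grants.setdefault(row["id"], {})
        for key, value in row.items():
            if isinstance(value, list):
                # Array fields accumulate across duplicate rows.
                grant.setdefault(key, []).extend(value)
            else:
                # Scalar fields: the later row wins.
                grant[key] = value
    return list(grants.values())


def count_name_without_code(grants):
    """Count location entries (not grants) that have a name but no code."""
    return sum(
        1
        for grant in grants
        for location in grant.get("beneficiaryLocation", [])
        if location.get("name") and not location.get("geoCode")
    )


rows = [
    {"id": "g1", "title": "A", "beneficiaryLocation": [{"name": "England"}]},
    # Duplicate of the row above.
    {"id": "g1", "title": "A", "beneficiaryLocation": [{"name": "England"}]},
    {"id": "g2", "title": "B", "beneficiaryLocation": [{"name": "England"}]},
]

grants = merge_rows(rows)
failures = count_name_without_code(grants)
# Failures are counted per location entry, but divided by the number of grants.
percentage = 100 * failures / len(grants)
```

Here two grants carry three location entries between them, so dividing the per-entry count by the number of grants gives a value over 100%.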

After discussion with @michaelwood, we agreed that this is technically a software bug: we should be counting errors as a proportion of the total number of possible errors that could be found for a given number of grants. However, this is an edge case (it hasn't come up before), so we're not prioritising it for an urgent fix.
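One possible reading of "a proportion of the total number of possible errors" is to divide by the number of location entries that could fail the check, rather than by the number of grants. The sketch below illustrates that reading only. The function name and sample data are hypothetical, the field names are the same assumed ones from the Standard's JSON format, and the issue does not say which denominator the eventual fix should use.

```python
def name_without_code_percentage(grants):
    """Percentage of location entries that have a name but no code.

    The denominator is the number of location entries, so the result
    cannot exceed 100 even when entries are duplicated.
    """
    total = 0
    failures = 0
    for grant in grants:
        for location in grant.get("beneficiaryLocation", []):
            total += 1
            if location.get("name") and not location.get("geoCode"):
                failures += 1
    return 100 * failures / total if total else 0.0


sample_grants = [
    # Grant with duplicated entries after merging duplicate rows.
    {"id": "g1", "beneficiaryLocation": [{"name": "England"}, {"name": "England"}]},
    # Grant whose entries all carry a code.
    {"id": "g2", "beneficiaryLocation": [
        {"name": "England", "geoCode": "E92000001"},
        {"name": "England", "geoCode": "E92000001"},
    ]},
]

result = name_without_code_percentage(sample_grants)
```

An alternative would be to keep grants as the denominator and count each grant at most once, or to de-duplicate location entries during conversion. Either would also keep the figure at or below 100%.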
