@mashalifshin
Contributor

To date, we have not been using the optional data_sensitivity field in our metrics.yaml.

Until now it has been implicitly pretty clear what kind of data we're working with -- anything in the Interaction pings is interaction data, and anything in RequestStats style pings is technical, with metrics explicitly grouped under technical_operations.

While going through the data review process for adding DMA codes throughout the pings, which may have a higher sensitivity, reviewers noted that it would be helpful and timely to now explicitly include the data_sensitivity field throughout. This change addresses that feedback.
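For illustration, a metric entry with the field filled in might look roughly like the sketch below. This assumes a Glean-style metrics.yaml. The category, metric name, URLs, and email address are hypothetical placeholders, not values copied from this repository.

```yaml
# Hypothetical entry: all names, URLs, and values are placeholders.
interaction:
  ad_click:                      # hypothetical metric name
    type: event
    description: >
      Recorded when a user clicks an ad.
    bugs:
      - https://example.com/bugs/1
    data_reviews:
      - https://example.com/data-reviews/1
    data_sensitivity:            # the optional field this change fills in
      - interaction
    notification_emails:
      - team@example.com
    expires: never
    send_in_pings:
      - interaction
```

A metric in a RequestStats style ping would carry `technical` in the same list instead.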

Please take a look and double check that I've set the correct values throughout.

@mashalifshin mashalifshin requested a review from a team as a code owner January 22, 2026 22:26
@mashalifshin mashalifshin changed the title from "docs: add explicit data_sensitivity values to all metrics" to "docs: add data_sensitivity values to all metrics" Jan 24, 2026
Contributor

@dmueller dmueller left a comment

Thanks for doing this. We treated the optional field as optional because that was the simplest path.

It is a bit confusing, since the field is set on the metric, but the ping that carries the metric seems more indicative of whether we should pick technical or interaction. I think that could mean that adding an existing metric to a new ping changes its sensitivity level.
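To make the concern concrete, here is a hypothetical sketch, assuming a Glean-style metrics.yaml where `data_sensitivity` is a list set once per metric while `send_in_pings` can name several pings. The category and metric names are placeholders. Whether listing both categories is the right convention is exactly the open question, so treat this as a picture of the ambiguity and not as a recommendation.

```yaml
# Hypothetical sketch: category and metric names are placeholders.
ads:
  provider:
    type: string
    description: >
      Which provider the ad was sourced from.
    data_sensitivity:
      - technical                # plausible for a request-stats style ping
      - interaction              # arguably applies once sent on a click or impression
    send_in_pings:
      - provider-request-stats
      - interaction
    # bugs, data_reviews, notification_emails, and expires omitted for brevity
```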

Many of the ad's properties (advertiser, creative id, product, format, flight id, line item id) seem like they could be technical, but we only record them when an interaction has occurred. Does that graduate them to interaction?

Similarly, which provider an ad was sourced from seems technical, but is it interaction because we record it when an impression or click occurs?

OS and form factor can probably be technical? OS is mentioned as an example of technical data.

Repeating myself, but the confusing part is whether the sensitivity comes more from what triggers the ping to be emitted than from the field itself.

@mashalifshin
Contributor Author

Thanks for looking this over. Yeah, it makes sense that you didn't need them before, but this time the data review folks noted that it would be helpful to them if we filled that in.

I agree. I was pretty unsure what the correct category should be for some of these, and I don't feel super confident in my assignments. As you pointed out, they could fit in both depending on how you look at it.

And yeah, just adding on to what you noted: it really seems to matter when in the request cycle the interaction occurs, and less so the individual piece of data. For example, I noticed on the last project that the provider-request-stats ping is "safer" from a user data perspective because it isn't client initiated, even though it captures info about the client. It also seems to matter more how the data could be combined to identify a unique user, and less so each individual bit (which I would hope we wouldn't be trying to do with our internal data anyway!).

@mashalifshin
Contributor Author

Re "OS/Form factor can probably be technical?": I did update OS and form factor to be technical, since that seems pretty straightforward. For the properties of the ad and the provider, I'm unsure. I'll ask the data review folks what they think.
