add hysteresis to status change alerts #2

@timdrysdale

Description

Problem: Email alerts are generated too frequently when checks fail and then quickly recover. This may be exacerbated by the current setting, in which relay sends reports every 2 s (to support CLI scripts, because of practable/relay#57).
[Screenshot from 2023-10-20 10-01-17]

Desired solution:

  1. Fix false-negative tests, or ignore incoming data that would lead to short-term false negatives.
  2. Add a rate limit on email sending that is set independently in status, rather than being determined by the rate at which relay and jump reports are received. Aggregate the issues that arose during the last reporting period.
  3. Consider adding information on the severity of outages, e.g. data was down for just <1 s, or more seriously for >1 min (the former seems like noise, a 10 s outage would affect user experience, while >1 min is an actual outage or server issue that needs monitoring and ultimately fixing).
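
To make the three points concrete, here is a minimal sketch in Python of how they could fit together. It is an illustration only, not the existing status code: the names `Debouncer`, `AlertDigest` and `classify_outage`, the severity labels, and the idea of implementing hysteresis as a hold time are all assumptions. The 1 s and 1 min thresholds come from point 3; the hold time and email interval used in the usage note are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def classify_outage(duration_s: float) -> str:
    """Map an outage duration to a hypothetical severity label."""
    if duration_s < 1.0:
        return "noise"      # <1 s: probably not worth an alert
    if duration_s > 60.0:
        return "outage"     # >1 min: needs monitoring and fixing
    return "degraded"       # in between: may affect user experience


@dataclass
class Debouncer:
    """Time-based hysteresis: a state change is only confirmed once the
    new raw value has persisted for hold_s seconds."""
    hold_s: float
    confirmed: bool = True                 # True means healthy
    pending_since: Optional[float] = None  # when the candidate change began

    def update(self, ok: bool, now: float) -> Optional[bool]:
        """Feed one raw report; return the new state if a change is confirmed."""
        if ok == self.confirmed:
            self.pending_since = None      # flapped back, drop the candidate
            return None
        if self.pending_since is None:
            self.pending_since = now
        if now - self.pending_since >= self.hold_s:
            self.confirmed = ok
            self.pending_since = None
            return ok
        return None


@dataclass
class AlertDigest:
    """Rate limiter for emails: collects events and releases them as one
    aggregated message at most once per min_interval_s."""
    min_interval_s: float
    last_sent: float = float("-inf")
    events: List[str] = field(default_factory=list)

    def add(self, event: str) -> None:
        self.events.append(event)

    def flush(self, now: float) -> Optional[str]:
        """Return the aggregated message if one is due, else None."""
        if not self.events or now - self.last_sent < self.min_interval_s:
            return None
        body = "\n".join(self.events)
        self.events.clear()
        self.last_sent = now
        return body
```

In this sketch each check would get its own `Debouncer` (for example `Debouncer(hold_s=10.0)`), fed on every incoming report regardless of the 2 s report rate. Confirmed changes would be passed to a shared `AlertDigest` (for example `AlertDigest(min_interval_s=300.0)`), whose `flush` is called on a timer owned by status, so the email rate no longer depends on how often reports arrive.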

Metadata

Assignees: none
Labels: bug