WIP: Adjust Resampling Logic and Defaults #1344
base: v1.x.x
Changes from all commits
d0054da
ce03296
83c6774
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,7 +9,7 @@ | |
| import itertools | ||
| import logging | ||
| import math | ||
| -from bisect import bisect | ||
| +from bisect import bisect, bisect_left | ||
| from collections import deque | ||
| from datetime import datetime, timedelta, timezone | ||
| from typing import assert_never | ||
|
|
@@ -437,7 +437,7 @@ def resample(self, timestamp: datetime) -> Sample[Quantity]: | |
| ) | ||
| minimum_relevant_timestamp = timestamp - period * conf.max_data_age_in_periods | ||
|
|
||
| -min_index = bisect( | ||
|
Collaborator
I think the resampling behavior, i.e. whether the window is left- or right-open and how the resulting sample is labeled, should be controlled by config parameters.
Contributor
Author
I think we should make left- or right-open configurable, together with the corresponding label, such as:
Collaborator
I think the latter is also a reasonable option (see e.g. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html), but I don't see a strong reason to implement it now if it is not needed. If the behavior is well documented, users can also adjust the timestamps trivially. So your proposal sounds good to me.
||
| +min_index = bisect_left( | ||
| self._buffer, | ||
| minimum_relevant_timestamp, | ||
| key=lambda s: s[0], | ||
|
|
@@ -458,7 +458,8 @@ def resample(self, timestamp: datetime) -> Sample[Quantity]: | |
| if relevant_samples | ||
| else None | ||
| ) | ||
| -return Sample(timestamp, None if value is None else Quantity(value)) | ||
| +sample_time = timestamp - conf.resampling_period | ||
| +return Sample(sample_time, None if value is None else Quantity(value)) | ||
|
|
||
| def _log_no_relevant_samples( | ||
| self, minimum_relevant_timestamp: datetime, timestamp: datetime | ||
|
|
||
I agree that this value leads to unexpected behavior, but changing it would be very intrusive and would result in many more NaN values in deployments. If we want to change it, we could first make it a required parameter for migration purposes and introduce the new default later.
Sounds good.