-
Notifications
You must be signed in to change notification settings - Fork 102
feat(pubsub): pause the publisher for an ordering key when there is an error #4286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4286 +/- ##
==========================================
- Coverage 94.85% 94.82% -0.04%
==========================================
Files 187 187
Lines 7194 7207 +13
==========================================
+ Hits 6824 6834 +10
- Misses 370 373 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
dbolduc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get to the changes in worker.rs
Consider doing the error type refactor first, then the unit tests look more like what we want.
|
|
||
| /// Publish is paused for the ordering key. | ||
| #[error("the ordering key was paused")] | ||
| OrderingKeyPaused(()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usability question, which is not trivial, so feel free to think about it later:
Publish consumes the application's PubsubMessage. If the operation fails, what should the application do to resend the message? Would they need to hold onto a clone of the message until the operation is complete?
It would be super convenient if we could give them their message back. The plumbing on our end could be brutal, but the application would appreciate it. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our GAPICs all have this problem too.... idk if publishing is special. Something seems wrong about just dropping their message without trying to send it. But maybe I am worrying too much about the unhappy case. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
Ideally, in the error case we should give the user the message back. IMO, we can do this in 2 ways:
- Keep a clone of it internally and pass it back to the user if there is a failure.
- Augment generated code/error to pass the message back if there is an error during Send.
I think it's a bigger discussion that should be left out of this PR.
| } | ||
| } | ||
|
|
||
| fn convert_error(e: crate::error::PublishError) -> gax::error::Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment: ah, I think you are hesitating to change Output to a Result<String, PublishError> because then we have to update all the code downstream of this.
Ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really yucky though. It might have been nicer to change the error type first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should discuss as a group with @suzmue if returning gax::error::Error or Publish error directly is ideal. We decided to with gax::error::Error for now as it is consistent with the other clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels so wrong to me, but we do have a precedence for just throwing something in an Error::io
google-cloud-rust/src/storage/src/storage/bidi/worker.rs
Lines 148 to 149 in d51e1ca
| // TODO(#3626) - reconsider the error kind. | |
| result.map_err(crate::Error::io) |
Although in GCS, I think there was a more compelling reason. (We wanted to reuse the ReadObjectResponse type which was in terms of gax::Error.)
PhongChuong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review.
|
|
||
| /// Publish is paused for the ordering key. | ||
| #[error("the ordering key was paused")] | ||
| OrderingKeyPaused(()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
Ideally, in the error case we should give the user the message back. IMO, we can do this in 2 ways:
- Keep a clone of it internally and pass it back to the user if there is a failure.
- Augment generated code/error to pass the message back if there is an error during Send.
I think it's a bigger discussion that should be left out of this PR.
| } | ||
| } | ||
|
|
||
| fn convert_error(e: crate::error::PublishError) -> gax::error::Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should discuss as a group with @suzmue if returning gax::error::Error or Publish error directly is ideal. We decided to with gax::error::Error for now as it is consistent with the other clients.
Pause the Publisher when we encounter an error when there is a send error.
When an error is encountered for a pending batch, we:
A resume operation will be added in a later PR.
This PR also introduce PublishError. Further work is needed to handle error propagation more fully