- create global aggregator
- add data types for outcomes, discard events & client report
- create discard reason enum

Add functionality to:
- attach client reports to existing envelopes
- for the async transports, the background routines are also invoked with a ticker to periodically send client reports
Semver impact of this PR: 🟡 Minor (new features).
Branch force-pushed from 488cb9b to 85a8e5a.
```diff
 	recordSpanOutcome(t.reporter, report.ReasonRateLimitBackoff, event)
 	return
 }

-request, err := getRequestFromEvent(ctx, event, t.dsn)
+request, err := getRequestFromEvent(ctx, event, t.dsn, t.reporter)
 if err != nil {
 	t.reporter.RecordOne(report.ReasonInternalError, category)
 	recordSpanOutcome(t.reporter, report.ReasonInternalError, event)
 	return
 }
```
Bug: `reporter.TakeReport()` is called before the HTTP request is sent. If the request fails, the client reports are lost because they were already removed from the aggregator.
Severity: HIGH
Suggested Fix
Only call `reporter.TakeReport()` and attach the client report to the envelope after confirming the HTTP request was successful. Alternatively, if the request fails, re-add the taken report to the aggregator so it can be sent with a future request.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify whether this is a real issue. If it is, propose a fix; if not, explain why it is not valid.
Location: transport.go#L743-L759
Potential issue: In both `HTTPTransport` and `HTTPSyncTransport`, client reports are retrieved from the aggregator via `reporter.TakeReport()` and attached to the event envelope before the HTTP request is attempted. If this request subsequently fails due to a network error, a server-side error (like a 500), or any other reason, the reports are permanently lost. They have been removed from the aggregator but were never successfully delivered, leading to under-reporting of dropped events.
```go
response, err := t.client.Do(request)
if err != nil {
	debuglog.Printf("There was an issue with sending an event: %v", err)
	t.reporter.RecordOne(report.ReasonNetworkError, category)
	recordSpanOutcome(t.reporter, report.ReasonNetworkError, event)
	return
}
```
Bug: The asynchronous `HTTPTransport` fails to record a `ReasonNetworkError` in client reports when an HTTP request fails due to a network issue, unlike the synchronous transport.
Severity: MEDIUM
Suggested Fix
In the `HTTPTransport` worker, when `t.client.Do(item.request)` returns an error, call `t.reporter.RecordOne(report.ReasonNetworkError, category)` to record the dropped event before continuing the loop, mirroring the behavior of `HTTPSyncTransport`.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify whether this is a real issue. If it is, propose a fix; if not, explain why it is not valid.
Location: transport.go#L768-L773
Potential issue: The asynchronous `HTTPTransport` does not record a client report when a network error occurs during an event submission. The `t.client.Do(item.request)` call is followed by a `continue` statement in the error path, but there is no call to `t.reporter.RecordOne(report.ReasonNetworkError, ...)`. This leads to an under-counting of dropped events caused by network issues. The synchronous `HTTPSyncTransport` correctly handles this case, creating an inconsistency between the two transports.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
```go
atomic.AddInt64(&b.dropped, int64(droppedCount))
if b.onDropped != nil {
	for _, di := range oldestBucket.items {
		b.recordDroppedItem(di)
```
Dropped items not recorded when callback is unset
Medium Severity
In `BucketedBuffer.handleOverflow` for `OverflowPolicyDropOldest`, the `recordDroppedItem` calls for old bucket items are nested inside the `if b.onDropped != nil` block. Since `onDropped` is only ever set in tests (via `SetDroppedCallback`), these client-report recordings never execute in production. Compare with `RingBuffer`, where `recordDroppedItem` is correctly placed outside the `onDropped` guard, and with the other overflow cases in `BucketedBuffer` itself (`OverflowPolicyDropNewest`, default), which also correctly record outside the guard.
```go
transport: transport,
dsn:       dsn,
sdkInfo:   sdkInfo,
reporter:  report.GetAggregator(dsn.String()),
```
Scheduler registry lookup uses normalized DSN key
Medium Severity
The `Scheduler` looks up the aggregator using `dsn.String()` (the parsed/reconstructed DSN), but the `Client` registers it using `options.Dsn` (the raw user-provided string). If the original DSN contains a redundant default port (e.g. `https://key@host:443/1`), `Dsn.String()` normalizes it to `https://key@host/1`, causing the lookup to return nil. The scheduler would then silently skip all client-report recording and periodic sending. The ring buffers in `setupTelemetryProcessor` correctly use `client.options.Dsn`, making this inconsistency specific to the scheduler.


Description
Add support for client reports.