add trace log to track event size expansion#49

Merged
kaisecheng merged 6 commits into logstash-plugins:main from kaisecheng:add_event_size_trace_log
Nov 19, 2025

@kaisecheng kaisecheng commented Nov 18, 2025

Tested the following pipeline with bin/logstash -f split.conf --log.level trace:

input {
    generator {
        lines => [
            '{"kubernetes" : {"label": {"app": "somevalue" }}, "split": ["1","2","3"] }'
        ]
        count => 1
        codec => json
    }
}
filter { split { field => "split" }}
output { stdout { } }

Log output:

[2025-11-18T15:29:49,196][TRACE][logstash.filters.split   ][main][e42465377ab12961a4ebbb80634862df2eacd4cc029b3817f2a772eb317df8e7] Event is split into 3 {:split_bytes=>819, :original_bytes=>261, :ratio=>3.14}
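
The ratio in the log line above is consistent with split_bytes divided by original_bytes, rounded to two decimals. A minimal sketch of that arithmetic, assuming this is how the ratio is derived (the helper name expansion_ratio is hypothetical and is not taken from the plugin):

```ruby
# Hypothetical helper, not the plugin's code: ratio of the total bytes
# across all split events to the bytes of the original event.
def expansion_ratio(split_bytes, original_bytes)
  return 0.0 if original_bytes.zero? # avoid dividing by zero
  (split_bytes.to_f / original_bytes).round(2)
end

# Values taken from the trace log line above.
puts expansion_ratio(819, 261) # prints 3.14
```

A ratio near the number of splits (3 here) suggests that most of the original event is being copied into every split event.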


@donoghuc donoghuc left a comment

I think this will be helpful. I wonder if there is a similar log message we should add pre-split, stating that we are about to attempt a split into splits.size events. That way, if there is an OOM or some other failure during inflation, we would have a log message indicating that the split was being attempted.

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>
@kaisecheng

@donoghuc Thanks for your considerate suggestion. I have updated the log message and removed an overly verbose debug entry that didn’t add much value. Please have a look.

@kaisecheng kaisecheng requested a review from donoghuc November 18, 2025 18:20
event_target = @target.nil? ? @field : @target

split_bytes = 0
logger.trace? && logger.trace("Event being split into #{splits.size} events")
@donoghuc (Contributor)

Just out of curiosity: why are we guarding this with the logger.trace? check? I understand it for the other computations, as they are somewhat expensive. If I understand correctly, the logging library decides which messages to emit based on the configured level. For example, a message sent with logger.trace('foo') would only show up in the logs when trace-level logging is configured.

@kaisecheng (Contributor Author)

I guard it to avoid paying the cost of the string interpolation (string + #{splits.size} + string). Ruby builds the message before logger.trace is called, so without the guard that work is wasted when running at the info or debug level.
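
The effect of the guard can be sketched in plain Ruby. This is an illustration under stated assumptions: TinyLogger and CountingSize are hypothetical stand-ins written for this example, not the Logstash logger or anything from the plugin. The only behavior relied on is that Ruby evaluates method arguments before the call and that && short-circuits.

```ruby
# Hypothetical stand-in for a leveled logger; not the Logstash logger API.
class TinyLogger
  LEVELS = { trace: 0, debug: 1, info: 2 }.freeze

  def initialize(level)
    @level = LEVELS.fetch(level)
  end

  def trace?
    @level <= LEVELS[:trace]
  end

  # Drops the message unless trace is enabled, but the caller has
  # already built the string by the time this method runs.
  def trace(message)
    puts message if trace?
  end
end

# Hypothetical value that counts how often it is interpolated into a string.
class CountingSize
  attr_reader :calls

  def initialize(value)
    @value = value
    @calls = 0
  end

  def to_s
    @calls += 1
    @value.to_s
  end
end

logger = TinyLogger.new(:info)
size = CountingSize.new(3)

# Unguarded: the string is built, then discarded by the logger.
logger.trace("Event being split into #{size} events")
unguarded_calls = size.calls

# Guarded: && short-circuits, so the interpolation never runs.
logger.trace? && logger.trace("Event being split into #{size} events")
guarded_calls = size.calls - unguarded_calls

puts "unguarded interpolations: #{unguarded_calls}, guarded: #{guarded_calls}"
```

For a single small string the saving is minor, but the filter runs once per event, so the guard keeps the hot path free of allocations that are never logged.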

@kaisecheng kaisecheng merged commit 74051b1 into logstash-plugins:main Nov 19, 2025
3 checks passed
2 participants