Add new config option maxBatchOutputSize to split batch into chunks before outputs #18680
estolfo wants to merge 28 commits into elastic:main from
Conversation
This pull request does not have a backport label. Could you fix it @estolfo? 🙏
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 7 comments.
```ruby
Setting::PositiveIntegerSetting.new("pipeline.batch.size", 125),
Setting::NumericSetting.new("pipeline.batch.delay", 50), # in milliseconds
Setting::NumericSetting.new("pipeline.batch.max_output_size", 0), # 0 means unlimited
Setting::BooleanSetting.new("pipeline.unsafe_shutdown", false),
```
pipeline.batch.max_output_size is registered as a NumericSetting, which accepts floats and negative numbers without validation (see org.logstash.settings.NumericSetting). Since this value is used as an event-count limit, it should be an integer and should be validated as >= 0 (with 0 meaning unlimited). Consider introducing a dedicated non-negative integer setting (e.g., a variant of PositiveIntegerSetting that allows 0) and using it here so invalid values fail fast at startup.
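For illustration only, a minimal sketch of the kind of fail-fast, non-negative integer setting being suggested; the class name `NonNegativeIntegerSetting` and its shape are hypothetical and are not modeled on the actual `org.logstash.settings` classes:

```java
// Hypothetical sketch: a setting that only accepts integers >= 0 and fails
// fast at construction/assignment time (0 meaning "unlimited").
public final class NonNegativeIntegerSetting {
    private final String name;
    private final int defaultValue;
    private Integer value;

    public NonNegativeIntegerSetting(String name, int defaultValue) {
        this.name = name;
        this.defaultValue = validate(defaultValue);
    }

    public void set(int newValue) {
        this.value = validate(newValue);
    }

    public int value() {
        return value != null ? value : defaultValue;
    }

    private int validate(int candidate) {
        if (candidate < 0) {
            throw new IllegalArgumentException(
                "Setting \"" + name + "\" must be >= 0 (0 means unlimited), got: " + candidate);
        }
        return candidate;
    }
}
```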
I wonder if we should envision a set of defaults like “unlimited” (no chunking), “batch_size” (default to batch size), or a positive number?
I think we will want to support these three scenarios in the future and the decision now should not block us from that in the future.
When you say unlimited, what do you have in mind?
I thought unlimited was the same as batch size: without a max_output_size set, the code passes the whole batch to the outputs, unchunked.
I can see two "defaults":
- "entire batch": all the events present in the batch object
- "pipeline's default batch size": chunk based on the default batch size of the pipeline
If I have batch size == 1000, it'd be nice if my output section chunked along the 1000 mark. This is mostly about ergonomics: I could see that in a Logstash 10.x we would default to chunking to the pipeline's default batch size.
I think there are other ergonomic factors to take into account here, where a max_output_size of 1000 would mean a 1001-event batch produces two "smaller batches" of 1000 and 1, wasting resources to send a single-event batch.
If there is a default batch size for the pipeline, how would the batch end up with more than that amount?
I'm seeing that there are always batchSize items read from the queue (here and here).
About the batches of 1000 and 1 items, how likely is that scenario? And do you think writing code to handle it is worth the overhead, when chunking would provide benefits in most cases?
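To make the arithmetic concrete, a small standalone sketch of the chunking loop with the 1001/1000 example from this thread (illustrative values, not code from the PR):

```java
// Illustrative chunking arithmetic: a batch of 1001 events with a
// max_output_size of 1000 produces two chunks, of 1000 and 1 events.
public class ChunkMath {
    public static void main(String[] args) {
        int totalSize = 1001;
        int maxChunkSize = 1000;
        for (int offset = 0; offset < totalSize; offset += maxChunkSize) {
            int chunkSize = Math.min(maxChunkSize, totalSize - offset);
            System.out.println("chunk at offset " + offset + " with " + chunkSize + " events");
        }
        // prints: chunk at offset 0 with 1000 events
        //         chunk at offset 1000 with 1 events
    }
}
```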
| `pipeline.workers` | The number of workers that will, in parallel, execute the filter and output stages of the pipeline. This setting uses the [`java.lang.Runtime.getRuntime.availableProcessors`](https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.md#availableProcessors()) value as a default if not overridden by `pipeline.workers` in `pipelines.yml` or `pipeline.workers` from `logstash.yml`. If you have modified this setting and see that events are backing up, or that the CPU is not saturated, consider increasing this number to better utilize machine processing power. | Number of the host’s CPU cores |
| `pipeline.batch.size` | The maximum number of events an individual worker thread will collect from inputs before attempting to execute its filters and outputs. Larger batch sizes are generally more efficient, but come at the cost of increased memory overhead. You may need to increase JVM heap space in the `jvm.options` config file. See [Logstash Configuration Files](/reference/config-setting-files.md) for more info. | `125` |
| `pipeline.batch.delay` | When creating pipeline event batches, how long in milliseconds to wait for each event before dispatching an undersized batch to pipeline workers. | `50` |
| `pipeline.batch.max_output_size` | Maximum number of events that are passed together as a chunk to the outputs after being filtered. The default is 0 (unlimited). | `0` |
I wonder if unlimited can create some confusion... maybe we should frame it as: of the batch of events that passed through the filter section, how many to pass at a time to the output section. 0 would then mean the entire batch.
also not super happy about how I worded it... 🤔
```java
final int maxChunkSize = (maxBatchOutputSize > 0) ? maxBatchOutputSize : totalSize;

// send to consumer in chunks
for (int offset = 0; offset < totalSize; offset += maxChunkSize) {
```
should we short-circuit here when maxBatchOutputSize is 0? i.e. will it still allocate a new array and copy the entire batch into it for the unlimited case?
you and copilot are on the same page :) #18680 (comment)
You're right, I'm looking at changing the code now.
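A possible shape for that change, sketched with generic types rather than the actual CompiledPipeline internals (the class and method names here are hypothetical, not the final code in this PR):

```java
import java.util.List;
import java.util.function.Consumer;

final class ChunkedOutputSketch {
    // Sketch only: pass the whole batch through when chunking is disabled,
    // and otherwise hand out sub-list views instead of copying into new arrays.
    static <T> void sendInChunks(List<T> batch, int maxBatchOutputSize, Consumer<List<T>> consumer) {
        final int totalSize = batch.size();
        if (maxBatchOutputSize <= 0 || maxBatchOutputSize >= totalSize) {
            consumer.accept(batch); // unlimited case: no extra allocation
            return;
        }
        for (int offset = 0; offset < totalSize; offset += maxBatchOutputSize) {
            final int end = Math.min(offset + maxBatchOutputSize, totalSize);
            consumer.accept(batch.subList(offset, end)); // view over the batch, not a copy
        }
    }
}
```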
This PR adds a new config option, `pipeline.batch.max_output_size`, which limits the number of events sent to outputs after filtering in a pipeline. After events are filtered in a pipeline, they are sent in chunks of `max_output_size` to the outputs. This helps mitigate some OOM issues observed in the past due to a split filter exploding the size of batches.

Related to issue logstash-plugins/logstash-filter-split#48
Further work to add a config option limiting the chunk size in bytes could also help mitigate OOM issues, though this config option might be sufficient for the majority of cases.
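As an illustrative example (the numbers are hypothetical, not taken from the linked issue): with `pipeline.batch.size` of 125 and a split filter that emits 100 events per input event, a single batch grows to 12,500 events before reaching the outputs; with `pipeline.batch.max_output_size` set to 1000, the outputs would instead receive 13 chunks of at most 1000 events each.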