Split filters can cause OutOfMemory errors if they significantly increase the size of a pipeline batch, either because the filter multiplies the number of events or because it increases the size of individual events.
We've added a config option pipeline.batch.max_output_size to Logstash that, when set, divides a batch into chunks before sending it to the outputs. This mitigates some OOM errors, but additional work could be done in the split filter itself to address them at the source.
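As an illustration (the exact file location and values here are assumptions, not confirmed defaults for this option), the setting would sit alongside the other pipeline.batch.* settings in logstash.yml:

```yaml
# logstash.yml (illustrative values)
pipeline.batch.size: 125               # events read into a batch (existing setting)
pipeline.batch.max_output_size: 50     # assumed usage: batches larger than this are chunked before output
```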
One idea is to call estimateMemory on each event, multiply the result by the split factor, and compare that projection to the available memory. We could refuse to split if event_size * factor exceeds 50% of the heap size.
This check would add overhead, but it might be worth the cost.
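A minimal sketch of that guard, assuming hypothetical inputs: estimated_event_size stands in for what estimateMemory would return, and max_heap_bytes for the JVM's maximum heap (e.g. Runtime.getRuntime().maxMemory()). None of these names are the actual Logstash split filter API.

```ruby
# Hypothetical guard sketch -- not the actual Logstash split filter implementation.
HEAP_FRACTION = 0.5  # refuse to split if the projection exceeds 50% of the heap

# estimated_event_size: per-event size in bytes (stand-in for estimateMemory)
# split_factor: number of events the split would produce
# max_heap_bytes: JVM max heap size in bytes
def safe_to_split?(estimated_event_size, split_factor, max_heap_bytes)
  projected_bytes = estimated_event_size * split_factor
  projected_bytes <= max_heap_bytes * HEAP_FRACTION
end
```

On a refusal, the filter could tag the event (e.g. with a _splitrefused tag, mirroring the existing _splitparsefailure convention) rather than raising, so the pipeline keeps flowing.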