
Conversation

@GlucNAc

@GlucNAc GlucNAc commented Jan 6, 2026

Hello :)

We have been using this great library to import large files (160 columns with tens of thousands of rows). These imports generate large JSON result files, which cause out-of-memory errors when we try to process them.

This PR:

  • adds tests that demonstrate an OOM issue occurs when processing 10 dummy files of ~20 MB each;
  • fixes Yokai\Batch\Storage\FilesystemJobExecutionStorage::query and Yokai\Batch\Storage\FilesystemJobExecutionStorage::count to reduce memory consumption.

While this is still not fully scalable, it might be worth considering a migration to something like SQLite when the filesystem is chosen to store job results.

Finally, I think it would be interesting to change the signature of Yokai\Batch\Storage\QueryableJobExecutionStorageInterface::query from iterable to Iterator.

Hope you think this is interesting :)

@yann-eugone
Contributor

Hi @GlucNAc
I really appreciate your efforts here

I know the filesystem job execution storage has some issues; it's more of a default when nothing else is available, and I will always push users towards alternatives, like the DoctrineDBALJobExecutionStorage

Can you be more specific about the OOM issue?
Were you able to spot the actual cause of the exhaustion?

I know we have a problem when a JobExecution becomes too heavy, due to logs, warnings, failures, etc.
But that problem exists no matter the storage used, so the fix is to be found elsewhere

@GlucNAc
Author

GlucNAc commented Jan 7, 2026

The problem appeared after many imports had been run. These imports failed and thus generated a lot of failures/warnings. However, having too many failures/warnings per result file was not the real problem; it was more about the quantity of large JSON files to process. Implementing a rotating file policy would help mitigate this accumulation, but the main problem would still remain.

Indeed, the main problem comes from the fact that Yokai\Batch\Storage\FilesystemJobExecutionStorage::query does not work in a fully iterative manner. Roughly, the method does this:

public function query(Query $query): iterable
{
    $candidates = [];
    $glob = new \GlobIterator(
        \implode(DIRECTORY_SEPARATOR, [$this->directory, '**', '*']) . '.' . $this->serializer->extension(),
    );
    /** @var \SplFileInfo $file */
    foreach ($glob as $file) {

        // do filtering stuff

        $candidates[] = $execution;
    }

    $order = // select order callback

    if ($order) {
        \uasort($candidates, $order);
    }

    return \array_slice($candidates, $query->offset(), $query->limit());
}

This implementation loses the benefit of the GlobIterator by accumulating JobExecution objects in the $candidates array.

Filtering an iterator is possible, but applying ordering and offset logic is not. That's why I proposed using SQLite: it would delegate this logic to the SQLite engine for users who need to stay on the filesystem. But such a paradigm shift cannot be introduced in a fix PR.
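To illustrate the constraint with a minimal standalone sketch (not the library's code; $directory, $offset and $limit are assumed variables): filtering a file iterator stays lazy, but usort()/uasort() only work on arrays, so ordering forces everything into memory at once.

$lazy = new \CallbackFilterIterator(
    new \GlobIterator($directory . DIRECTORY_SEPARATOR . '*.json'),
    fn (\SplFileInfo $file): bool => $file->getSize() > 0 // any per-file predicate stays lazy
);

// Constant memory: files are visited one by one.
foreach ($lazy as $file) {
    // build a JobExecution from $file, use it, forget it
}

// Ordering and pagination need the whole set at once.
$all = \iterator_to_array($lazy, false);
\usort($all, fn (\SplFileInfo $a, \SplFileInfo $b): int => $b->getMTime() <=> $a->getMTime());
$page = \array_slice($all, $offset, $limit);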

That's why I chose to fix the current implementation instead, to limit its memory consumption.

As the problem comes from the accumulation of large JSON files, I generated 10 JSON files of ~20 MB each (with many warnings) in order to reproduce it in a unit test, and wrote tests that detect the OOM issue.

This class performs 3 tests:

  1. testNoOOMOnCount;
  2. testNoOOMOnList;
  3. testNoOOMOnQuery;

which were run on the current state of the code.

Test 2 relies on Yokai\Batch\Storage\FilesystemJobExecutionStorage::list, which is a "true" iterator and therefore does not lead to an OOM issue. This proves that the JSON file size itself is not the main issue (even if a gigabyte-sized file would obviously make this test fail).
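One way such a test can bound memory (an assumed approach for illustration, not necessarily the exact test code; $storage is an already-configured FilesystemJobExecutionStorage pointing at the large JSON fixtures, inside a PHPUnit test case):

foreach ($storage->list() as $execution) {
    // each JobExecution can be released before the next one is loaded
}

// Peak usage must stay far below the combined size of all result files.
self::assertLessThan(100 * 1024 * 1024, \memory_get_peak_usage(true));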

On the other hand, tests 1 and 3 fail because they both rely on the previously described Yokai\Batch\Storage\FilesystemJobExecutionStorage::query method.

To fix it, I implemented a form of lazy loading that only loads the data needed for the sorting and pagination logic; the JobExecution objects are then processed one by one by a "true" iterator, as in Yokai\Batch\Storage\FilesystemJobExecutionStorage::list, correctly filtered, sorted and paginated (according to the existing unit tests, which still pass, plus new pagination tests).
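In pseudocode, the idea looks roughly like this (a simplified standalone sketch, not the PR's actual code; the function name and the sort key are assumptions):

function queryExecutions(string $directory, int $offset, int $limit): \Generator
{
    // 1) Lightweight index: only the path and the value needed for sorting
    //    are kept in memory, never a full JobExecution.
    $index = [];
    foreach (new \GlobIterator($directory . DIRECTORY_SEPARATOR . '*.json') as $file) {
        /** @var \SplFileInfo $file */
        $index[$file->getPathname()] = $file->getMTime();
    }

    // 2) Sorting and pagination happen on the small index only.
    \arsort($index);
    $page = \array_slice(\array_keys($index), $offset, $limit);

    // 3) Full deserialization happens one file at a time.
    foreach ($page as $path) {
        yield \json_decode(\file_get_contents($path), true);
    }
}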

This lazy loading is done by JsonJobExecutionPartialSerializer, which reads the first 2048 bytes of the JSON file and denormalizes the data into a new JobExecution instance using the existing Yokai\Batch\Serializer\JsonJobExecutionSerializer::fromArray method (moved into Yokai\Batch\Serializer\JobExecutionSerializerTrait for this purpose).
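The gist of that partial read, as a rough standalone sketch (the field names and the regex-based extraction here are assumptions for illustration, not the PR's actual implementation):

function readExecutionHeader(string $path, int $bytes = 2048): array
{
    $handle = \fopen($path, 'rb');
    $prefix = \fread($handle, $bytes);
    \fclose($handle);

    // The prefix is usually truncated JSON, so only a few top-level scalar
    // fields are extracted instead of decoding the whole document.
    $header = [];
    foreach (['jobName', 'status', 'startTime'] as $key) { // hypothetical keys
        if (\preg_match('/"' . $key . '"\s*:\s*("(?:[^"\\\\]|\\\\.)*"|\d+)/', $prefix, $match)) {
            $header[$key] = \json_decode($match[1], true);
        }
    }

    return $header;
}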

The filtering and sorting logic is then performed by the new private method Yokai\Batch\Storage\FilesystemJobExecutionStorage::getResultFiles (a better name could probably be found), which is consumed by Yokai\Batch\Storage\FilesystemJobExecutionStorage::query and Yokai\Batch\Storage\FilesystemJobExecutionStorage::count; each implements its own pagination logic and keeps memory usage as low as possible.

In our original use case, processing the result files was slow (> 10 s) and resulted in 1.6 GB of RAM consumption. With this fix, the time was acceptable and peak memory usage didn't exceed 100 MB in a dev environment (with Symfony debug ON).

@yann-eugone
Contributor

Thank you for all the details you shared

I understand the problem you have, and it is all around the query method
This method is part of an independent interface that was not implemented on the filesystem storage in the beginning
But because we found a working (though suboptimal) implementation, we decided to keep it like that

Unfortunately, as you said, ordering + paginating an iterator is not possible, so for the moment we have to deal with this
I see your point with SQLite, but once again, if you have database capabilities, just use the DBAL storage, because it also works with SQLite (the tests run with SQLite)

IMO, anyone using the filesystem storage should not be using these methods, unless they know exactly what they are doing

I'm also not a fan of "partial hydration"
Maybe the problem is that in a listing method, we should hydrate smaller objects, not the whole JobExecution
It is essentially the same as what you are doing here, but with a different result object
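For illustration, such a lighter listing object might look like this (a sketch with assumed fields, not an actual class of the library):

final class JobExecutionSummary
{
    public function __construct(
        public readonly string $jobName,
        public readonly string $status,
        public readonly ?\DateTimeImmutable $startTime,
        public readonly string $file, // where the full JobExecution can be loaded from on demand
    ) {
    }
}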
WDYT ?

@GlucNAc
Author

GlucNAc commented Jan 7, 2026

I totally agree with you about having a smaller object. It was my first intention with JobExecutionResultFile, but I was facing code duplication while deserializing the JSON, and since this logic already seems to be duplicated in JobExecutionRowNormalizer, I didn't want to go further into a refactoring before being sure of your interest in it.

@yann-eugone
Contributor

I'm in the process of releasing a new version with BC breaks, so it's OK to introduce such changes if it's for the best of the library

@GlucNAc
Author

GlucNAc commented Jan 7, 2026

OK, I will propose such a refactoring then. Maybe in another PR, and then update the target of the current PR to that new one?

Btw, will your new release include updating the return type of Yokai\Batch\Storage\QueryableJobExecutionStorageInterface::query from iterable to Iterator? I think that would be a good improvement: it would encourage implementations to use a "real" iterator, prevent errors like "already closed iterator" (since developers would be aware an iterator is returned), and make it safe to use iterator functions on the result of the query method.
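For reference, the change under discussion would look something like this (a sketch of the narrowed signature, not the final API):

interface QueryableJobExecutionStorageInterface
{
    /**
     * @return \Iterator<JobExecution>
     */
    public function query(Query $query): \Iterator;
}

// Generator-based implementations already satisfy the narrowed type, since
// every \Generator is an \Iterator; implementations returning plain arrays
// would need to wrap them, e.g. return new \ArrayIterator($executions);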

@codecov

codecov bot commented Jan 7, 2026

Codecov Report

❌ Patch coverage is 95.47511% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.32%. Comparing base (96533f2) to head (e917297).

Files with missing lines                               | Patch % | Lines
...c/Serializer/JsonJobExecutionPartialSerializer.php  | 85.36%  | 6 Missing ⚠️
...atch/src/Storage/FilesystemJobExecutionStorage.php  | 95.95%  | 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##                0.x     #181      +/-   ##
============================================
- Coverage     99.68%   99.32%   -0.37%     
- Complexity      940      962      +22     
============================================
  Files           157      160       +3     
  Lines          2852     2951      +99     
============================================
+ Hits           2843     2931      +88     
- Misses            9       20      +11     


