I started a new thread because this is a sizeable change, enough that I put it in its own feature branch.
### What and Why
Most of us have a few cores to spare, and FFmpeg is not taking advantage of them. A single monolithic file can time out during transcription, cannot be restarted partway through, and does not make full use of the machine's resources.
To that end, I added large-file handling to monkeyplug through a new AudioChunker class and a refactored utilities module. Previously, files larger than 150MB could cause memory exhaustion or API timeouts during transcription.
The new --use-chunking flag automatically splits large files at natural silence points, processes each chunk independently (with transcript caching for fast reruns), and then reassembles the chunks while preserving chapter metadata and other tags. This lets monkeyplug process large audiobooks and podcasts. Optional parallel encoding (--parallel-encoding) can also significantly reduce processing time on multi-core systems.
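To make the idea concrete, here is a minimal sketch of silence-based chunk planning followed by parallel processing. It is not the AudioChunker code from the branch. The names `plan_chunks`, `process_chunks`, and the `tolerance` parameter are hypothetical, and the sketch assumes the silence intervals have already been detected (for example with FFmpeg's `silencedetect` filter).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds)


def plan_chunks(duration: float, silences: List[Span],
                target_len: float, tolerance: float) -> List[Span]:
    """Plan chunk boundaries, preferring cuts in the middle of a silence.

    Hypothetical helper: for each target boundary, pick the silence midpoint
    closest to it (within `tolerance` seconds); if none qualifies, fall back
    to a hard cut exactly at the target boundary.
    """
    if duration <= 0 or target_len <= 0:
        raise ValueError("duration and target_len must be positive")
    midpoints = sorted((start + end) / 2 for start, end in silences)
    cuts: List[float] = []
    last, goal = 0.0, target_len
    while goal < duration:
        candidates = [m for m in midpoints
                      if last < m < duration and abs(m - goal) <= tolerance]
        cut = min(candidates, key=lambda m: abs(m - goal)) if candidates else goal
        cuts.append(cut)
        last, goal = cut, cut + target_len
    bounds = [0.0] + cuts + [float(duration)]
    return list(zip(bounds[:-1], bounds[1:]))


def process_chunks(chunks: List[Span], worker: Callable[[Span], object],
                   max_workers: int = 4) -> list:
    """Run `worker` on every chunk concurrently, keeping the original order.

    `worker` is a placeholder for whatever per-chunk step is being
    parallelized (in practice it would likely launch an external process).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, chunks))


if __name__ == "__main__":
    chunks = plan_chunks(100.0, [(28, 32), (61, 63), (95, 96)],
                         target_len=30, tolerance=10)
    lengths = process_chunks(chunks, lambda span: span[1] - span[0])
    print(chunks, lengths)
```

Because `pool.map` returns results in input order, the processed chunks can be reassembled in sequence without extra bookkeeping.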
To keep things efficient, I split this out into audio_chunker.py so that, in the event you are not interested in this feature, I can more easily backport it to my fork. While doing this work I noticed some duplication and felt it made sense to move the common functionality into a utilities.py.
This feature branch contains the work from my previous branch plus all of the new changes, so linking a diff doesn't make a lot of sense.
You can find this work here: https://github.com/stratus-ss/monkeyplug/tree/feature/file-chunking
NOTE: The tests were done with AI, with a human in the loop.