Releases: nicolas-comerci/precomp-cpp
Precomp v0.4.8 Neo v0.4.2a
WARNING, in case anybody needs it: this is the last version that will remain compatible with mainline Precomp v0.4.8 (other than maybe small bugfixes).
Bugfixes and minor improvements
Remember in the last release, when I said most of the bugs of Precomp itself, not of any specific format handler, were already fixed? Well...
- Fixed massive memory leaks caused by me forgetting to add virtual destructors to the classes of the new WIP Plugin Architecture
- Fixed crashes caused by the design of PassthroughStream, which is used to do recursive recompression without needing intermediate temporary files or memory buffers. A thread was spawned on construction, running a function that could capture members of the derived class before its initializer list and constructor had executed, so derived classes could easily run into race conditions. Now the thread is not spawned on creation; a member function must be called to kick it off, which makes the problem impossible for the most part.
- Avoided some unnecessary copies during handling of preflate precompression results (when they were small enough to be in memory) and penalty bytes. This doesn't really seem to have yielded any significant speedup, though, as those were in-memory buffers, mostly very small.
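The first two fixes above can be sketched together. This is only an illustration and not the actual PassthroughStream code: `BackgroundWorker`, `Counter`, `start()` and `join()` are made-up names. The base class has a virtual destructor, and the thread is only created by an explicit `start()` call, once the derived object is fully constructed.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical stand-in for a base class that runs work on its own thread.
class BackgroundWorker {
public:
    BackgroundWorker() = default;  // note: no thread is spawned here
    BackgroundWorker(const BackgroundWorker&) = delete;
    BackgroundWorker& operator=(const BackgroundWorker&) = delete;

    // Virtual destructor: deleting through a base pointer also runs the
    // derived destructor, so derived members are released properly.
    virtual ~BackgroundWorker() { join(); }

    // Must be called explicitly, after the most-derived constructor finished.
    void start() {
        assert(!worker_.joinable() && "start() called twice");
        worker_ = std::thread([this] { run(); });
    }

    void join() {
        if (worker_.joinable()) worker_.join();
    }

protected:
    virtual void run() = 0;

private:
    std::thread worker_;
};

// Hypothetical derived class whose run() touches its own members.
class Counter final : public BackgroundWorker {
public:
    Counter(int limit, bool& destroyed)
        : limit_(limit), destroyed_(destroyed), result_(0) {}

    // Join here, before this class's members go away, so run() never
    // observes a partially destroyed object.
    ~Counter() override {
        join();
        destroyed_ = true;
    }

    int result() const { return result_.load(); }

protected:
    void run() override {
        for (int i = 1; i <= limit_; ++i) result_ += i;
    }

private:
    int limit_;
    bool& destroyed_;
    std::atomic<int> result_;
};
```

A caller would construct a `Counter`, call `start()`, and later `join()` before reading `result()`. If the thread were spawned in the `BackgroundWorker` constructor instead, `run()` could execute before `limit_` and `result_` were initialized, which is the kind of race described above.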
Behavior changes for uncompressed data
"Uncompressed data" is outputted more aggressively.
As soon as 100 MB (configurable) have accumulated or a stream is successfully found, any remaining "uncompressed data" is outputted. ("Uncompressed data" is sort of a misnomer: we consider "uncompressed data" any data we couldn't precompress, and it could very well be compressed data in an unsupported format like LZMA or something, but I digress.)
Previously "uncompressed data" would be unnecessarily hogged by Precomp, waiting for recursion on a stream to finish, or accumulating indefinitely. This could be a problem if, for example, you fed data containing a 4 GB+ LZMA stream to Precomp with Raw Deflate (brute mode) on: Precomp would have to check and fail detection byte by byte on all 4 GB+ of that data before outputting even a single byte.
This was really noticeable when using the 7zip Plugin, as compression would appear stuck for a long time (which could mislead a user into thinking it had effectively hung or deadlocked) before continuing. Of course it's also wasteful, because all that time 7zip could have been actually compressing the data instead of waiting idly by.
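A minimal sketch of that flushing rule, assuming a simple in-memory buffer. `PendingRawData` and its member names are hypothetical and not taken from the Precomp sources; the threshold would be the configurable 100 MB.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: collects bytes that no format handler accepted and
// hands them to a sink as soon as one of the two conditions is met.
class PendingRawData {
public:
    using Sink = std::function<void(const std::string&)>;

    PendingRawData(std::size_t flush_threshold, Sink sink)
        : flush_threshold_(flush_threshold), sink_(std::move(sink)) {
        assert(flush_threshold_ > 0);
    }

    // Condition 1: enough unhandled bytes have piled up.
    void append(const std::string& bytes) {
        buffer_ += bytes;
        if (buffer_.size() >= flush_threshold_) flush();
    }

    // Condition 2: a stream was successfully precompressed, so everything
    // that came before it is written out first.
    void on_stream_found() { flush(); }

    void flush() {
        if (buffer_.empty()) return;
        sink_(buffer_);
        buffer_.clear();
    }

    std::size_t pending() const { return buffer_.size(); }

private:
    std::size_t flush_threshold_;
    Sink sink_;
    std::string buffer_;
};
```

With a sink that forwards to the output (or to 7zip in the plugin case), the consumer starts receiving data long before a huge unsupported stream has been scanned completely.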
Plugin architecture progress
- Recursion on recompression is now also handled by Precomp itself, not the format handlers; this was already the case for precompression.
- This is HUGE for the Plugin Architecture. It means developers of format handler plugins won't need to look inside of streams during precompression or worry about Precomp having found streams inside of the stream they are processing.
They get this powerful support for free, without even having to worry about it.
- A similar refactoring was attempted for Penalty Bytes, but I only managed to make Precomp handle them on recompression; on precompression, the needed code was at least simplified
- The newly simplified and unified code for Penalty Bytes handling on precompression might be a little less permissive, especially for BZip2 streams, so this version might mean worse compression gains on some streams, but as explained below, this won't be for long.
- Properly handling Penalty Bytes on precompression outside the format handlers (only GIF and BZip2 use them currently) was frankly pretty complicated and would require a lot of changes in the code, and it's not worth it because:
- Penalty Bytes are going away soon but that will require breaking compatibility with mainline Precomp v0.4.8
- Penalty Bytes are essentially small patches for whenever Precomp is not able to recompress the precompressed data into the original stream exactly.
The problem is that it's a VERY naïve patching system: for example, a single byte added to or missing from a 100 GB stream that is 99.9% matching data would make it fail, because it only handles bytes getting changed, not added or removed.
Schnaader probably just added it this way because it was quick to code as a proof of concept, and it does sometimes allow otherwise failing streams to work, but doing more sophisticated patching with xdelta or something similar would yield much better results, which is what I plan for the near future.
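To illustrate why position-based patches break down on insertions, here is a small sketch. The `(position, byte)` layout and all function names are assumptions made for the example, not the actual Penalty Bytes format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One (position, original byte) correction. Layout assumed for illustration.
using PositionalPatch = std::vector<std::pair<std::size_t, char>>;

// Records every position where `recompressed` differs from `original`.
// Only meaningful for equal lengths, which is exactly the limitation.
PositionalPatch make_patch(const std::string& original,
                           const std::string& recompressed) {
    assert(original.size() == recompressed.size());
    PositionalPatch patch;
    for (std::size_t i = 0; i < original.size(); ++i) {
        if (original[i] != recompressed[i]) patch.emplace_back(i, original[i]);
    }
    return patch;
}

// Overwrites the recorded positions to restore the original bytes.
std::string apply_patch(std::string recompressed, const PositionalPatch& patch) {
    for (const auto& fix : patch) recompressed[fix.first] = fix.second;
    return recompressed;
}

// Counts positional mismatches over the common length, to show how a single
// inserted byte shifts everything that follows it.
std::size_t count_mismatches(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) ++mismatches;
    }
    return mismatches;
}
```

For `"abcdefgh"` versus `"abcXefgh"` a single correction is enough. For `"abcdefgh"` versus `"Zabcdefg"`, where one byte was inserted at the front, every position mismatches, so a positional patch would have to store nearly the whole stream. A delta format that understands insertions and deletions avoids this.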
What's next
The next version, v0.5a, will break mainline Precomp v0.4.8 compatibility.
I have pretty much run out of meaningful things I can do without making some more radical changes to the precomp format handler headers left behind by the format handlers and to the overall behavior of Precomp.
If you ask why don't I finish the Plugin architecture before breaking compatibility, which is something I could do, it's because some of the breaking changes will make some impositions on the design of the format handlers and thus of any hypothetical format handler plugins developed.
I don't see the value of giving anybody the tools to develop a plugin and immediately breaking it. I would rather wait a little more, until I can be fairly certain that any plugin developed by someone, while it might need some updating alongside new Precomp Neo versions, won't need essentially a complete rewrite.
Precomp v0.4.8 Neo v0.4.1a
No big highlight feature this time, just bugfixes.
- Fix for rare crash where Preflate would try to read past its input data, causing failures and crashes
- As such this bug doesn't actually have to do with Precomp itself, and was present since mainline Precomp v0.4.8 (or earlier)
- It's a very rudimentary fix that doesn't really get to the bottom of the issue, but at least prevents the crashes
- This one is fully on me: Fix for some single IDAT PNG files not being properly precompressed, that would precompress fine on Precomp v0.4.8, due to a bug introduced during Precomp Neo refactoring
- Fix some errors in memiostreams code, which I don't know if they were actually causing any issues, but the code was nonetheless corrected
I think this, in addition to the verification added previously in Precomp Neo 0.4a, gets us to a point where crashes, complete hangs or bad data should be very rare, so I consider this a milestone in terms of stability.
There are still problems related with specific formats and slowdowns and the like, and of course I don't assume we won't find any more issues in the future, but for now, most of the egregious problems seem to have been solved... I hope
Precomp v0.4.8 Neo v0.4a
Builtin verification
The greatest highlight of this version is the newly included builtin verification during precompression.
Now by default, LibPrecomp will immediately attempt to recompress a stream after precompressing it and proceed to do a checksum check.
If the check fails, the stream is rejected and Precomp just continues its precompression process as if the stream wasn't found.
This makes it much less likely for Precomp to end up producing PCF files that fail, crash, hang or just recompress bad data.
The downside is that it takes some extra time, but in Precomp's current state, it's very much worth keeping it on in my opinion.
If you are really worried about speed you can use the new -no-verify option, but again, probably not a good idea for real use with real data.
However for testing, -no-verify is really useful, and I do want to eventually just have Precomp do the right thing from the get go, without needing this verification step as a crutch, but yet again, for now my focus is on reliability.
The 7zip plugin also inherits this verification, which makes me much more comfortable with the prospect of it being used for real data.
I would still recommend using 7zip's own test/verification feature and in general being vigilant.
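The verification step can be pictured roughly like this. It is a sketch under assumptions: `precompress_verified` and the `Transform` callbacks are hypothetical, and FNV-1a is only a stand-in, since these notes don't say which checksum LibPrecomp uses.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

using Transform = std::function<std::string(const std::string&)>;

// FNV-1a, used here only as a simple stand-in checksum (assumption).
std::uint32_t fnv1a(const std::string& data) {
    std::uint32_t hash = 2166136261u;
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 16777619u;
    }
    return hash;
}

// Returns true and fills `precompressed` only if recompressing the candidate
// reproduces the original stream. On failure the caller would treat the
// bytes as if no stream had been found.
bool precompress_verified(const std::string& original,
                          const Transform& precompress,
                          const Transform& recompress,
                          std::string& precompressed) {
    std::string candidate = precompress(original);
    const std::string round_trip = recompress(candidate);
    if (round_trip.size() != original.size() ||
        fnv1a(round_trip) != fnv1a(original)) {
        return false;
    }
    precompressed = std::move(candidate);
    return true;
}
```

Skipping the round trip, which is what -no-verify does, saves the cost of the extra `recompress` call but gives up this safety net.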
Other fixes
- Base64 format handler was busted, I probably broke it at some point during all the refactoring I did, but apparently it now works again, properly detecting and precompressing data without crashing...
- Fixed hang during precompression in some cases where Precomp wasn't properly handling reaching the end of a view of a chunk
Other interesting bits about this release + What's to come
During the development of this release I moved most of the format handlers code, each one responsible of handling a specific format, like Zip, Mp3, Jpg, etc, to something that is very close to a Plugin Architecture.
At some point in the near future, I expect to complete the needed work to actually have Precomp support additional plugins.
This would mean that potential contributors or even external users could add support for additional formats to Precomp, without having to worry about how Precomp works.
All they will need to care about is writing some code that gets some data, determines whether it can handle that data, processes it if so, and returns the processed data to Precomp.
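As a rough idea of what such a plugin contract could look like, here is a sketch. The plugin API was not finished at the time of this release, so `FormatHandler`, `can_handle`, `process`, and the toy "RLE:" format are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical plugin interface: detect, then process.
class FormatHandler {
public:
    virtual ~FormatHandler() = default;
    virtual bool can_handle(const std::string& data) const = 0;
    virtual std::string process(const std::string& data) const = 0;
};

// Toy handler for a made-up format: "RLE:" followed by (digit, char) pairs.
class ToyRleHandler final : public FormatHandler {
public:
    bool can_handle(const std::string& data) const override {
        return data.compare(0, 4, "RLE:") == 0;
    }

    std::string process(const std::string& data) const override {
        std::string out;
        for (std::size_t i = 4; i + 1 < data.size(); i += 2) {
            const char count = data[i];
            if (count < '0' || count > '9') break;  // stop on malformed input
            out.append(static_cast<std::size_t>(count - '0'), data[i + 1]);
        }
        return out;
    }
};

// Host side: ask each registered handler in turn and pass the data through
// unchanged if nobody claims it.
std::string run_handlers(
    const std::vector<std::unique_ptr<FormatHandler>>& handlers,
    const std::string& data) {
    for (const auto& handler : handlers) {
        if (handler->can_handle(data)) return handler->process(data);
    }
    return data;
}
```

The point of the design is that the handler only sees the bytes it is given. Recursion, verification and output ordering would stay on the host side.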
The prospect is very exciting to me, from making it less daunting for contributors that might know how to handle a given format but don't want to become experts in Precomp code to contribute, to even maybe enabling Precomp for private use, like maybe some company sees some use for Precomp alongside a non-open format they could develop a Plugin for, or many other possibilities.
So stay tuned for this!
Precomp v0.4.8 Neo v0.3a + 7zip Plugin
7ZIP Plugin
The whole point of this release is the 7zip plugin, which should make Precomp more easily accessible for more users.
The usual caveats apply, however: this is still beta software, doubly so for the 7zip plugin as it's very new, you use it at your own risk, provided as-is, yadda yadda, I am not responsible for any data loss/corruption or absolutely anything that might happen ever under any circumstance.
With that said, if you choose to create 7z archives using the Precomp plugin, 7zip has a handy "test archive" feature that will ensure the data integrity of the archive, that is, ensure the data can be recovered and has matching checksums to the original.
You should use this feature to double check no silent/undetected data corruption error happened during Precomp's processing.
To install the plugin, paste it into 7zip's folder inside a 'Codecs' subdirectory (ensure you use the appropriate dll for your Windows architecture). You might need to create the 'Codecs' subdirectory if it doesn't already exist.
To use it, you need to add f=precomp into "Parameters"

That will enable Precomp with its default settings.
You can also use f=precomp:x0 which is equivalent, f=precomp:x1 which enables "intense mode" (raw ZLIB) or f=precomp:x2 which enables "brute mode" (raw DEFLATE streams) in addition to "intense mode".
In future versions I might add more options but for now that is all you get in terms of configuring Precomp on the plugin.
With this you can create an archive whose data is all prefiltered by Precomp. Extracting the archive is the same as extracting any other 7z archive, as long as you have the plugin installed.
Caveats, Known Issues
- Didn't add the needed code to send updates to 7zip as the data is being precompressed/recompressed, this makes 7zip report nonsensical progress percentages (I've seen 239% while compressing some files) and similarly nonsensical compression ratios sometimes.
-- This is mostly harmless so you can ignore it, will be solved in future versions
- Precomp requires seekable input when precompressing, but 7zip only provides plugins with sequential access to input data.
-- Because of this, the Precomp plugin will dump the input data first into a file and process it from there, where it is able to seek freely.
-- This is okay most of the time. It might become a problem however if you happen to have little free space on your system's volume or if you want to archive a really big file. I have some ideas to fix/alleviate this, but for now, such is life, you'll have to deal with it.
- It just uses LibPrecomp, so any file that crashes, hangs or results in corrupted data on the regular Precomp command line program, will also cause those issues here.
- 32bit version wasn't tested at all, should you have a problem with it feel free to submit an issue
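The "dump to a file, then seek" workaround for sequential-only input can be sketched as below. This is not the plugin's code: `spool_to_file` and `byte_at` are hypothetical names, and the caller is assumed to pass a path in a writable location such as the system's temporary directory.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Copies everything from a forward-only stream into `path`, then reopens the
// file for reading so the caller can seek freely.
std::ifstream spool_to_file(std::istream& sequential_input,
                            const std::string& path) {
    {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        char buffer[4096];
        // read() fails on the last, partial chunk, so gcount() is checked too.
        while (sequential_input.read(buffer, sizeof buffer) ||
               sequential_input.gcount() > 0) {
            out.write(buffer, sequential_input.gcount());
        }
    }  // the writer is closed here, so all data is flushed before reopening
    return std::ifstream(path, std::ios::binary);
}

// Random access on the spooled copy: read the byte at an arbitrary offset.
char byte_at(std::ifstream& file, std::streamoff offset) {
    file.clear();  // drop any EOF state left by an earlier read
    file.seekg(offset);
    char value = 0;
    file.get(value);
    return value;
}
```

The cost is the one mentioned above: the whole input is written to disk once before processing starts, so free space and very large files become a concern.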
Precomp v0.4.8 Neo v0.3a? Where is it?
v0.3a is only a LibPrecomp update, that adds/modifies absolutely nothing for the Precomp program.
As such I didn't even bother compiling binaries for it; if you want them, just grab v0.2a from this very page, it will be exactly the same.
Upgrade on LibPrecomp v0.3a
The upgrade on LibPrecomp v0.3a is the ability to change the directory where the temporary files are created.
This was necessary for the 7zip plugin, as when running without privilege elevation, it would be impossible to write temporary files into 7zip's folder (or System32, as it's sometimes the working directory when running 7zip from the shell).
This allows the 7zip plugin to place its temporary files in the system's temporary directory.
We could wire this functionality into a command line parameter for Precomp, but as of now, there is no way to use it from the command line program, which is why I am not providing binaries for it.
Precomp v0.4.8 Neo v0.2a
Fixed recompression on Linux and Mac, so now those versions should be as functional as the Windows version.
Linux built with GCC, Mac with Clang and Windows with MSVC
Precomp v0.4.8 Neo v0.1a
First release of Precomp Neo!
Fully expect this to blow up in your face.
In fact, I am not adding Linux or Mac binaries here because they are useless for now and will recompress corrupted data every time.
The Windows binaries here have been tested successfully on about 20 files worth about 14 GB of compressed data.
So much testing and bugfixing is still needed, but if you want a sneak peek, here you go.
To use stdin/stdout you simply use stdin as input filename or stdout as output filename:
precomp -e -intense -brute -ostdout myfile.bin | cat - - > myfile.bin.pcf
7za x -so -txz myfile.bin.pcf.xz | precomp -ostdout -r stdin | cat - - > myfile.bin
For the dlltest program: (Pure C, dynamically linked to the dll)
dlltest p myfile.bin stdout | cat - - > myfile.bin.pcf
7za x -so -txz myfile.bin.pcf.xz | dlltest r stdin stdout | cat - - > myfile.bin