Obviously, the purpose of this tool isn't to preserve 100% compatibility. Things like removing empty directories make that clear.
But, why would you remove comments? Presumably, if those are there, they were added for a specific reason. And the author acknowledges the space savings are minimal.
xz is pretty universal across POSIX and clones though. It comes with any modern Linux distro, and Busybox even has an .xz decompressor, so `tar xvJf file.tar.xz` does the right thing in *NIX land, which I presume includes macOS with Homebrew.
For Windows systems, 7-zip (.7z, similar compression to .xz) is a free download for Windows 10, and Windows 11 can open up a .7z file with a simple double click.
.zip and .gz no longer need to be used here in 2026.
.zip is used as a seekable container with some compression. There is no replacement comparable in simplicity. 7z is overcomplicated, compressed tar is not seekable.
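To make the "seekable container" point concrete, here is a minimal sketch using Python's standard `zipfile` module. The helper names and the demo archive contents are made up for illustration; the point is only that one member can be read via the central directory without decompressing the members stored before it.

```python
import io
import zipfile


def build_demo_archive() -> bytes:
    """Build a small in-memory ZIP with a large member followed by a tiny one."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("big.bin", b"\0" * 1_000_000)
        zf.writestr("notes/readme.txt", "hello from the middle of the archive")
    return buf.getvalue()


def read_one_member(archive_bytes: bytes, name: str) -> bytes:
    """Read a single member by name.

    zipfile looks the name up in the central directory, seeks to that
    member's local header, and decompresses only that member.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name)


archive = build_demo_archive()
print(read_one_member(archive, "notes/readme.txt").decode())
```

With a compressed tar, getting at the same file would mean decompressing the stream from the start until that entry is reached.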
.gz/deflate is used when something very cheap and very fast is needed. xz/lzma is quite often too slow or requires too much memory even on decompression.
Compared to xz and even parallel xz, gzip and parallel gzip are simply the better choice when speed matters most. The compression ratio is worse, but it is already good relative to the uncompressed data. For long-term storage it makes sense to invest the extra time for better compression, but if it's about transfer time, the slower compressor can leave you with a longer overall processing time, whereas gzip only costs you a somewhat longer transfer because of its worse compression ratio.
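If you want to check this trade-off on your own data rather than take anyone's word for it, a rough sketch with Python's standard `gzip` and `lzma` modules looks like this. The `compare` helper and the sample payload are made up for illustration, and both compressors run at their default settings, so treat the numbers as a ballpark only.

```python
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Time gzip and xz on the same payload and report sizes and durations."""
    results = {}
    for label, compress, decompress in (
        ("gzip", gzip.compress, gzip.decompress),
        ("xz", lzma.compress, lzma.decompress),  # lzma defaults to the .xz format
    ):
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:
            raise ValueError(f"{label} round trip failed")
        results[label] = {
            "size": len(packed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


sample = b"timestamp=1700000000 level=INFO msg=request served\n" * 50_000
for label, stats in compare(sample).items():
    print(label, stats)
```

Whether the smaller xz output pays for its extra CPU time depends on the link speed, so plug in your own payload and bandwidth before deciding.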
It's like with image formats: Pick the right one for your use case.
Do any formats using ZIP as the underlying format use ZIP comments for metadata? Unless there are a lot of compressors leaving "Zip file generated by MySuperZipper™" behind, I imagine any comments were probably left for a good reason.
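For anyone who wants to see what comments an archive actually carries before a tool strips them, here is a small sketch with Python's standard `zipfile` module. ZIP has both an archive-level comment and per-entry comments; the `collect_comments` helper and the demo contents are made up for illustration.

```python
import io
import zipfile


def collect_comments(zf: zipfile.ZipFile) -> dict:
    """Return every non-empty comment, keyed by entry name.

    The archive-level comment is stored under the key "<archive>".
    """
    found = {}
    if zf.comment:
        found["<archive>"] = zf.comment
    for info in zf.infolist():
        if info.comment:
            found[info.filename] = info.comment
    return found


def build_demo_archive() -> bytes:
    """Build an in-memory ZIP with one archive comment and one entry comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        entry = zipfile.ZipInfo("a.txt")
        entry.comment = b"per-entry note"
        zf.writestr(entry, "data")
        zf.writestr("b.txt", "more data")
        zf.comment = b"archive-level note"
    return buf.getvalue()


with zipfile.ZipFile(io.BytesIO(build_demo_archive())) as zf:
    print(collect_comments(zf))
```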
I'm not aware of any, but it wouldn't be insane to build a seekable deflate implementation by defining offsets in a zip comment. This would leave the zip file backwards compatible to usual decompression while allowing internal seeking within an individual file if the decompressor was aware of this index.
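A rough sketch of the deflate half of that idea, using Python's `zlib`. I'm not aware of any tool that uses this layout: the function names, the chunk size, and the JSON encoding of the index are all assumptions for illustration. The stream is raw deflate with a full flush between chunks, so an ordinary inflater still reads it front to back, while an index-aware reader can start at any recorded offset. Embedding the stream as a real ZIP member is left out; the sketch only shows that the serialized index is small enough to fit in the 65,535-byte archive comment.

```python
import bisect
import json
import zlib


def compress_with_index(data: bytes, chunk: int = 64 * 1024):
    """Raw-deflate `data`, recording (uncompressed, compressed) offsets per chunk.

    Z_FULL_FLUSH byte-aligns the output and resets the dictionary, so
    decompression can restart cleanly at each recorded compressed offset.
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # -15 selects raw deflate
    out = bytearray()
    index = []
    for start in range(0, len(data), chunk):
        index.append((start, len(out)))
        out += comp.compress(data[start:start + chunk])
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush()  # finish the stream
    return bytes(out), index


def read_at(compressed: bytes, index, offset: int, size: int) -> bytes:
    """Return `size` bytes starting at uncompressed `offset`, using the index."""
    if size <= 0:
        return b""
    starts = [u for u, _ in index]
    u_off, c_off = index[bisect.bisect_right(starts, offset) - 1]
    skip = offset - u_off
    inflater = zlib.decompressobj(-15)
    raw = inflater.decompress(compressed[c_off:], skip + size)
    return raw[skip:skip + size]


data = bytes(range(256)) * 2000
stream, idx = compress_with_index(data)
comment = json.dumps(idx).encode()  # hypothetical index encoding for the comment
print(len(comment) <= 65535, read_at(stream, idx, 300_000, 8) == data[300_000:300_008])
```

The full flushes cost some compression ratio, so the chunk size trades seek granularity against size.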
"It deletes empty folders" and "Let me know if this is a problem for you"
NEVER DO THAT. I know you meant well, but the first rule of any program is to NEVER automatically delete something without informing the user. NEVER. Users keep empty folders for structure, reminders, or placeholders because software will dump files into it later when it's run. If it was there when they zipped it up, it should be there when they unzip it. Otherwise they'll check the before and after and it will show some folders missing, create confusion, and the user will run off trying to find out if anything else is missing.
Example: A user zips up a program. Some programs are coded to look for a folder and dump files into it, if the folder is missing the program will fail. I've had that occasionally over the years. Not all programs will recreate a missing folder.
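If you want to know up front whether an archive contains the kind of placeholder folders described above, a quick check with Python's standard `zipfile` module could look like this. The `empty_directories` helper and the demo names are made up for illustration. It treats a directory entry as empty when no other entry lives underneath it.

```python
import io
import zipfile


def empty_directories(zf: zipfile.ZipFile) -> list:
    """List directory entries that have no other entry stored beneath them."""
    names = zf.namelist()
    return [
        name
        for name in names
        if name.endswith("/")
        and not any(other != name and other.startswith(name) for other in names)
    ]


def build_demo_archive() -> bytes:
    """Build an in-memory ZIP with one empty and one populated directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("logs/", "")  # placeholder the program expects at runtime
        zf.writestr("src/", "")
        zf.writestr("src/main.c", "int main(void) { return 0; }\n")
    return buf.getvalue()


with zipfile.ZipFile(io.BytesIO(build_demo_archive())) as zf:
    print(empty_directories(zf))
```

Running the same check on the archive before and after optimizing would show exactly which folders went missing.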
One thing I dislike about git is that it really does not support empty folders well, even though they make sense a lot of the time, either now or for the future. There are decent reasons to have empty folders.
Just kidding, I don't see how the overhead of the directory entry is even remotely enough to warrant removal. Most of the magic can be left to efficient DEFLATE compatible blocks and removing entries not in the central directory in the first place (ZIP files can support concatenation of new data so long as you re-write the central directory at the end of the file).
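The append behaviour mentioned in the parenthesis can be seen with Python's standard `zipfile` module, whose `"a"` mode adds new member data and then writes a fresh central directory at the end. This is only a sketch; the `append_member` helper and file names are made up for illustration.

```python
import io
import zipfile


def append_member(archive: bytes, name: str, data: str) -> bytes:
    """Append one member to an existing archive and return the new bytes.

    In append mode zipfile keeps the existing member data in place, writes
    the new member after it, and rewrites the central directory on close.
    """
    buf = io.BytesIO(archive)
    with zipfile.ZipFile(buf, "a", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


def build_demo_archive() -> bytes:
    """Build a one-member in-memory ZIP to append to."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("first.txt", "one")
    return buf.getvalue()


updated = append_member(build_demo_archive(), "second.txt", "two")
with zipfile.ZipFile(io.BytesIO(updated)) as zf:
    print(zf.namelist())
```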
[1] https://github.com/fhanau/Efficient-Compression-Tool
Is there any point for (new) .bz2 archives in the era of Zstd?
It took years for bzip2 to be in every Linux distro, and we're _still_ doing gzip.
LZMA / xz tools are starting to get more support, but they are nowhere near universal.
No idea how long zstd will need.
For Windows systems, 7-zip (.7z, similar compression to .xz) is a free download for Windows 10, and Windows 11 can open up a .7z file with a simple double click.
.zip and .gz no longer need to be used here in 2026.
.gz/deflate is used when something very cheap and very fast is needed. xz/lzma is quite often too slow or requires too much memory even on decompression.
so no, .zip and .gz are very much needed in 2026.
you need Python 3.14 to get zstd in the standard library.
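In practice that means code that wants zstd from the standard library has to guard the import. A minimal sketch, assuming the `compression.zstd` module that ships with Python 3.14; the `zstd_roundtrip` helper is made up for illustration and simply returns `None` on older interpreters.

```python
try:
    from compression import zstd  # standard library module in Python 3.14+
except ImportError:
    zstd = None  # older Python: no zstd in the standard library


def zstd_roundtrip(data: bytes):
    """Compress and decompress with stdlib zstd, or return None if unavailable."""
    if zstd is None:
        return None
    return zstd.decompress(zstd.compress(data))


result = zstd_roundtrip(b"example payload " * 100)
print("zstd not available" if result is None else f"round trip ok, {len(result)} bytes")
```

gzip and lzma need no such guard, which is part of why they are still the safe default.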
yeah, this will inevitably break things. excluding those from the directory stripping shouldn't be too hard (TM)
[1] http://web.archive.org/web/20031018072659/http://msdn.micros...