Out of boredom, I wrote a small VBScript script that wrote a file with the words to ‘10 Green Bottles’, except that you could set the initial number. Using 7500 as the initial number, I ended up with a 1.24 MB text file, with the first 7500 lines of the song…
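The original VBScript isn't reproduced here, but the idea can be sketched roughly like this in Python. This is a guess at the approach, not the script I actually ran: the function names (`bottles`, `verse`, `song`, `write_song`), the exact lyric wording and the blank line between verses are all assumptions made for illustration, so the output size won't necessarily match the 1.24 MB above.

```python
def bottles(count: int) -> str:
    """Return the phrase for a given number of bottles (assumed wording)."""
    if count == 0:
        return "no green bottles"
    if count == 1:
        return "1 green bottle"
    return f"{count} green bottles"


def verse(count: int) -> str:
    """Build one four-line verse that starts with `count` bottles."""
    hanging = f"{bottles(count)} hanging on the wall,"
    return "\n".join([
        hanging,
        hanging,
        "And if one green bottle should accidentally fall,",
        f"There'll be {bottles(count - 1)} hanging on the wall.",
    ])


def song(start: int) -> str:
    """Build the whole song, counting down from `start` to 1."""
    return "\n\n".join(verse(n) for n in range(start, 0, -1))


def write_song(path: str, start: int) -> None:
    """Write the song to a text file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(song(start) + "\n")
```

Calling something like `write_song("bottles.txt", 7500)` would then produce a large, very repetitive text file of the kind used in the test below.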
Anyway, as an experiment, I applied various compression methods to the file to see which one could make it the smallest. Here are my results:
(added RAR, GnomeZip and BZip2)
Comparison of zip formats

| Program | Format | Size | % of original |
|---|---|---|---|
| Original file | .TXT | 1272 KB | 100% |
| 7-Zip 2.30 beta 24 | .ZIP | 57.8 KB | 4.54% |
| 7-Zip 2.30 beta 24 | .GZ | 57.8 KB | 4.54% |
| WinRAR 3.10 beta 2 | .RAR | 33.8 KB | 2.66% |
| 7-Zip 2.30 beta 24 | .bz2 | 26.0 KB | 2.04% |
| 7-Zip 2.30 beta 24 | .7Z | 10.1 KB | 0.79% |
No, I couldn’t believe it either. I’ve always had good results from 7-Zip, particularly when using the 7z format. What annoys me is that 7-Zip is open source, released under the LGPL, so anyone could use its code in their own programs. So why does nobody bother?
The performance of BZip2 also impressed me, though I was expecting more from the RAR format; even when optimised for text files, it still only equalled the performance of ICE. This is especially poor as WinRAR is shareware and is one of only a handful of programs able to create RAR files. In all cases the compression was set to maximum.
The performance of the ICE format, which is apparently optimised for text files, is promising, but is still beaten by 7z.
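If you want to try a similar comparison without installing any of the tools above, something like the following Python sketch might do. It is not how the figures in the table were produced: it uses Python's standard-library codecs as stand-ins (zlib's DEFLATE is the method normally used inside ZIP and GZ files, `bz2` implements bzip2, and `lzma` is the algorithm family behind 7z), it has no equivalent for RAR or ICE, and the sample text is a simplified stand-in for the real file. The names `sample_text` and `compare` are made up for this example. Each percentage is simply compressed size divided by original size, times 100, as in the table.

```python
import bz2
import lzma
import zlib


def sample_text(start: int = 7500) -> bytes:
    """Generate repetitive stand-in data (not the original test file)."""
    lines = (f"{n} green bottles hanging on the wall\n" for n in range(start, 0, -1))
    return "".join(lines).encode("ascii")


def compare(data: bytes) -> dict[str, tuple[int, float]]:
    """Compress `data` with each codec at its highest level.

    Returns {codec name: (compressed size in bytes, percent of original)}.
    """
    codecs = {
        "deflate": lambda raw: zlib.compress(raw, 9),
        "bzip2": lambda raw: bz2.compress(raw, 9),
        "lzma": lambda raw: lzma.compress(raw, preset=9),
    }
    results = {}
    for name, compress in codecs.items():
        size = len(compress(data))
        results[name] = (size, 100 * size / len(data))
    return results


if __name__ == "__main__":
    data = sample_text()
    print(f"original: {len(data)} bytes")
    for name, (size, percent) in compare(data).items():
        print(f"{name}: {size} bytes ({percent:.2f}%)")
```

The exact numbers will differ from the table, since the archivers add their own container overhead and use their own tuning, but it is a cheap way to see how the underlying algorithms cope with very repetitive text.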