Talk:WikiBrowse

Revision as of 04:23, 26 May 2008 by Wade (talk | contribs) (Compression)

Compression

There are better choices than bzip2. The obvious one is 7zip, which is excellent for both decompression speed and compression ability according to this comparison chart. Note that higher compression levels mean faster decompression. AlbertCahalan 00:07, 26 May 2008 (EDT)


We tried 7zip. It didn't do any better than bzip2 on our archive, which contains the latest revision of every article, but I was shocked to see that it produces archives 20x smaller for the dump containing every revision of every article.

We can't easily swap compression formats, since the code depends on an indexer that builds an index mapping articles to compression blocks and can then decode an individual compression block quickly. That indexer would need to be ported for each format. Cjb 01:59, 26 May 2008 (EDT)
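
For readers following along, here is a minimal sketch of the kind of block index described above. It is not the WikiBrowse indexer: the names `build_archive` and `read_article`, the index layout, and the choice of grouping a fixed number of articles per block are all assumptions made for illustration. It relies only on the fact that each block is compressed as an independent bzip2 stream, so one block can be decoded without touching the rest of the archive.

```python
import bz2


def build_archive(articles, block_size=4):
    """Pack articles into independently compressed bz2 blocks plus an index.

    `articles` is a dict of title -> text. `block_size` (articles per
    block) is a hypothetical tuning knob, not a WikiBrowse setting.
    """
    archive = bytearray()
    # title -> (block offset in archive, compressed block length,
    #           article start inside the decompressed block, article length)
    index = {}
    items = list(articles.items())
    for i in range(0, len(items), block_size):
        raw = bytearray()
        spans = []
        for title, text in items[i:i + block_size]:
            data = text.encode("utf-8")
            spans.append((title, len(raw), len(data)))
            raw += data
        # Each block is a complete bz2 stream of its own.
        block = bz2.compress(bytes(raw))
        for title, start, length in spans:
            index[title] = (len(archive), len(block), start, length)
        archive += block
    return bytes(archive), index


def read_article(archive, index, title):
    """Decompress only the block that holds `title` and slice the article out."""
    block_offset, block_length, start, length = index[title]
    raw = bz2.decompress(archive[block_offset:block_offset + block_length])
    return raw[start:start + length].decode("utf-8")
```

Calling `read_article(archive, index, "Some title")` touches a single block, which is why lookups stay fast. Moving to another format would mean replacing the two `bz2` calls with that format's equivalent of an independently decodable block, which is the per-format porting work mentioned above.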


I took the .xo file, grabbed the compressed file out of it, and ran it through lzma -9. The .bz2 file was 83,749k and the .lzma was 70,426k. lzma -9 may be a bit hefty for the XO, since it takes a fair bit of memory on decompression. -7 would still be a gain. 121.72.128.89 03:41, 26 May 2008 (EDT) (which is lerc who really should make an account).
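
A comparison like this can be repeated on any sample of the data with a short script. The sketch below uses Python's standard `bz2` and `lzma` modules rather than the command-line tools used above, and the helper name `compare_sizes` is made up for illustration. It reports sizes only; decompression memory, which grows with the dictionary size selected by the preset, would have to be measured separately on the XO.

```python
import bz2
import lzma


def compare_sizes(data, lzma_presets=(7, 9)):
    """Return compressed sizes in bytes for bz2 and for each LZMA preset."""
    sizes = {"bz2 -9": len(bz2.compress(data, compresslevel=9))}
    for preset in lzma_presets:
        # FORMAT_ALONE is the legacy .lzma container.
        packed = lzma.compress(data, format=lzma.FORMAT_ALONE, preset=preset)
        sizes[f"lzma -{preset}"] = len(packed)
    return sizes
```

For example, `compare_sizes(open(path, "rb").read())` on an uncompressed article dump (with `path` pointing at whatever file is being tested) gives one size per setting, which shows how much of the -9 gain survives at -7.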

Thanks for the hard numbers - it would be great to be able to include more articles, but porting the block decompression code is a large undertaking. Basically we need the ability to seek to anywhere in an archive and decompress a few bytes. Maybe this is something LZMA already has, or maybe we can talk to the Wikipedia on iPhone guy about attempting it? wade 04:23, 26 May 2008 (EDT)