author | Yann Collet <cyan@fb.com> | 2018-02-24 19:47:53 (GMT) |
---|---|---|
committer | Yann Collet <cyan@fb.com> | 2018-02-24 19:47:53 (GMT) |
commit | 7173a631db61ab9535bd0d6e5e00e9dc081d4df3 (patch) | |
tree | 4701389655ac1f1b10eea0e38aa7fcd6fc37336c /INSTALL | |
parent | 99c26729b59d0389734973a9e3c55f7ef8408efb (diff) | |
edge case : compress up to end-mflimit (12 bytes)
The LZ4 block format specification
states that the last match must start
at a minimum distance of 12 bytes from the end of the block.
However, out of an abundance of caution,
the reference implementation would actually stop searching matches
at 13 bytes from the end of the block.
This patch fixes this small detail.
The new version is now able to properly compress a limit case
such as `aaaaaaaabaaa\n`
as reported by Gao Xiang (@hsiangkao).
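The difference between the two limits can be sketched in a few lines. This is a minimal model of the rule described above, not the reference implementation: `MFLIMIT = 12` reflects the block format rule quoted in this message, while `OLD_LIMIT` and the helper `last_match_start` are hypothetical names introduced only for illustration.

```python
MFLIMIT = 12    # block format: the last match must start at least 12 bytes before the end
OLD_LIMIT = 13  # hypothetical name: where the search used to stop before this patch

def last_match_start(block_size, limit):
    """Return the highest position where a match may start, or None if the block is too short."""
    pos = block_size - limit
    return pos if pos >= 0 else None

block = b"aaaaaaaabaaa\n"  # the 13-byte limit case from the report

old_last = last_match_start(len(block), OLD_LIMIT)  # position 0: nothing precedes it, so no match is possible
new_last = last_match_start(len(block), MFLIMIT)    # position 1: can reference the 'a' at position 0
```

Under these assumptions, the old limit leaves position 0 as the only candidate for this input, and a match cannot start there because no earlier data exists to reference. Moving the limit by one byte makes position 1 eligible.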
Obviously, it doesn't change a lot of things.
This is just one additional match candidate per block, with a maximum match length of 7 (since the last 5 bytes must remain literals).
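The maximum length of 7 follows from the two constraints: a match starting 12 bytes before the end, with the last 5 bytes reserved as literals, can cover at most 12 - 5 = 7 bytes. The sketch below checks this on the reported input. It is an illustrative model only: the constants mirror the numbers in this message, and `match_length` is a hypothetical helper, not a function of the library.

```python
MFLIMIT = 12       # last match must start at least 12 bytes before the end of the block
LASTLITERALS = 5   # the last 5 bytes of the block must remain literals

def match_length(data, pos, offset):
    """Count bytes at pos that repeat the data found offset bytes earlier,
    stopping before the final LASTLITERALS bytes of the block."""
    end = len(data) - LASTLITERALS
    n = 0
    while pos + n < end and data[pos + n] == data[pos + n - offset]:
        n += 1
    return n

block = b"aaaaaaaabaaa\n"
start = len(block) - MFLIMIT                    # last position where a match may start
length = match_length(block, start, offset=1)   # run of 'a' copied from one byte back
max_len = MFLIMIT - LASTLITERALS                # upper bound for a match starting at that position
```

For this input the match at the last eligible position reaches the upper bound, which is why the example is a useful limit case.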
With the default policy, blocks are 4 MB long, so it doesn't happen too often.
Compressing silesia.tar at default level 1 saves 5 bytes (100930101 -> 100930096).
At max level 12, it saves a grand 16 bytes (77389871 -> 77389855).
The impact is a bit more visible when blocks are smaller, hence more numerous.
For example, compressing silesia with blocks of 64 KB (using -12 -B4D) saves 543 bytes (77304583 -> 77304040).
So the smaller the packet size, the more visible the impact.
And it so happens that we have a ton of scenarios using LZ4 compression with small blocks ...
And a useless "hooray" sidenote :
the patch improves the LZ4 compression record of silesia (using -12 -B7D --no-frame-crc) by 16 bytes (77270672 -> 77270656)
and the record on enwik9 by 44 bytes (371680396 -> 371680352) (previously claimed by [smallz4](http://create.stephan-brumme.com/smallz4/) ).