author | Yann Collet <cyan@fb.com> | 2017-06-06 18:20:36 (GMT) |
---|---|---|
committer | Yann Collet <cyan@fb.com> | 2017-06-06 18:20:36 (GMT) |
commit | 7e15e240aba842020a2f6e86f35e71cbacdf237d (patch) | |
tree | 029622918f1f24d328277e05f56bd5ad85f0c6f8 /doc | |
parent | 03d8586fca653cb10379a6a4f310b0724d9ac6ac (diff) | |
added a paragraph on overlap matches
Diffstat (limited to 'doc')
-rw-r--r-- | doc/lz4_Block_format.md | 16 |
1 files changed, 12 insertions, 4 deletions
diff --git a/doc/lz4_Block_format.md b/doc/lz4_Block_format.md
index 0f6a5ba..4e39b41 100644
--- a/doc/lz4_Block_format.md
+++ b/doc/lz4_Block_format.md
@@ -90,10 +90,18 @@ A 255 value means there is another byte to read and add.
 There is no limit to the number of optional bytes that can be output this way.
 (This points towards a maximum achievable compression ratio of about 250).
 
-With the offset and the matchlength,
-the decoder can now proceed to copy the data from the already decoded buffer.
-On decoding the matchlength, we reach the end of the compressed sequence,
-and therefore start another one.
+Decoding the matchlength reaches the end of current sequence.
+Next byte will be the start of another sequence.
+But before moving to next sequence,
+it's time to use the decoded match position and length.
+The decoder copies matchlength bytes from match position to current position.
+
+In some cases, matchlength is larger than offset.
+Therefore, match pos + match length > current pos,
+which means that later bytes to copy are not yet decoded.
+This is called an "overlap match", and must be handled with special care.
+The most common case is an offset of 1,
+meaning the last byte is repeated matchlength times.
 
 
 Parsing restrictions
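
The overlap case described in the added paragraph can be illustrated with a minimal sketch. This is not the lz4 decoder's code: `copy_match` and its parameter names are hypothetical, and the byte-at-a-time loop is only one simple way to handle overlap, assuming the decoded output is held in a growable buffer.

```python
def copy_match(out: bytearray, offset: int, match_length: int) -> None:
    """Append match_length bytes to out, reading from offset bytes back.

    Hypothetical helper for illustration. Copying one byte at a time lets
    the source range run into bytes that this same call has just written,
    which is what an overlap match (match_length > offset) requires.
    """
    if offset <= 0 or offset > len(out):
        raise ValueError("offset must point inside the already decoded data")
    match_pos = len(out) - offset
    for i in range(match_length):
        # out grows on every iteration, so match_pos + i is always valid.
        out.append(out[match_pos + i])


# Non-overlapping match: offset 3, length 3 copies "abc" once.
plain = bytearray(b"abc")
copy_match(plain, offset=3, match_length=3)

# Overlap match with offset 1: the last byte is repeated match_length times.
run = bytearray(b"xa")
copy_match(run, offset=1, match_length=5)

# Overlap match with offset 2: the two-byte pattern "ab" keeps repeating.
pattern = bytearray(b"ab")
copy_match(pattern, offset=2, match_length=5)

print(plain, run, pattern)
```

A single slice copy such as `out += out[match_pos:match_pos + match_length]` would not be equivalent here: when the match overlaps, the slice only contains the bytes that exist before the copy starts, so fewer than `match_length` bytes would be appended.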