author Yann Collet <cyan@fb.com> 2017-06-06 18:20:36 (GMT)
committer Yann Collet <cyan@fb.com> 2017-06-06 18:20:36 (GMT)
commit 7e15e240aba842020a2f6e86f35e71cbacdf237d
tree 029622918f1f24d328277e05f56bd5ad85f0c6f8
parent 03d8586fca653cb10379a6a4f310b0724d9ac6ac
added a paragraph on overlap matches
-rw-r--r-- doc/lz4_Block_format.md 16
1 files changed, 12 insertions, 4 deletions
diff --git a/doc/lz4_Block_format.md b/doc/lz4_Block_format.md
index 0f6a5ba..4e39b41 100644
--- a/doc/lz4_Block_format.md
+++ b/doc/lz4_Block_format.md
@@ -90,10 +90,18 @@ A 255 value means there is another byte to read and add.
There is no limit to the number of optional bytes that can be output this way.
(This points towards a maximum achievable compression ratio of about 250).
-With the offset and the matchlength,
-the decoder can now proceed to copy the data from the already decoded buffer.
-On decoding the matchlength, we reach the end of the compressed sequence,
-and therefore start another one.
+Decoding the matchlength reaches the end of current sequence.
+Next byte will be the start of another sequence.
+But before moving to next sequence,
+it's time to use the decoded match position and length.
+The decoder copies matchlength bytes from match position to current position.
+
+In some cases, matchlength is larger than offset.
+Therefore, match pos + match length > current pos,
+which means that later bytes to copy are not yet decoded.
+This is called an "overlap match", and must be handled with special care.
+The most common case is an offset of 1,
+meaning the last byte is repeated matchlength times.
Parsing restrictions
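
The overlap-match paragraph added by this patch can be illustrated with a minimal sketch in C. The helper name `copy_match` and its parameters are hypothetical and are not part of the LZ4 API. It assumes the output buffer is large enough and that `offset` is between 1 and `pos`. Copying strictly one byte at a time, front to back, means that bytes written early in the copy are read again later in the same copy, which is the behavior an overlap match relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only (not an LZ4 function).
 * Copies match_length bytes from (pos - offset) to pos inside the
 * already-decoded output buffer `out`.
 * Assumption: `out` has room for pos + match_length bytes. */
static void copy_match(unsigned char *out, size_t pos,
                       size_t offset, size_t match_length)
{
    /* The match must start inside the data decoded so far. */
    assert(offset >= 1 && offset <= pos);

    const unsigned char *src = out + pos - offset;
    unsigned char *dst = out + pos;

    /* Forward byte-by-byte copy: when match_length > offset, src[i]
     * for i >= offset refers to a byte written earlier in this loop.
     * memmove() would not work here, because it behaves as if the
     * source were copied to a temporary buffer first. */
    for (size_t i = 0; i < match_length; i++) {
        dst[i] = src[i];
    }
}
```

For instance, with `out` holding `"ab"`, calling `copy_match(out, 2, 2, 6)` extends the buffer to `"abababab"`. With an offset of 1, the last decoded byte is repeated `match_length` times, as the paragraph describes. A production decoder would typically use wider copies with special handling for small offsets; this sketch favors clarity over speed.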