author | Yann Collet <cyan@fb.com> | 2019-04-11 20:55:35 (GMT) |
---|---|---|
committer | Yann Collet <cyan@fb.com> | 2019-04-11 20:55:35 (GMT) |
commit | 013fee5665cbf03113c1c2e78d5b50fa9663b306 (patch) | |
tree | 89853e5b6303d9c87c70398d9349d69ebf3a10be /doc/lz4_Block_format.md | |
parent | 5580f4333aad9fb9b2cef5c8da0e90feb030db63 (diff) | |
updated block format with more details on MF_LIMIT
aka distance from end of block for the last match.
Diffstat (limited to 'doc/lz4_Block_format.md')
-rw-r--r-- | doc/lz4_Block_format.md | 35 |
1 files changed, 23 insertions, 12 deletions
diff --git a/doc/lz4_Block_format.md b/doc/lz4_Block_format.md
index 2fb4c19..4344e9b 100644
--- a/doc/lz4_Block_format.md
+++ b/doc/lz4_Block_format.md
@@ -1,6 +1,6 @@
 LZ4 Block Format Description
 ============================
-Last revised: 2018-12-30.
+Last revised: 2019-03-30.
 Author : Yann Collet
 
 
@@ -109,18 +109,29 @@ End of block restrictions
 -----------------------
 There are specific rules required to terminate a block.
 
-1. The last sequence only contains literals. The block ends right after them.
-1. The last 5 bytes of input are always literals.
-   Therefore, the last sequence contains at least 5 bytes,
-   or all input bytes if input is smaller than 5 bytes
-   (empty input can be represented with a zero byte,
-   interpreted as a token without literal and without a match).
-2. The last match must start at least 12 bytes before the end of block.
+1. The last sequence contains only literals.
+   The block ends right after them.
+2. The last 5 bytes of input are always literals.
+   Therefore, the last sequence contains at least 5 bytes.
+   - Special : if input is smaller than 5 bytes,
+     there is only one sequence, it contains the whole input as literals.
+     Empty input can be represented with a zero byte,
+     interpreted as a final token without literal and without a match.
+3. The last match must start at least 12 bytes before the end of block.
    The last match is part of the penultimate sequence.
-   It is followed by the last sequence, which only contains literals.
-   Note that, as a consequence, blocks < 13 bytes cannot be compressed.
-
-These rules are in place to ensure that a compatible decoder
+   It is followed by the last sequence, which contains only literals.
+   - Note that, as a consequence,
+     an independent block < 13 bytes cannot be compressed,
+     because the match must copy "something",
+     so it needs at least one prior byte.
+   - When a block can reference data from another block,
+     it can start immediately with a match and no literal,
+     so a block of 12 bytes can be compressed.
+
+When a block does not respect these end conditions,
+a conformant decoder is allowed to reject the block as incorrect.
+
+These rules are in place to ensure that a conformant decoder
 can be designed for speed, issuing speculatively instructions,
 while never reading nor writing beyond provided I/O buffers.
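The three end-of-block rules in the patched text can be checked mechanically. The following is a minimal sketch, not code from the lz4 repository: the function names `parse_sequences` and `end_of_block_ok` are hypothetical, and it assumes the 5-byte and 12-byte distances are measured in decompressed bytes. It walks the sequences of a compressed block (token, optional length bytes, literals, 2-byte offset, optional match-length bytes) and tests only the end conditions. Match offsets are not validated.

```python
def _read_extra_length(block: bytes, pos: int, length: int):
    """Add continuation bytes to a length whose 4-bit field was 15."""
    while True:
        if pos >= len(block):
            raise ValueError("truncated length field")
        extra = block[pos]
        pos += 1
        length += extra
        if extra != 255:  # a byte below 255 ends the length encoding
            return length, pos


def parse_sequences(block: bytes):
    """Return (literal_length, match_length) pairs for an LZ4 block.

    match_length is None for a sequence that ends the block right
    after its literals (the literals-only last sequence).
    """
    seqs = []
    pos = 0
    n = len(block)
    while pos < n:
        token = block[pos]
        pos += 1
        lit_len = token >> 4
        if lit_len == 15:
            lit_len, pos = _read_extra_length(block, pos, lit_len)
        pos += lit_len  # skip the literal bytes themselves
        if pos > n:
            raise ValueError("truncated literals")
        if pos == n:
            seqs.append((lit_len, None))
            break
        if pos + 2 > n:
            raise ValueError("truncated offset")
        pos += 2  # skip the 2-byte little-endian offset
        match_len = (token & 0x0F) + 4  # minimum match length is 4
        if (token & 0x0F) == 15:
            match_len, pos = _read_extra_length(block, pos, match_len)
        seqs.append((lit_len, match_len))
    return seqs


def end_of_block_ok(block: bytes) -> bool:
    """Check the three end-of-block rules (hypothetical helper)."""
    seqs = parse_sequences(block)
    # Rule 1: the last sequence contains only literals.
    if not seqs or seqs[-1][1] is not None:
        return False
    # Special case: a single sequence holds the whole input as literals.
    if len(seqs) == 1:
        return True
    last_literals = seqs[-1][0]
    # Rule 2: the last 5 bytes of input are literals.
    if last_literals < 5:
        return False
    # Rule 3: the last match starts at least 12 bytes before the end,
    # i.e. its length plus the trailing literals covers 12 bytes or more.
    return seqs[-2][1] + last_literals >= 12


# 13 bytes of 'a': 1 literal, a 7-byte match at offset 1, 5 literals.
ok_block = bytes([0x13]) + b"a" + b"\x01\x00" + bytes([0x50]) + b"aaaaa"
print(end_of_block_ok(ok_block))
```

Because offsets are not checked, the same function also accepts the 12-byte case described for dependent blocks: a block that opens with a match and no literal, followed by 5 literals.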