From b93f629681ad3245a09add28e4d0b2e43bcde58a Mon Sep 17 00:00:00 2001
From: Yann Collet <yann.collet.73@gmail.com>
Date: Mon, 30 Mar 2015 18:31:55 +0100
Subject: changed file name

---
 lz4_Block_format.md  | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++
 lz4_block_format.txt | 121 ---------------------------------------------------
 2 files changed, 121 insertions(+), 121 deletions(-)
 create mode 100644 lz4_Block_format.md
 delete mode 100644 lz4_block_format.txt

diff --git a/lz4_Block_format.md b/lz4_Block_format.md
new file mode 100644
index 0000000..e248fd9
--- /dev/null
+++ b/lz4_Block_format.md
@@ -0,0 +1,121 @@
+LZ4 Block Format Description
+============================
+Last revised: 2015-03-26;
+Author : Yann Collet
+
+
+
+This small specification intents to provide enough information
+to anyone willing to produce LZ4-compatible compressed data blocks
+using any programming language.
+
+LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
+The most important design principle behind LZ4 is simplicity.
+It helps to create an easy to read and maintain source code.
+It also helps later on for optimizations, compactness, and speed.
+There is no entropy encoder back-end nor framing layer.
+The latter is assumed to be handled by other parts of the system.
+
+This document only describes the block format,
+not how the LZ4 compressor nor decompressor actually work.
+The correctness of the decompressor should not depend
+on implementation details of the compressor, and vice versa.
+
+
+
+Compressed block format
+-----------------------
+An LZ4 compressed block is composed of sequences.
+Schematically, a sequence is a suite of literals, followed by a match copy.
+
+Each sequence starts with a token.
+The token is a one byte value, separated into two 4-bits fields.
+Therefore each field ranges from 0 to 15.
+
+
+The first field uses the 4 high-bits of the token.
+It provides the length of literals to follow.
+(Note : a literal is a not-compressed byte).
+If the field value is 0, then there is no literal.
+If it is 15, then we need to add some more bytes to indicate the full length.
+Each additionnal byte then represent a value from 0 to 255,
+which is added to the previous value to produce a total length.
+When the byte value is 255, another byte is output.
+There can be any number of bytes following the token. There is no "size limit".
+(Sidenote this is why a not-compressible input block is expanded by 0.4%).
+
+Example 1 : A length of 48 will be represented as :
+- 15 : value for the 4-bits High field
+- 33 : (=48-15) remaining length to reach 48
+
+Example 2 : A length of 280 will be represented as :
+- 15  : value for the 4-bits High field
+- 255 : following byte is maxed, since 280-15 >= 255
+- 10  : (=280 - 15 - 255) ) remaining length to reach 280
+
+Example 3 : A length of 15 will be represented as :
+- 15 : value for the 4-bits High field
+- 0  : (=15-15) yes, the zero must be output
+
+Following the token and optional length bytes, are the literals themselves.
+They are exactly as numerous as previously decoded (length of literals).
+It's possible that there are zero literal.
+
+
+Following the literals is the match copy operation.
+
+It starts by the offset.
+This is a 2 bytes value, in little endian format.
+
+The offset represents the position of the match to be copied from.
+1 means "current position - 1 byte".
+The maximum offset value is 65535, 65536 cannot be coded.
+Note that 0 is an invalid value, not used. 
+
+Then we need to extract the match length.
+For this, we use the second token field, the low 4-bits.
+Value, obviously, ranges from 0 to 15.
+However here, 0 means that the copy operation will be minimal.
+The minimum length of a match, called minmatch, is 4. 
+As a consequence, a 0 value means 4 bytes, and a value of 15 means 19+ bytes.
+Similar to literal length, on reaching the highest possible value (15), 
+we output additional bytes, one at a time, with values ranging from 0 to 255.
+They are added to total to provide the final match length.
+A 255 value means there is another byte to read and add.
+There is no limit to the number of optional bytes that can be output this way.
+(This points towards a maximum achievable compression ratio of ~250).
+
+With the offset and the matchlength,
+the decoder can now proceed to copy the data from the already decoded buffer.
+On decoding the matchlength, we reach the end of the compressed sequence,
+and therefore start another one.
+
+
+Parsing restrictions
+-----------------------
+There are specific parsing rules to respect in order to remain compatible
+with assumptions made by the decoder :
+1) The last 5 bytes are always literals
+2) The last match must start at least 12 bytes before end of block
+Consequently, a block with less than 13 bytes cannot be compressed.
+These rules are in place to ensure that the decoder
+will never read beyond the input buffer, nor write beyond the output buffer.
+
+Note that the last sequence is also incomplete,
+and stops right after literals.
+
+
+Additional notes
+-----------------------
+There is no assumption nor limits to the way the compressor
+searches and selects matches within the source data block.
+It could be a fast scan, a multi-probe, a full search using BST,
+standard hash chains or MMC, well whatever.
+
+Advanced parsing strategies can also be implemented, such as lazy match,
+or full optimal parsing.
+
+All these trade-off offer distinctive speed/memory/compression advantages.
+Whatever the method used by the compressor, its result will be decodable
+by any LZ4 decoder if it follows the format specification described above.
+
diff --git a/lz4_block_format.txt b/lz4_block_format.txt
deleted file mode 100644
index e248fd9..0000000
--- a/lz4_block_format.txt
+++ /dev/null
@@ -1,121 +0,0 @@
-LZ4 Block Format Description
-============================
-Last revised: 2015-03-26;
-Author : Yann Collet
-
-
-
-This small specification intents to provide enough information
-to anyone willing to produce LZ4-compatible compressed data blocks
-using any programming language.
-
-LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
-The most important design principle behind LZ4 is simplicity.
-It helps to create an easy to read and maintain source code.
-It also helps later on for optimizations, compactness, and speed.
-There is no entropy encoder back-end nor framing layer.
-The latter is assumed to be handled by other parts of the system.
-
-This document only describes the block format,
-not how the LZ4 compressor nor decompressor actually work.
-The correctness of the decompressor should not depend
-on implementation details of the compressor, and vice versa.
-
-
-
-Compressed block format
------------------------
-An LZ4 compressed block is composed of sequences.
-Schematically, a sequence is a suite of literals, followed by a match copy.
-
-Each sequence starts with a token.
-The token is a one byte value, separated into two 4-bits fields.
-Therefore each field ranges from 0 to 15.
-
-
-The first field uses the 4 high-bits of the token.
-It provides the length of literals to follow.
-(Note : a literal is a not-compressed byte).
-If the field value is 0, then there is no literal.
-If it is 15, then we need to add some more bytes to indicate the full length.
-Each additionnal byte then represent a value from 0 to 255,
-which is added to the previous value to produce a total length.
-When the byte value is 255, another byte is output.
-There can be any number of bytes following the token. There is no "size limit".
-(Sidenote this is why a not-compressible input block is expanded by 0.4%).
-
-Example 1 : A length of 48 will be represented as :
-- 15 : value for the 4-bits High field
-- 33 : (=48-15) remaining length to reach 48
-
-Example 2 : A length of 280 will be represented as :
-- 15  : value for the 4-bits High field
-- 255 : following byte is maxed, since 280-15 >= 255
-- 10  : (=280 - 15 - 255) ) remaining length to reach 280
-
-Example 3 : A length of 15 will be represented as :
-- 15 : value for the 4-bits High field
-- 0  : (=15-15) yes, the zero must be output
-
-Following the token and optional length bytes, are the literals themselves.
-They are exactly as numerous as previously decoded (length of literals).
-It's possible that there are zero literal.
-
-
-Following the literals is the match copy operation.
-
-It starts by the offset.
-This is a 2 bytes value, in little endian format.
-
-The offset represents the position of the match to be copied from.
-1 means "current position - 1 byte".
-The maximum offset value is 65535, 65536 cannot be coded.
-Note that 0 is an invalid value, not used. 
-
-Then we need to extract the match length.
-For this, we use the second token field, the low 4-bits.
-Value, obviously, ranges from 0 to 15.
-However here, 0 means that the copy operation will be minimal.
-The minimum length of a match, called minmatch, is 4. 
-As a consequence, a 0 value means 4 bytes, and a value of 15 means 19+ bytes.
-Similar to literal length, on reaching the highest possible value (15), 
-we output additional bytes, one at a time, with values ranging from 0 to 255.
-They are added to total to provide the final match length.
-A 255 value means there is another byte to read and add.
-There is no limit to the number of optional bytes that can be output this way.
-(This points towards a maximum achievable compression ratio of ~250).
-
-With the offset and the matchlength,
-the decoder can now proceed to copy the data from the already decoded buffer.
-On decoding the matchlength, we reach the end of the compressed sequence,
-and therefore start another one.
-
-
-Parsing restrictions
------------------------
-There are specific parsing rules to respect in order to remain compatible
-with assumptions made by the decoder :
-1) The last 5 bytes are always literals
-2) The last match must start at least 12 bytes before end of block
-Consequently, a block with less than 13 bytes cannot be compressed.
-These rules are in place to ensure that the decoder
-will never read beyond the input buffer, nor write beyond the output buffer.
-
-Note that the last sequence is also incomplete,
-and stops right after literals.
-
-
-Additional notes
------------------------
-There is no assumption nor limits to the way the compressor
-searches and selects matches within the source data block.
-It could be a fast scan, a multi-probe, a full search using BST,
-standard hash chains or MMC, well whatever.
-
-Advanced parsing strategies can also be implemented, such as lazy match,
-or full optimal parsing.
-
-All these trade-off offer distinctive speed/memory/compression advantages.
-Whatever the method used by the compressor, its result will be decodable
-by any LZ4 decoder if it follows the format specification described above.
-
-- 
cgit v0.12