Updated documentation

author: Yann Collet <yann.collet.73@gmail.com> 2015-03-30 17:32:21 (GMT)
committer: Yann Collet <yann.collet.73@gmail.com> 2015-03-30 17:32:21 (GMT)
commit: 44793b8be9f18bb51f524b3a210de11bb0df6654 (patch)
tree: 5452e812fe46d117ceba25514e891d35287b1a4e
parent: b93f629681ad3245a09add28e4d0b2e43bcde58a (diff)
download: lz4-44793b8be9f18bb51f524b3a210de11bb0df6654.zip
lz4-44793b8be9f18bb51f524b3a210de11bb0df6654.tar.gz
lz4-44793b8be9f18bb51f524b3a210de11bb0df6654.tar.bz2
2 files changed, 27 insertions, 44 deletions
diff --git a/README.md b/README.md
index f960e7d..275085e 100644
--- a/README.md
+++ b/README.md
@@ -20,41 +20,21 @@ A high compression derivative, called LZ4_HC, is also provided. It trades CPU ti
 Benchmarks
 -------------------------
 
-The benchmark uses the [Open-Source Benchmark program by m^2 (v0.14.2)](http://encode.ru/threads/1371-Filesystem-benchmark?p=33548&viewfull=1#post33548) compiled with GCC v4.6.1 on Linux Ubuntu 64-bits v11.10,
-The reference system uses a Core i5-3340M @2.7GHz.
+The benchmark uses the [Open-Source Benchmark program by m^2 (v0.14.3)](http://encode.ru/threads/1371-Filesystem-benchmark?p=33548&viewfull=1#post33548) compiled with GCC v4.8.2 on Linux Mint 64-bits v17.
+The reference system uses a Core i5-4300U @1.9GHz.
 Benchmark evaluates the compression of reference [Silesia Corpus](http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia) in single-thread mode.
 
-<table>
-  <tr>
-    <th>Compressor</th><th>Ratio</th><th>Compression</th><th>Decompression</th>
-  </tr>
-  <tr>
-    <th>LZ4 (r101)</th><th>2.084</th><th>422 MB/s</th><th>1820 MB/s</th>
-  </tr>
-  <tr>
-    <th>LZO 2.06</th><th>2.106</th><th>414 MB/s</th><th>600 MB/s</th>
-  </tr>
-  <tr>
-    <th>QuickLZ 1.5.1b6</th><th>2.237</th><th>373 MB/s</th><th>420 MB/s</th>
-  </tr>
-  <tr>
-    <th>Snappy 1.1.0</th><th>2.091</th><th>323 MB/s</th><th>1070 MB/s</th>
-  </tr>
-  <tr>
-    <th>LZF</th><th>2.077</th><th>270 MB/s</th><th>570 MB/s</th>
-  </tr>
-  <tr>
-    <th>zlib 1.2.8 -1</th><th>2.730</th><th>65 MB/s</th><th>280 MB/s</th>
-  </tr>
-  <tr>
-    <th>LZ4 HC (r101)</th><th>2.720</th><th>25 MB/s</th><th>2080 MB/s</th>
-  </tr>
-  <tr>
-    <th>zlib 1.2.8 -6</th><th>3.099</th><th>21 MB/s</th><th>300 MB/s</th>
-  </tr>
-</table>
-
-The LZ4 block compression format is detailed within [lz4_block_format.txt](lz4_block_format.txt).
+|  Compressor       | Ratio   | Compression | Decompression |
+|  ----------       | -----   | ----------- | ------------- |
+|**LZ4 (r129)**     |  2.101  |**385 MB/s** |**1850 MB/s**  |
+|  LZO 2.06         |  2.108  |  350 MB/s   |   510 MB/s    |
+|  QuickLZ 1.5.1.b6 |  2.238  |  320 MB/s   |   380 MB/s    |
+|  Snappy 1.1.0     |  2.091  |  250 MB/s   |   960 MB/s    |
+|  zlib 1.2.8 -1    |  2.730  |   59 MB/s   |   250 MB/s    |
+|**LZ4 HC (r129)**  |**2.720**|   22 MB/s   |**1830 MB/s**  |
+|  zlib 1.2.8 -6    |  3.099  |   18 MB/s   |   270 MB/s    |
+
+The LZ4 block compression format is detailed within [lz4_Block_format](lz4_Block_format.md).
 
 For streaming unknown amount of data and compress files of any size, a frame format has been published, and can be consulted within the file LZ4_Frame_Format.html .
 
diff --git a/lz4_Block_format.md b/lz4_Block_format.md
index e248fd9..b933a6a 100644
--- a/lz4_Block_format.md
+++ b/lz4_Block_format.md
@@ -1,10 +1,9 @@
 LZ4 Block Format Description
 ============================
-Last revised: 2015-03-26;
+Last revised: 2015-03-26.
 Author : Yann Collet
 
 
-
 This small specification intents to provide enough information
 to anyone willing to produce LZ4-compatible compressed data blocks
 using any programming language.
@@ -26,7 +25,8 @@ on implementation details of the compressor, and vice versa.
 Compressed block format
 -----------------------
 An LZ4 compressed block is composed of sequences.
-Schematically, a sequence is a suite of literals, followed by a match copy.
+A sequence is a suite of literals (not-compressed bytes),
+followed by a match copy.
 
 Each sequence starts with a token.
 The token is a one byte value, separated into two 4-bits fields.
@@ -35,14 +35,14 @@ Therefore each field ranges from 0 to 15.
 
 The first field uses the 4 high-bits of the token.
 It provides the length of literals to follow.
-(Note : a literal is a not-compressed byte).
+
 If the field value is 0, then there is no literal.
 If it is 15, then we need to add some more bytes to indicate the full length.
-Each additionnal byte then represent a value from 0 to 255,
+Each additional byte then represent a value from 0 to 255,
 which is added to the previous value to produce a total length.
 When the byte value is 255, another byte is output.
 There can be any number of bytes following the token. There is no "size limit".
-(Sidenote this is why a not-compressible input block is expanded by 0.4%).
+(Side note : this is why a not-compressible input block is expanded by 0.4%).
 
 Example 1 : A length of 48 will be represented as :
 - 15 : value for the 4-bits High field
@@ -65,7 +65,8 @@ It's possible that there are zero literal.
 Following the literals is the match copy operation.
 
 It starts by the offset.
-This is a 2 bytes value, in little endian format.
+This is a 2 bytes value, in little endian format
+(the 1st byte is the "low" byte, the 2nd one is the "high" byte).
 
 The offset represents the position of the match to be copied from.
 1 means "current position - 1 byte".
@@ -95,9 +96,12 @@ Parsing restrictions
 -----------------------
 There are specific parsing rules to respect in order to remain compatible
 with assumptions made by the decoder :
-1) The last 5 bytes are always literals
-2) The last match must start at least 12 bytes before end of block
-Consequently, a block with less than 13 bytes cannot be compressed.
+
+1. The last 5 bytes are always literals
+2. The last match must start at least 12 bytes before end of block.
+   
+   Consequently, a block with less than 13 bytes cannot be compressed.
+
 These rules are in place to ensure that the decoder
 will never read beyond the input buffer, nor write beyond the output buffer.
 
@@ -118,4 +122,3 @@ or full optimal parsing.
 All these trade-off offer distinctive speed/memory/compression advantages.
 Whatever the method used by the compressor, its result will be decodable
 by any LZ4 decoder if it follows the format specification described above.
-
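
The length and offset rules described in the lz4_Block_format.md hunks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reference implementation: the function names `encode_literal_length`, `decode_literal_length` and `decode_offset` are hypothetical and are not part of the LZ4 API. The sketch covers only the 4 high bits of the token, the additional length bytes, and the 2-byte little-endian offset.

```python
def encode_literal_length(length):
    """Return (high_field, extra_bytes) for a run of `length` literals.

    Hypothetical helper: high_field is the value stored in the 4 high bits
    of the token, extra_bytes are the additional length bytes (if any).
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 15:
        return length, b""          # fits in the 4-bit field, no extra byte
    remaining = length - 15         # the field itself already accounts for 15
    extra = bytearray()
    while remaining >= 255:
        extra.append(255)           # 255 means "another byte follows"
        remaining -= 255
    extra.append(remaining)         # final byte is below 255 and ends the run
    return 15, bytes(extra)


def decode_literal_length(token, data, pos=0):
    """Return (length, next_pos); `data[pos:]` holds the bytes after the token."""
    length = token >> 4             # 4 high bits of the one-byte token
    if length == 15:
        while True:
            byte = data[pos]
            pos += 1
            length += byte
            if byte != 255:         # any value below 255 terminates the length
                break
    return length, pos


def decode_offset(data, pos=0):
    """Read the 2-byte little-endian match offset: low byte first, then high."""
    return data[pos] | (data[pos + 1] << 8)
```

With these helpers, a literal length of 48 encodes as the field value 15 followed by a single byte 33, and the byte pair `0x34 0x12` decodes to the offset `0x1234`.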