author: Yann Collet <yann.collet.73@gmail.com>, 2015-03-26 18:58:19 (GMT)
committer: Yann Collet <yann.collet.73@gmail.com>, 2015-03-26 18:58:19 (GMT)
commit: ce71b073b5a4a9e2bdd78855f50ddc146baac1c5
tree: bd552cc1b30993a7f0e3a329a5221abd2cd69466
parent: 1ba37f378404f912fffe265a9506147695e158db
converted to markdown friendly syntax
Diffstat (limited to 'lz4_block_format.txt')
-rw-r--r-- lz4_block_format.txt | 25
1 files changed, 13 insertions, 12 deletions
diff --git a/lz4_block_format.txt b/lz4_block_format.txt
index 2c424c5..e248fd9 100644
--- a/lz4_block_format.txt
+++ b/lz4_block_format.txt
@@ -1,6 +1,7 @@
-LZ4 Format Description
-Last revised: 2012-02-27
-Author : Y. Collet
+LZ4 Block Format Description
+============================
+Last revised: 2015-03-26;
+Author : Yann Collet
@@ -11,19 +12,19 @@ using any programming language.
LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
The most important design principle behind LZ4 is simplicity.
It helps to create an easy to read and maintain source code.
-It also helps later on for optimisations, compactness, and speed.
-There is no entropy encoder backend nor framing layer.
+It also helps later on for optimizations, compactness, and speed.
+There is no entropy encoder back-end nor framing layer.
The latter is assumed to be handled by other parts of the system.
-This document only describes the format,
+This document only describes the block format,
not how the LZ4 compressor nor decompressor actually work.
The correctness of the decompressor should not depend
on implementation details of the compressor, and vice versa.
--- Compressed block format --
-
+Compressed block format
+-----------------------
An LZ4 compressed block is composed of sequences.
Schematically, a sequence is a suite of literals, followed by a match copy.
@@ -90,8 +91,8 @@ On decoding the matchlength, we reach the end of the compressed sequence,
and therefore start another one.
--- Parsing restrictions --
-
+Parsing restrictions
+-----------------------
There are specific parsing rules to respect in order to remain compatible
with assumptions made by the decoder :
1) The last 5 bytes are always literals
@@ -104,8 +105,8 @@ Note that the last sequence is also incomplete,
and stops right after literals.
--- Additional notes --
-
+Additional notes
+-----------------------
There is no assumption nor limits to the way the compressor
searches and selects matches within the source data block.
It could be a fast scan, a multi-probe, a full search using BST,