Add dictionary random access example

author: Nick Terrell <terrelln@fb.com> 2016-11-10 00:20:47 (GMT)
committer: Nick Terrell <terrelln@fb.com> 2016-11-10 01:39:56 (GMT)
commit: 94917c9a04ce08fcdb6b465b4aff38d2d82053aa (patch)
tree: 4d762b18e40590e001f0579726160c6bb9367e96 /examples/dictionaryRandomAccess.md
parent: bd88e4007b7e3eddd58e2c76c39b5bb650b5cb20 (diff)
download: lz4-94917c9a04ce08fcdb6b465b4aff38d2d82053aa.zip
lz4-94917c9a04ce08fcdb6b465b4aff38d2d82053aa.tar.gz
lz4-94917c9a04ce08fcdb6b465b4aff38d2d82053aa.tar.bz2
1 files changed, 67 insertions, 0 deletions
diff --git a/examples/dictionaryRandomAccess.md b/examples/dictionaryRandomAccess.md
new file mode 100644
index 0000000..53d825d
--- /dev/null
+++ b/examples/dictionaryRandomAccess.md
@@ -0,0 +1,67 @@
+# LZ4 API Example : Dictionary Random Access
+
+`dictionaryRandomAccess.c` is LZ4 API example which implements dictionary compression and random access decompression.
+
+Please note that the output file is not compatible with lz4frame and is platform dependent.
+
+
+## What's the point of this example ?
+
+ - Dictionary based compression for homogenous files.
+ - Random access to compressed blocks.
+
+
+## How the compression works
+
+Reads the dictionary from a file, and uses it as the history for each block.
+This allows each block to be independent, but maintains compression ratio.
+
+```
+    Dictionary
+         +
+         |
+         v
+    +---------+
+    | Block#1 |
+    +----+----+
+         |
+         v
+      {Out#1}
+
+
+    Dictionary
+         +
+         |
+         v
+    +---------+
+    | Block#2 |
+    +----+----+
+         |
+         v
+      {Out#2}
+```
+
+After writing the magic bytes `TEST` and then the compressed blocks, write out the jump table.
+The last 4 bytes is an integer containing the number of blocks in the stream.
+If there are `N` blocks, then just before the last 4 bytes is `N + 1` 4 byte integers containing the offsets at the beginning and end of each block.
+Let `Offset#K` be the total number of bytes written after writing out `Block#K` *including* the magic bytes for simplicity.
+
+```
++------+---------+     +---------+---+----------+     +----------+-----+
+| TEST | Block#1 | ... | Block#N | 4 | Offset#1 | ... | Offset#N | N+1 |
++------+---------+     +---------+---+----------+     +----------+-----+
+```
+
+## How the decompression works
+
+Decompression will do reverse order.
+
+ - Seek to the last 4 bytes of the file and read the number of offsets.
+ - Read each offset into an array.
+ - Seek to the first block containing data we want to read.
+   We know where to look because we know each block contains a fixed amount of uncompressed data, except possibly the last.
+ - Decompress it and write what data we need from it to the file.
+ - Read the next block.
+ - Decompress it and write that page to the file.
+
+Continue these procedure until all the required data has been read.
author	Nick Terrell <terrelln@fb.com>	2016-11-10 00:20:47 (GMT)
committer	Nick Terrell <terrelln@fb.com>	2016-11-10 01:39:56 (GMT)
commit	94917c9a04ce08fcdb6b465b4aff38d2d82053aa (patch)
tree	4d762b18e40590e001f0579726160c6bb9367e96 /examples/dictionaryRandomAccess.md
parent	bd88e4007b7e3eddd58e2c76c39b5bb650b5cb20 (diff)
download	lz4-94917c9a04ce08fcdb6b465b4aff38d2d82053aa.zip lz4-94917c9a04ce08fcdb6b465b4aff38d2d82053aa.tar.gz lz4-94917c9a04ce08fcdb6b465b4aff38d2d82053aa.tar.bz2