Add document for "Line by Line Text Compression" example

author: Takayuki MATSUOKA <takayuki.matsuoka@gmail.com> 2015-03-24 22:52:35 (GMT)
committer: Takayuki MATSUOKA <takayuki.matsuoka@gmail.com> 2015-03-24 22:52:35 (GMT)
commit: 19665c93ea4a25e02970a53ec5d2cd1af9303c43 (patch)
tree: ff3592ce804f231ce945e8579c6eb885feb1e047 /examples/blockStreaming_lineByLine.md
parent: 438fee9169d2267f46ddd54f770b5f22a61ebb86 (diff)
download: lz4-19665c93ea4a25e02970a53ec5d2cd1af9303c43.zip
lz4-19665c93ea4a25e02970a53ec5d2cd1af9303c43.tar.gz
lz4-19665c93ea4a25e02970a53ec5d2cd1af9303c43.tar.bz2
1 files changed, 121 insertions, 0 deletions
diff --git a/examples/blockStreaming_lineByLine.md b/examples/blockStreaming_lineByLine.md
new file mode 100644
index 0000000..1d379a0
--- /dev/null
+++ b/examples/blockStreaming_lineByLine.md
@@ -0,0 +1,121 @@
+# LZ4 Streaming API Example : Line by Line Text Compression
+
+`blockStreaming_lineByLine.c` is an LZ4 Streaming API example that implements line by line incremental (de)compression.
+
+Please note the following :
+
+ - Read "LZ4 Streaming API Basics" first.
+ - This is a relatively advanced application example.
+ - The output file is not compatible with lz4frame and is platform dependent.
+
+
+## What's the point of this example ?
+
+ - Line by line incremental (de)compression.
+ - Handles huge files in a small amount of memory.
+ - Generally better compression ratio than the Block API.
+ - Non-uniform block size.
+
+
+## How the compression works
+
+First of all, allocate a "Ring Buffer" for input and an LZ4 compressed data buffer for output.
+
+```
+(1)
+    Ring Buffer
+
+    +--------+
+    | Line#1 |
+    +---+----+
+        |
+        v
+     {Out#1}
+
+
+(2)
+    Prefix Mode Dependency
+          +----+
+          |    |
+          v    |
+    +--------+-+------+
+    | Line#1 | Line#2 |
+    +--------+---+----+
+                 |
+                 v
+              {Out#2}
+
+
+(3)
+          Prefix   Prefix
+          +----+   +----+
+          |    |   |    |
+          v    |   v    |
+    +--------+-+------+-+------+
+    | Line#1 | Line#2 | Line#3 |
+    +--------+--------+---+----+
+                          |
+                          v
+                       {Out#3}
+
+
+(4)
+                        External Dictionary Mode
+                +----+   +----+
+                |    |   |    |
+                v    |   v    |
+    ------+--------+-+------+-+--------+
+          |  ....  | Line#X | Line#X+1 |
+    ------+--------+--------+-----+----+
+                            ^     |
+                            |     v
+                            |  {Out#X+1}
+                            |
+                          Reset
+
+
+(5)
+                                    Prefix
+                                    +-----+
+                                    |     |
+                                    v     |
+    ------+--------+--------+----------+--+-------+
+          |  ....  | Line#X | Line#X+1 | Line#X+2 |
+    ------+--------+--------+----------+-----+----+
+                            ^                |
+                            |                v
+                            |            {Out#X+2}
+                            |
+                          Reset
+```
+
+Next (see (1)), read the first line into the ringbuffer and compress it with `LZ4_compress_continue()`.
+On this first call, LZ4 has no previous data to depend on,
+so it compresses the line without dependencies and writes the compressed line {Out#1} to the LZ4 compressed data buffer.
+After that, write {Out#1} to the file and advance the ringbuffer offset.
+
+Do the same for the second line (see (2)).
+This time, however, LZ4 can use a dependency on Line#1 to improve the compression ratio.
+This dependency is called "Prefix mode".
+
+Eventually, we'll reach the end of the ringbuffer at Line#X (see (4)).
+At this point, we must reset the ringbuffer offset.
+After the reset, Line#X+1 is no longer adjacent in memory to the preceding data, but LZ4 still keeps its memory of that data.
+This is called "External Dictionary Mode".
+
+At Line#X+2 (see (5)), LZ4 has finally forgotten almost all of its earlier memory but still remembers Line#X+1.
+This is the same situation as Line#2.
+
+Continue this procedure to the end of the text file.
+
+
+## How the decompression works
+
+Decompression performs the reverse of each compression step.
+
+ - Read a compressed line from the file into the buffer.
+ - Decompress it into the ringbuffer.
+ - Output the decompressed plain text line to the file.
+ - Advance the ringbuffer offset. If the offset exceeds the end of the ringbuffer, reset it.
+
+Continue this procedure to the end of the compressed file.
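
Note on the patch above: the "forward the ringbuffer offset, reset it at the end" step that both the compression and decompression sections describe can be sketched in plain C. This is a minimal sketch, not code from the commit: the helper name `advance_ring_offset` and the two sizes are hypothetical, chosen small so the wraparound is easy to follow, and the sketch assumes the caller never stores a line longer than `MESSAGE_MAX_BYTES`. No LZ4 call is involved, since only the offset arithmetic is shown.

```c
#include <stddef.h>

/* Hypothetical sizes for illustration only; they are not taken from the patch. */
enum {
    MESSAGE_MAX_BYTES = 16,  /* assumed upper bound on the length of one line */
    RING_BUFFER_BYTES = 64   /* assumed total capacity of the ring buffer */
};

/*
 * Hypothetical helper: given the offset at which the current line was stored
 * and that line's length, return the offset at which to store the next line.
 */
static size_t advance_ring_offset(size_t offset, size_t line_bytes)
{
    offset += line_bytes;

    /* Reset to the start when a maximum-length line would no longer fit
     * between the new offset and the end of the ring buffer. */
    if (offset + MESSAGE_MAX_BYTES > RING_BUFFER_BYTES) {
        offset = 0;
    }
    return offset;
}
```

With these assumed sizes, a 10-byte line stored at offset 0 moves the offset to 10, while a 1-byte line stored at offset 48 leaves too little room for a full-length line, so the offset returns to 0. Resetting early like this, rather than exactly at the last byte, is one way to guarantee that a line is never split across the end of the buffer. The compressor and the decompressor would need to apply the same rule so that their offsets stay in step.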