Merge pull request #63 from t-mat/comment-on-example-directory

Comment on example directory
author: Yann Collet <yann.collet.73@gmail.com> 2015-03-25 12:31:06 (GMT)
committer: Yann Collet <yann.collet.73@gmail.com> 2015-03-25 12:31:06 (GMT)
commit: e652285556c421901fc1bcc9a571c3ddba767181 (patch)
tree: 150a8c0be529c115c2c3099a7190a4bf3fbfd2e2 /examples
parent: 80e71c6e8b0cbcd3b9976ded45cef1474a34b40c (diff)
parent: 2af52a90b3759e589d608782a987d2b3d73abefb (diff)
download: lz4-e652285556c421901fc1bcc9a571c3ddba767181.zip
lz4-e652285556c421901fc1bcc9a571c3ddba767181.tar.gz
lz4-e652285556c421901fc1bcc9a571c3ddba767181.tar.bz2
4 files changed, 315 insertions, 0 deletions
diff --git a/examples/README.md b/examples/README.md
new file mode 100644
index 0000000..1b62d9e
--- /dev/null
+++ b/examples/README.md
@@ -0,0 +1,8 @@
+# LZ4 examples
+
+## Documents
+
+ - [Streaming API Basics](streaming_api_basics.md)
+ - Examples
+     - [Double Buffer](blockStreaming_doubleBuffer.md)
+     - [Line by Line Text Compression](blockStreaming_lineByLine.md)
diff --git a/examples/blockStreaming_doubleBuffer.md b/examples/blockStreaming_doubleBuffer.md
new file mode 100644
index 0000000..cd295df
--- /dev/null
+++ b/examples/blockStreaming_doubleBuffer.md
@@ -0,0 +1,99 @@
+# LZ4 Streaming API Example : Double Buffer
+
+`blockStreaming_doubleBuffer.c` is an LZ4 Streaming API example that implements double-buffer (de)compression.
+
+Please note :
+
+ - Read "LZ4 Streaming API Basics" first.
+ - This is a relatively advanced application example.
+ - The output file is not compatible with lz4frame and is platform dependent.
+
+
+## What's the point of this example ?
+
+ - Handle a huge file in a small amount of memory
+ - Always better compression ratio than Block API
+ - Uniform block size
+
+
+## How the compression works
+
+First, allocate a "Double Buffer" for input and an LZ4 compressed data buffer for output.
+The double buffer has two pages: the "first" page (Page#1) and the "second" page (Page#2).
+
+```
+        Double Buffer
+
+      Page#1    Page#2
+    +---------+---------+
+    | Block#1 |         |
+    +----+----+---------+
+         |
+         v
+      {Out#1}
+
+
+      Prefix Dependency
+         +---------+
+         |         |
+         v         |
+    +---------+----+----+
+    | Block#1 | Block#2 |
+    +---------+----+----+
+                   |
+                   v
+                {Out#2}
+
+
+   External Dictionary Mode
+         +---------+
+         |         |
+         |         v
+    +----+----+---------+
+    | Block#3 | Block#2 |
+    +----+----+---------+
+         |
+         v
+      {Out#3}
+
+
+      Prefix Dependency
+         +---------+
+         |         |
+         v         |
+    +---------+----+----+
+    | Block#3 | Block#4 |
+    +---------+----+----+
+                   |
+                   v
+                {Out#4}
+```
+
+Next, read the first block into the double buffer's first page and compress it with `LZ4_compress_continue()`.
+The first time, LZ4 has no previous data to depend on,
+so it compresses the block without dependencies and writes compressed block {Out#1} to the LZ4 compressed data buffer.
+After that, write {Out#1} to the file.
+
+Next, read the second block into the double buffer's second page and compress it.
+This time, LZ4 can use a dependency on Block#1 to improve the compression ratio.
+This dependency is called "Prefix mode".
+
+Next, read the third block into the double buffer's *first* page and compress it.
+This time too, LZ4 can use a dependency on Block#2.
+This dependency is called "External Dictionary mode".
+
+Continue this procedure until the end of the file.
+
+
+## How the decompression works
+
+Decompression mirrors the compression procedure.
+
+ - Read the first compressed block.
+ - Decompress it into the first page and write that page to the file.
+ - Read the second compressed block.
+ - Decompress it into the second page and write that page to the file.
+ - Read the third compressed block.
+ - Decompress it into the *first* page and write that page to the file.
+
+Continue this procedure until the end of the compressed file.
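The page rotation described in `blockStreaming_doubleBuffer.md` can be sketched in C as follows. This is only a minimal model of the bookkeeping, not the code from `blockStreaming_doubleBuffer.c`: the names `page_for_block`, `dependency_for_block` and `dep_mode` are hypothetical, blocks are numbered from 0, and no LZ4 function is called.

```c
#include <stddef.h>

/* Hypothetical model of the double buffer: two pages, used alternately. */
enum { PAGE_COUNT = 2 };

/* Dependency kinds named in the document (enum names are illustrative). */
typedef enum { DEP_NONE, DEP_PREFIX, DEP_EXT_DICT } dep_mode;

/* Page index (0 = Page#1, 1 = Page#2) that receives 0-based block `block`. */
static int page_for_block(size_t block)
{
    return (int)(block % PAGE_COUNT);
}

/*
 * Dependency available to a block, following the diagram:
 *  - the very first block has nothing to depend on;
 *  - a block in the second page directly follows the previous block
 *    in memory, which the document calls "Prefix mode";
 *  - a block in the first page depends on data stored after it in
 *    memory, which the document calls "External Dictionary mode".
 */
static dep_mode dependency_for_block(size_t block)
{
    if (block == 0) {
        return DEP_NONE;
    }
    return page_for_block(block) == 1 ? DEP_PREFIX : DEP_EXT_DICT;
}
```

For example, block index 2 (Block#3 in the diagram) lands in page 0 and reports `DEP_EXT_DICT`, which matches the third step of the compression walkthrough.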
diff --git a/examples/blockStreaming_lineByLine.md b/examples/blockStreaming_lineByLine.md
new file mode 100644
index 0000000..1d379a0
--- /dev/null
+++ b/examples/blockStreaming_lineByLine.md
@@ -0,0 +1,121 @@
+# LZ4 Streaming API Example : Line by Line Text Compression
+
+`blockStreaming_lineByLine.c` is an LZ4 Streaming API example that implements line-by-line incremental (de)compression.
+
+Please note the following restrictions :
+
+ - Read "LZ4 Streaming API Basics" first.
+ - This is a relatively advanced application example.
+ - The output file is not compatible with lz4frame and is platform dependent.
+
+
+## What's the point of this example ?
+
+ - Line by line incremental (de)compression.
+ - Handle a huge file in a small amount of memory
+ - Generally better compression ratio than Block API
+ - Non-uniform block size
+
+
+## How the compression works
+
+First, allocate a "Ring Buffer" for input and an LZ4 compressed data buffer for output.
+
+```
+(1)
+    Ring Buffer
+
+    +--------+
+    | Line#1 |
+    +---+----+
+        |
+        v
+     {Out#1}
+
+
+(2)
+    Prefix Mode Dependency
+          +----+
+          |    |
+          v    |
+    +--------+-+------+
+    | Line#1 | Line#2 |
+    +--------+---+----+
+                 |
+                 v
+              {Out#2}
+
+
+(3)
+          Prefix   Prefix
+          +----+   +----+
+          |    |   |    |
+          v    |   v    |
+    +--------+-+------+-+------+
+    | Line#1 | Line#2 | Line#3 |
+    +--------+--------+---+----+
+                          |
+                          v
+                       {Out#3}
+
+
+(4)
+                        External Dictionary Mode
+                +----+   +----+
+                |    |   |    |
+                v    |   v    |
+    ------+--------+-+------+-+--------+
+          |  ....  | Line#X | Line#X+1 |
+    ------+--------+--------+-----+----+
+                            ^     |
+                            |     v
+                            |  {Out#X+1}
+                            |
+                          Reset
+
+
+(5)
+                                    Prefix
+                                    +-----+
+                                    |     |
+                                    v     |
+    ------+--------+--------+----------+--+-------+
+          |  ....  | Line#X | Line#X+1 | Line#X+2 |
+    ------+--------+--------+----------+-----+----+
+                            ^                |
+                            |                v
+                            |            {Out#X+2}
+                            |
+                          Reset
+```
+
+Next (see (1)), read the first line into the ring buffer and compress it with `LZ4_compress_continue()`.
+The first time, LZ4 has no previous data to depend on,
+so it compresses the line without dependencies and writes compressed line {Out#1} to the LZ4 compressed data buffer.
+After that, write {Out#1} to the file and advance the ring buffer offset.
+
+Do the same for the second line (see (2)).
+This time, however, LZ4 can use a dependency on Line#1 to improve the compression ratio.
+This dependency is called "Prefix mode".
+
+Eventually, we reach the end of the ring buffer at Line#X (see (4)).
+At this point, we should reset the ring buffer offset.
+After the reset, Line#X+1 is no longer adjacent to the previous line, but LZ4 still keeps its memory of the earlier data.
+This is called "External Dictionary Mode".
+
+At Line#X+2 (see (5)), LZ4 finally forgets almost all of its memory but still retains Line#X+1.
+This is the same situation as Line#2.
+
+Continue this procedure until the end of the text file.
+
+
+## How the decompression works
+
+Decompression mirrors the compression procedure.
+
+ - Read a compressed line from the file into a buffer.
+ - Decompress it into the ring buffer.
+ - Write the decompressed plain-text line to the file.
+ - Advance the ring buffer offset. If the offset exceeds the end of the ring buffer, reset it.
+
+Continue this procedure until the end of the compressed file.
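The "advance the offset, reset at the end" step from `blockStreaming_lineByLine.md` can be sketched in C as follows. This is a minimal sketch, not the code from `blockStreaming_lineByLine.c`: the function name `ring_advance` and its parameters are hypothetical, and the reset condition (wrap as soon as a maximum-length line would no longer fit contiguously) is an assumption about how "exceeds the end" is best interpreted.

```c
#include <stddef.h>

/*
 * Advance a ring buffer write offset after storing one line.
 *
 * offset         : offset at which the line just stored begins
 * used           : number of bytes that line occupies
 * ring_bytes     : total size of the ring buffer
 * max_line_bytes : longest line the caller will ever store (assumed known)
 *
 * Returns the offset for the next line. The offset is reset to 0 when the
 * remaining space could not hold a maximum-length line, so that every line
 * stays contiguous in memory.
 */
static size_t ring_advance(size_t offset, size_t used,
                           size_t ring_bytes, size_t max_line_bytes)
{
    offset += used;
    if (offset + max_line_bytes > ring_bytes) {
        offset = 0; /* the "Reset" arrow in figures (4) and (5) */
    }
    return offset;
}
```

With a 1024-byte ring and 256-byte maximum lines, a line of 100 bytes stored at offset 700 moves the offset to 800, where a full-length line would overrun the buffer, so the function returns 0. The same helper would serve both the compressor and the decompressor, since both must place each line at the same offset.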
diff --git a/examples/streaming_api_basics.md b/examples/streaming_api_basics.md
new file mode 100644
index 0000000..6c2632f
--- /dev/null
+++ b/examples/streaming_api_basics.md
@@ -0,0 +1,87 @@
+# LZ4 Streaming API Basics
+
+## LZ4 API sets
+
+LZ4 has the following API sets :
+
+ - "Auto Framing" API (lz4frame.h) :
+   This is the most recommended API for typical applications.
+   It guarantees interoperability with other tools and libraries that comply with the LZ4 framing format,
+   such as the LZ4 command line utility, node-lz4, etc.
+ - "Block" API : This is recommended for simple purposes.
+   It compresses a single raw memory block into an LZ4 memory block and vice versa.
+ - "Streaming" API : This is designed for complex tasks,
+   for example, compressing huge stream data in a memory-restricted environment.
+
+In general, you should use the "Auto Framing" API.
+But if you want to write an advanced application, it's time to use the Block or Streaming APIs.
+
+
+## What is the difference between Block and Streaming API ?
+
+The Block API (de)compresses a single contiguous memory block.
+In other words, the LZ4 library finds redundancy within a single contiguous memory block.
+The Streaming API does the same thing but (de)compresses multiple adjacent contiguous memory blocks.
+So the LZ4 library can find more redundancy than with the Block API.
+
+The following figures show the differences between the APIs and block sizes.
+In these figures, the original data is split into 4KiBytes contiguous chunks.
+
+```
+Original Data
+    +---------------+---------------+----+----+----+
+    | 4KiB Chunk A  | 4KiB Chunk B  | C  | D  |... |
+    +---------------+---------------+----+----+----+
+
+Example (1) : Block API, 4KiB Block
+    +---------------+---------------+----+----+----+
+    | 4KiB Chunk A  | 4KiB Chunk B  | C  | D  |... |
+    +---------------+---------------+----+----+----+
+    | Block #1      | Block #2      | #3 | #4 |... |
+    +---------------+---------------+----+----+----+
+    
+                    (No Dependency)
+
+
+Example (2) : Block API, 8KiB Block
+    +---------------+---------------+----+----+----+
+    | 4KiB Chunk A  | 4KiB Chunk B  | C  | D  |... |
+    +---------------+---------------+----+----+----+
+    |            Block #1           |Block #2 |... |
+    +--------------------+----------+-------+-+----+
+          ^              |             ^    |
+          |              |             |    |
+          +--------------+             +----+
+          Internal Dependency          Internal Dependency
+
+
+Example (3) : Streaming API, 4KiB Block
+    +---------------+---------------+-----+----+----+
+    | 4KiB Chunk A  | 4KiB Chunk B  | C   | D  |... |
+    +---------------+---------------+-----+----+----+
+    | Block #1      | Block #2      | #3  | #4 |... |
+    +---------------+----+----------+-+---+-+--+----+
+          ^              |   ^        | ^   |
+          |              |   |        | |   |
+          +--------------+   +--------+ +---+
+          Dependency         Dependency Dependency
+```
+
+ - In example (1), there is no dependency.
+   All blocks are compressed independently.
+ - In example (2), each 8KiBytes block naturally has internal dependency.
+   But blocks #1 and #2 are still compressed independently of each other.
+ - In example (3), block #2 depends on #1,
+   #3 depends on #2 and #1, #4 depends on #3, #2 and #1, and so on.
+
+Here, we can observe the difference between examples (2) and (3).
+In (2), there's no dependency between chunks B and C, but in (3) there is.
+This dependency improves the compression ratio.
+
+
+## Restriction of Streaming API
+
+For efficiency, the Streaming API doesn't keep a mirror copy of the dependent (de)compressed memory.
+This means users must explicitly keep this dependent (de)compressed memory available.
+Usually, the "dependent memory" is the previous adjacent contiguous memory, up to 64KiBytes.
+LZ4 will not access memory further back than that.