summaryrefslogtreecommitdiffstats
path: root/examples/blockStreaming_lineByLine.md
diff options
context:
space:
mode:
authorTakayuki MATSUOKA <takayuki.matsuoka@gmail.com>2015-03-24 22:52:35 (GMT)
committerTakayuki MATSUOKA <takayuki.matsuoka@gmail.com>2015-03-24 22:52:35 (GMT)
commit19665c93ea4a25e02970a53ec5d2cd1af9303c43 (patch)
treeff3592ce804f231ce945e8579c6eb885feb1e047 /examples/blockStreaming_lineByLine.md
parent438fee9169d2267f46ddd54f770b5f22a61ebb86 (diff)
downloadlz4-19665c93ea4a25e02970a53ec5d2cd1af9303c43.zip
lz4-19665c93ea4a25e02970a53ec5d2cd1af9303c43.tar.gz
lz4-19665c93ea4a25e02970a53ec5d2cd1af9303c43.tar.bz2
Add document for "Line by Line Text Compression" example
Diffstat (limited to 'examples/blockStreaming_lineByLine.md')
-rw-r--r--examples/blockStreaming_lineByLine.md121
1 files changed, 121 insertions, 0 deletions
diff --git a/examples/blockStreaming_lineByLine.md b/examples/blockStreaming_lineByLine.md
new file mode 100644
index 0000000..1d379a0
--- /dev/null
+++ b/examples/blockStreaming_lineByLine.md
@@ -0,0 +1,121 @@
+# LZ4 Streaming API Example : Line by Line Text Compression
+
+`blockStreaming_lineByLine.c` is LZ4 Straming API example which implements line by line incremental (de)compression.
+
+Please note the following restrictions :
+
+ - Firstly, read "LZ4 Streaming API Basics".
+ - This is relatively advanced application example.
+ - Output file is not compatible with lz4frame and platform dependent.
+
+
+## What's the point of this example ?
+
+ - Line by line incremental (de)compression.
+ - Handle huge file in small amount of memory
+ - Generally better compression ratio than Block API
+ - Non-uniform block size
+
+
+## How the compression works
+
+First of all, allocate "Ring Buffer" for input and LZ4 compressed data buffer for output.
+
+```
+(1)
+ Ring Buffer
+
+ +--------+
+ | Line#1 |
+ +---+----+
+ |
+ v
+ {Out#1}
+
+
+(2)
+ Prefix Mode Dependency
+ +----+
+ | |
+ v |
+ +--------+-+------+
+ | Line#1 | Line#2 |
+ +--------+---+----+
+ |
+ v
+ {Out#2}
+
+
+(3)
+ Prefix Prefix
+ +----+ +----+
+ | | | |
+ v | v |
+ +--------+-+------+-+------+
+ | Line#1 | Line#2 | Line#3 |
+ +--------+--------+---+----+
+ |
+ v
+ {Out#3}
+
+
+(4)
+ External Dictionary Mode
+ +----+ +----+
+ | | | |
+ v | v |
+ ------+--------+-+------+-+--------+
+ | .... | Line#X | Line#X+1 |
+ ------+--------+--------+-----+----+
+ ^ |
+ | v
+ | {Out#X+1}
+ |
+ Reset
+
+
+(5)
+ Prefix
+ +-----+
+ | |
+ v |
+ ------+--------+--------+----------+--+-------+
+ | .... | Line#X | Line#X+1 | Line#X+2 |
+ ------+--------+--------+----------+-----+----+
+ ^ |
+ | v
+ | {Out#X+2}
+ |
+ Reset
+```
+
+Next (see (1)), read first line to ringbuffer and compress it by `LZ4_compress_continue()`.
+For the first time, LZ4 doesn't know any previous dependencies,
+so it just compress the line without dependencies and generates compressed line {Out#1} to LZ4 compressed data buffer.
+After that, write {Out#1} to the file and forward ringbuffer offset.
+
+Do the same things to second line (see (2)).
+But in this time, LZ4 can use dependency to Line#1 to improve compression ratio.
+This dependency is called "Prefix mode".
+
+Eventually, we'll reach end of ringbuffer at Line#X (see (4)).
+This time, we should reset ringbuffer offset.
+After resetting, at Line#X+1 pointer is not adjacent, but LZ4 still maintain its memory.
+This is called "External Dictionary Mode".
+
+In Line#X+2 (see (5)), finally LZ4 forget almost all memories but still remains Line#X+1.
+This is the same situation as Line#2.
+
+Continue these procedure to the end of text file.
+
+
+## How the decompression works
+
+Decompression will do reverse order.
+
+ - Read compressed line from the file to buffer.
+ - Decompress it to the ringbuffer.
+ - Output decompressed plain text line to the file.
+ - Forward ringbuffer offset. If offset exceedes end of the ringbuffer, reset it.
+
+Continue these procedure to the end of the compressed file.