1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
|
# LZ4 Streaming API Example : Line by Line Text Compression
by *Takayuki Matsuoka*
`blockStreaming_lineByLine.c` is LZ4 Streaming API example which implements line by line incremental (de)compression.
Please note the following restrictions :
- Firstly, read "LZ4 Streaming API Basics".
- This is relatively advanced application example.
- Output file is not compatible with lz4frame and platform dependent.
## What's the point of this example ?
- Line by line incremental (de)compression.
- Handle huge file in small amount of memory
- Generally better compression ratio than Block API
- Non-uniform block size
## How the compression works
First of all, allocate "Ring Buffer" for input and LZ4 compressed data buffer for output.
```
(1)
Ring Buffer
+--------+
| Line#1 |
+---+----+
|
v
{Out#1}
(2)
Prefix Mode Dependency
+----+
| |
v |
+--------+-+------+
| Line#1 | Line#2 |
+--------+---+----+
|
v
{Out#2}
(3)
Prefix Prefix
+----+ +----+
| | | |
v | v |
+--------+-+------+-+------+
| Line#1 | Line#2 | Line#3 |
+--------+--------+---+----+
|
v
{Out#3}
(4)
External Dictionary Mode
+----+ +----+
| | | |
v | v |
------+--------+-+------+-+--------+
| .... | Line#X | Line#X+1 |
------+--------+--------+-----+----+
^ |
| v
| {Out#X+1}
|
Reset
(5)
Prefix
+-----+
| |
v |
------+--------+--------+----------+--+-------+
| .... | Line#X | Line#X+1 | Line#X+2 |
------+--------+--------+----------+-----+----+
^ |
| v
| {Out#X+2}
|
Reset
```
Next (see (1)), read first line to ringbuffer and compress it by `LZ4_compress_continue()`.
For the first time, LZ4 doesn't know any previous dependencies,
so it just compress the line without dependencies and generates compressed line {Out#1} to LZ4 compressed data buffer.
After that, write {Out#1} to the file and forward ringbuffer offset.
Do the same things to second line (see (2)).
But in this time, LZ4 can use dependency to Line#1 to improve compression ratio.
This dependency is called "Prefix mode".
Eventually, we'll reach end of ringbuffer at Line#X (see (4)).
This time, we should reset ringbuffer offset.
After resetting, at Line#X+1 pointer is not adjacent, but LZ4 still maintain its memory.
This is called "External Dictionary Mode".
In Line#X+2 (see (5)), finally LZ4 forget almost all memories but still remains Line#X+1.
This is the same situation as Line#2.
Continue these procedures to the end of text file.
## How the decompression works
Decompression will do reverse order.
- Read compressed line from the file to buffer.
- Decompress it to the ringbuffer.
- Output decompressed plain text line to the file.
- Forward ringbuffer offset. If offset exceeds end of the ringbuffer, reset it.
Continue these procedures to the end of the compressed file.
|