Diffstat (limited to 'doc/lz4_manual.html')
-rw-r--r-- | doc/lz4_manual.html | 254 |
1 files changed, 173 insertions, 81 deletions
diff --git a/doc/lz4_manual.html b/doc/lz4_manual.html index 6b7935d..e5044fe 100644 --- a/doc/lz4_manual.html +++ b/doc/lz4_manual.html @@ -1,10 +1,10 @@ <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> -<title>1.8.1 Manual</title> +<title>1.8.2 Manual</title> </head> <body> -<h1>1.8.1 Manual</h1> +<h1>1.8.2 Manual</h1> <hr> <a name="Contents"></a><h2>Contents</h2> <ol> @@ -15,8 +15,9 @@ <li><a href="#Chapter5">Advanced Functions</a></li> <li><a href="#Chapter6">Streaming Compression Functions</a></li> <li><a href="#Chapter7">Streaming Decompression Functions</a></li> -<li><a href="#Chapter8">Private definitions</a></li> -<li><a href="#Chapter9">Obsolete Functions</a></li> +<li><a href="#Chapter8">Unstable declarations</a></li> +<li><a href="#Chapter9">Private definitions</a></li> +<li><a href="#Chapter10">Obsolete Functions</a></li> </ol> <hr> <a name="Chapter1"></a><h2>Introduction</h2><pre> @@ -42,9 +43,9 @@ <a name="Chapter2"></a><h2>Version</h2><pre></pre> -<pre><b>int LZ4_versionNumber (void); </b>/**< library version number; to be used when checking dll version */<b> +<pre><b>int LZ4_versionNumber (void); </b>/**< library version number; useful to check dll version */<b> </b></pre><BR> -<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; to be used when checking dll version */<b> +<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; unseful to check dll version */<b> </b></pre><BR> <a name="Chapter3"></a><h2>Tuning parameter</h2><pre></pre> @@ -53,7 +54,7 @@ #endif </b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.) 
Increasing memory usage improves compression ratio - Reduced memory usage can improve speed, due to cache effect + Reduced memory usage may improve speed, thanks to cache effect Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache </p></pre><BR> @@ -65,12 +66,12 @@ into already allocated 'dst' buffer of size 'dstCapacity'. Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize). It also runs faster, so it's a recommended setting. - If the function cannot compress 'src' into a limited 'dst' budget, + If the function cannot compress 'src' into a more limited 'dst' budget, compression stops *immediately*, and the function result is zero. - As a consequence, 'dst' content is not valid. - This function never writes outside 'dst' buffer, nor read outside 'source' buffer. - srcSize : supported max value is LZ4_MAX_INPUT_VALUE - dstCapacity : full or partial size of buffer 'dst' (which must be already allocated) + Note : as a consequence, 'dst' content is not valid. + Note 2 : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor read outside 'source' buffer). + srcSize : max supported value is LZ4_MAX_INPUT_SIZE. + dstCapacity : size of buffer 'dst' (which must be already allocated) return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity) or 0 if compression fails </p></pre><BR> @@ -81,8 +82,7 @@ return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity) If destination buffer is not large enough, decoding will stop and output an error code (negative value). If the source stream is detected malformed, the function will stop decoding and return a negative result. - This function is protected against buffer overflow exploits, including malicious data packets. - It never writes outside output buffer, nor reads outside input buffer. + This function is protected against malicious data packets. 
</p></pre><BR> <a name="Chapter5"></a><h2>Advanced Functions</h2><pre></pre> @@ -91,18 +91,18 @@ </b><p> Provides the maximum size that LZ4 compression may output in a "worst case" scenario (input data not compressible) This function is primarily useful for memory allocation purposes (destination buffer size). Macro LZ4_COMPRESSBOUND() is also provided for compilation-time evaluation (stack memory allocation for example). - Note that LZ4_compress_default() compress faster when dest buffer size is >= LZ4_compressBound(srcSize) + Note that LZ4_compress_default() compresses faster when dstCapacity is >= LZ4_compressBound(srcSize) inputSize : max supported value is LZ4_MAX_INPUT_SIZE return : maximum output size in a "worst case" scenario - or 0, if input size is too large ( > LZ4_MAX_INPUT_SIZE) + or 0, if input size is incorrect (too large or negative) </p></pre><BR> <pre><b>int LZ4_compress_fast (const char* src, char* dst, int srcSize, int dstCapacity, int acceleration); -</b><p> Same as LZ4_compress_default(), but allows to select an "acceleration" factor. +</b><p> Same as LZ4_compress_default(), but allows selection of "acceleration" factor. The larger the acceleration value, the faster the algorithm, but also the lesser the compression. It's a trade-off. It can be fine tuned, with each successive value providing roughly +~3% to speed. An acceleration value of "1" is the same as regular LZ4_compress_default() - Values <= 0 will be replaced by ACCELERATION_DEFAULT (see lz4.c), which is 1. + Values <= 0 will be replaced by ACCELERATION_DEFAULT (currently == 1, see lz4.c). 
</p></pre><BR> <pre><b>int LZ4_sizeofState(void); @@ -125,26 +125,30 @@ int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int src </p></pre><BR> <pre><b>int LZ4_decompress_fast (const char* src, char* dst, int originalSize); -</b><p> originalSize : is the original uncompressed size - return : the number of bytes read from the source buffer (in other words, the compressed size) - If the source stream is detected malformed, the function will stop decoding and return a negative result. - Destination buffer must be already allocated. Its size must be >= 'originalSize' bytes. - note : This function respects memory boundaries for *properly formed* compressed data. - It is a bit faster than LZ4_decompress_safe(). - However, it does not provide any protection against intentionally modified data stream (malicious input). - Use this function in trusted environment only (data to decode comes from a trusted source). +</b><p>This function is a bit faster than LZ4_decompress_safe(), +but it may misbehave on malformed input because it doesn't perform full validation of compressed data. + originalSize : is the uncompressed size to regenerate + Destination buffer must be already allocated, and its size must be >= 'originalSize' bytes. + return : number of bytes read from source buffer (== compressed size). + If the source stream is detected malformed, the function stops decoding and return a negative result. + note : This function is only usable if the originalSize of uncompressed data is known in advance. + The caller should also check that all the compressed input has been consumed properly, + i.e. that the return value matches the size of the buffer with compressed input. + The function never writes past the output buffer. However, since it doesn't know its 'src' size, + it may read past the intended input. Also, because match offsets are not validated during decoding, + reads from 'src' may underflow. Use this function in trusted environment **only**. 
</p></pre><BR> <pre><b>int LZ4_decompress_safe_partial (const char* src, char* dst, int srcSize, int targetOutputSize, int dstCapacity); </b><p> This function decompress a compressed block of size 'srcSize' at position 'src' into destination buffer 'dst' of size 'dstCapacity'. The function will decompress a minimum of 'targetOutputSize' bytes, and stop after that. - However, it's not accurate, and may write more than 'targetOutputSize' (but <= dstCapacity). + However, it's not accurate, and may write more than 'targetOutputSize' (but always <= dstCapacity). @return : the number of bytes decoded in the destination buffer (necessarily <= dstCapacity) - Note : this number can be < 'targetOutputSize' should the compressed block contain less data. - Always control how many bytes were decoded. - If the source stream is detected malformed, the function will stop decoding and return a negative result. - This function never writes outside of output buffer, and never reads outside of input buffer. It is therefore protected against malicious data packets. + Note : this number can also be < targetOutputSize, if compressed block contains less data. + Therefore, always control how many bytes were decoded. + If source stream is detected malformed, function returns a negative result. + This function is protected against malicious data packets. </p></pre><BR> <a name="Chapter6"></a><h2>Streaming Compression Functions</h2><pre></pre> @@ -171,25 +175,29 @@ int LZ4_freeStream (LZ4_stream_t* streamPtr); </p></pre><BR> <pre><b>int LZ4_compress_fast_continue (LZ4_stream_t* streamPtr, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration); -</b><p> Compress content into 'src' using data from previously compressed blocks, improving compression ratio. +</b><p> Compress 'src' content using data from previously compressed blocks, for better compression ratio. 'dst' buffer must be already allocated. 
If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster. - Important : Up to 64KB of previously compressed data is assumed to remain present and unmodified in memory ! - Special 1 : If input buffer is a double-buffer, it can have any size, including < 64 KB. - Special 2 : If input buffer is a ring-buffer, it can have any size, including < 64 KB. + Important : The previous 64KB of compressed data is assumed to remain present and unmodified in memory! + + Special 1 : When input is a double-buffer, they can have any size, including < 64 KB. + Make sure that buffers are separated by at least one byte. + This way, each block only depends on previous block. + Special 2 : If input buffer is a ring-buffer, it can have any size, including < 64 KB. @return : size of compressed block - or 0 if there is an error (typically, compressed data cannot fit into 'dst') + or 0 if there is an error (typically, cannot fit into 'dst'). After an error, the stream status is invalid, it can only be reset or freed. </p></pre><BR> -<pre><b>int LZ4_saveDict (LZ4_stream_t* streamPtr, char* safeBuffer, int dictSize); -</b><p> If previously compressed data block is not guaranteed to remain available at its current memory location, +<pre><b>int LZ4_saveDict (LZ4_stream_t* streamPtr, char* safeBuffer, int maxDictSize); +</b><p> If last 64KB data cannot be guaranteed to remain available at its current memory location, save it into a safer place (char* safeBuffer). - Note : it's not necessary to call LZ4_loadDict() after LZ4_saveDict(), dictionary is immediately usable. - @return : saved dictionary size in bytes (necessarily <= dictSize), or 0 if error. + This is schematically equivalent to a memcpy() followed by LZ4_loadDict(), + but is much faster, because LZ4_saveDict() doesn't need to rebuild tables. + @return : saved dictionary size in bytes (necessarily <= maxDictSize), or 0 if error. 
</p></pre><BR> @@ -198,37 +206,59 @@ int LZ4_freeStream (LZ4_stream_t* streamPtr); <pre><b>LZ4_streamDecode_t* LZ4_createStreamDecode(void); int LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream); -</b><p> creation / destruction of streaming decompression tracking structure. - A tracking structure can be re-used multiple times sequentially. +</b><p> creation / destruction of streaming decompression tracking context. + A tracking context can be re-used multiple times. + </p></pre><BR> <pre><b>int LZ4_setStreamDecode (LZ4_streamDecode_t* LZ4_streamDecode, const char* dictionary, int dictSize); -</b><p> An LZ4_streamDecode_t structure can be allocated once and re-used multiple times. +</b><p> An LZ4_streamDecode_t context can be allocated once and re-used multiple times. Use this function to start decompression of a new stream of blocks. - A dictionary can optionnally be set. Use NULL or size 0 for a simple reset order. + A dictionary can optionnally be set. Use NULL or size 0 for a reset order. + Dictionary is presumed stable : it must remain accessible and unmodified during next decompression. @return : 1 if OK, 0 if error </p></pre><BR> +<pre><b>int LZ4_decoderRingBufferSize(int maxBlockSize); +#define LZ4_DECODER_RING_BUFFER_SIZE(mbs) (65536 + 14 + (mbs)) </b>/* for static allocation; mbs presumed valid */<b> +</b><p> Note : in a ring buffer scenario (optional), + blocks are presumed decompressed next to each other + up to the moment there is not enough remaining space for next block (remainingSize < maxBlockSize), + at which stage it resumes from beginning of ring buffer. + When setting such a ring buffer for streaming decompression, + provides the minimum size of this ring buffer + to be compatible with any source respecting maxBlockSize condition. + @return : minimum ring buffer size, + or 0 if there is an error (invalid maxBlockSize). 
+ +</p></pre><BR> + <pre><b>int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity); int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize); </b><p> These decoding functions allow decompression of consecutive blocks in "streaming" mode. A block is an unsplittable entity, it must be presented entirely to a decompression function. - Decompression functions only accept one block at a time. - Previously decoded blocks *must* remain available at the memory position where they were decoded (up to 64 KB). - - Special : if application sets a ring buffer for decompression, it must respect one of the following conditions : - - Exactly same size as encoding buffer, with same update rule (block boundaries at same positions) - In which case, the decoding & encoding ring buffer can have any size, including very small ones ( < 64 KB). - - Larger than encoding buffer, by a minimum of maxBlockSize more bytes. - maxBlockSize is implementation dependent. It's the maximum size of any single block. + Decompression functions only accepts one block at a time. + The last 64KB of previously decoded data *must* remain available and unmodified at the memory position where they were decoded. + If less than 64KB of data has been decoded, all the data must be present. + + Special : if decompression side sets a ring buffer, it must respect one of the following conditions : + - Decompression buffer size is _at least_ LZ4_decoderRingBufferSize(maxBlockSize). + maxBlockSize is the maximum size of any single block. It can have any value > 16 bytes. + In which case, encoding and decoding buffers do not need to be synchronized. + Actually, data can be produced by any source compliant with LZ4 format specification, and respecting maxBlockSize. 
+ - Synchronized mode : + Decompression buffer size is _exactly_ the same as compression buffer size, + and follows exactly same update rule (block boundaries at same positions), + and decoding function is provided with exact decompressed size of each block (exception for last block of the stream), + _then_ decoding & encoding ring buffer can have any size, including small ones ( < 64 KB). + - Decompression buffer is larger than encoding buffer, by a minimum of maxBlockSize more bytes. In which case, encoding and decoding buffers do not need to be synchronized, and encoding ring buffer can have any size, including small ones ( < 64 KB). - - _At least_ 64 KB + 8 bytes + maxBlockSize. - In which case, encoding and decoding buffers do not need to be synchronized, - and encoding ring buffer can have any size, including larger than decoding buffer. - Whenever these conditions are not possible, save the last 64KB of decoded data into a safe buffer, - and indicate where it is saved using LZ4_setStreamDecode() before decompressing next block. + + Whenever these conditions are not possible, + save the last 64KB of decoded data into a safe buffer where it can't be modified during decompression, + then indicate where this data is saved using LZ4_setStreamDecode(), before decompressing next block. </p></pre><BR> <pre><b>int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapcity, const char* dictStart, int dictSize); @@ -236,25 +266,98 @@ int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, </b><p> These decoding functions work the same as a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue() They are stand-alone, and don't need an LZ4_streamDecode_t structure. + Dictionary is presumed stable : it must remain accessible and unmodified during next decompression. 
+ +</p></pre><BR> + +<a name="Chapter8"></a><h2>Unstable declarations</h2><pre> + Declarations in this section should be considered unstable. + Use at your own peril, etc., etc. + They may be removed in the future. + Their signatures may change. +<BR></pre> + +<pre><b>void LZ4_resetStream_fast (LZ4_stream_t* streamPtr); +</b><p> Use this, like LZ4_resetStream(), to prepare a context for a new chain of + calls to a streaming API (e.g., LZ4_compress_fast_continue()). + + Note: + Using this in advance of a non- streaming-compression function is redundant, + and potentially bad for performance, since they all perform their own custom + reset internally. + + Differences from LZ4_resetStream(): + When an LZ4_stream_t is known to be in a internally coherent state, + it can often be prepared for a new compression with almost no work, only + sometimes falling back to the full, expensive reset that is always required + when the stream is in an indeterminate state (i.e., the reset performed by + LZ4_resetStream()). + + LZ4_streams are guaranteed to be in a valid state when: + - returned from LZ4_createStream() + - reset by LZ4_resetStream() + - memset(stream, 0, sizeof(LZ4_stream_t)), though this is discouraged + - the stream was in a valid state and was reset by LZ4_resetStream_fast() + - the stream was in a valid state and was then used in any compression call + that returned success + - the stream was in an indeterminate state and was used in a compression + call that fully reset the state (e.g., LZ4_compress_fast_extState()) and + that returned success + + When a stream isn't known to be in a valid state, it is not safe to pass to + any fastReset or streaming function. It must first be cleansed by the full + LZ4_resetStream(). 
</p></pre><BR> -<a name="Chapter8"></a><h2>Private definitions</h2><pre> +<pre><b>int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration); +</b><p> A variant of LZ4_compress_fast_extState(). + + Using this variant avoids an expensive initialization step. It is only safe + to call if the state buffer is known to be correctly initialized already + (see above comment on LZ4_resetStream_fast() for a definition of "correctly + initialized"). From a high level, the difference is that this function + initializes the provided state with a call to something like + LZ4_resetStream_fast() while LZ4_compress_fast_extState() starts with a + call to LZ4_resetStream(). + +</p></pre><BR> + +<pre><b>void LZ4_attach_dictionary(LZ4_stream_t *working_stream, const LZ4_stream_t *dictionary_stream); +</b><p> This is an experimental API that allows for the efficient use of a + static dictionary many times. + + Rather than re-loading the dictionary buffer into a working context before + each compression, or copying a pre-loaded dictionary's LZ4_stream_t into a + working LZ4_stream_t, this function introduces a no-copy setup mechanism, + in which the working stream references the dictionary stream in-place. + + Several assumptions are made about the state of the dictionary stream. + Currently, only streams which have been prepared by LZ4_loadDict() should + be expected to work. + + Alternatively, the provided dictionary stream pointer may be NULL, in which + case any existing dictionary stream is unset. + + If a dictionary is provided, it replaces any pre-existing stream history. + The dictionary contents are the only history that can be referenced and + logically immediately precede the data compressed in the first subsequent + compression call. + + The dictionary will only remain attached to the working stream through the + first compression call, at the end of which it is cleared. 
The dictionary + stream (and source buffer) must remain in-place / accessible / unchanged + through the completion of the first compression call on the stream. + +</p></pre><BR> + +<a name="Chapter9"></a><h2>Private definitions</h2><pre> Do not use these definitions. They are exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`. Using these definitions will expose code to API and/or ABI break in future versions of the library. <BR></pre> <pre><b>typedef struct { - uint32_t hashTable[LZ4_HASH_SIZE_U32]; - uint32_t currentOffset; - uint32_t initCheck; - const uint8_t* dictionary; - uint8_t* bufferStart; </b>/* obsolete, used for slideInputBuffer */<b> - uint32_t dictSize; -} LZ4_stream_t_internal; -</b></pre><BR> -<pre><b>typedef struct { const uint8_t* externalDict; size_t extDictSize; const uint8_t* prefixEnd; @@ -262,15 +365,6 @@ int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, } LZ4_streamDecode_t_internal; </b></pre><BR> <pre><b>typedef struct { - unsigned int hashTable[LZ4_HASH_SIZE_U32]; - unsigned int currentOffset; - unsigned int initCheck; - const unsigned char* dictionary; - unsigned char* bufferStart; </b>/* obsolete, used for slideInputBuffer */<b> - unsigned int dictSize; -} LZ4_stream_t_internal; -</b></pre><BR> -<pre><b>typedef struct { const unsigned char* externalDict; size_t extDictSize; const unsigned char* prefixEnd; @@ -305,17 +399,15 @@ union LZ4_streamDecode_u { </p></pre><BR> -<a name="Chapter9"></a><h2>Obsolete Functions</h2><pre></pre> +<a name="Chapter10"></a><h2>Obsolete Functions</h2><pre></pre> <pre><b>#ifdef LZ4_DISABLE_DEPRECATE_WARNINGS # define LZ4_DEPRECATED(message) </b>/* disable deprecation warnings */<b> #else # define LZ4_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__) -# if defined(__clang__) </b>/* clang doesn't handle mixed C++11 and CNU attributes */<b> -# define LZ4_DEPRECATED(message) __attribute__((deprecated(message))) -# elif defined (__cplusplus) && (__cplusplus 
>= 201402) </b>/* C++14 or greater */<b> +# if defined (__cplusplus) && (__cplusplus >= 201402) </b>/* C++14 or greater */<b> # define LZ4_DEPRECATED(message) [[deprecated(message)]] -# elif (LZ4_GCC_VERSION >= 405) +# elif (LZ4_GCC_VERSION >= 405) || defined(__clang__) # define LZ4_DEPRECATED(message) __attribute__((deprecated(message))) # elif (LZ4_GCC_VERSION >= 301) # define LZ4_DEPRECATED(message) __attribute__((deprecated)) |
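
Note on the `LZ4_decoderRingBufferSize()` hunk above: the sizing rule it documents can be sketched in a few lines of C. This is a minimal sketch and not the library's implementation. The macro is copied locally from the diff so the snippet builds without `lz4.h`; the helper `ring_buffer_size_sketch()` and both of its validity checks are assumptions made for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Local copy of the macro added in this diff (the real one lives in lz4.h).
 * It is intended for static allocation and presumes `mbs` is valid. */
#define LZ4_DECODER_RING_BUFFER_SIZE(mbs) (65536 + 14 + (mbs))

/* Static allocation of a decoding ring buffer for blocks of at most 4096 bytes. */
static char ring_buffer_4k[LZ4_DECODER_RING_BUFFER_SIZE(4096)];

/* Hypothetical helper mirroring the documented contract of
 * LZ4_decoderRingBufferSize(): minimum ring buffer size, or 0 on error.
 * Which values count as "invalid" is an assumption of this sketch. */
int ring_buffer_size_sketch(int max_block_size)
{
    if (max_block_size < 0)
        return 0;                      /* assumption: negative sizes are invalid */
    if (max_block_size > INT_MAX - (65536 + 14))
        return 0;                      /* assumption: reject sizes that would overflow int */
    return LZ4_DECODER_RING_BUFFER_SIZE(max_block_size);
}

/* Size in bytes of the statically allocated buffer above. */
size_t ring_buffer_4k_size(void)
{
    return sizeof(ring_buffer_4k);
}
```

A caller would pass the largest block size its compressor can produce and allocate at least the returned number of bytes. The macro form suits compile-time sizing, while the function form leaves room to reject bad input.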