1.8.3 Manual


Contents

  1. Introduction
  2. Version
  3. Tuning parameter
  4. Simple Functions
  5. Advanced Functions
  6. Streaming Compression Functions
  7. Streaming Decompression Functions
  8. Unstable declarations
  9. Private definitions
  10. Obsolete Functions

Introduction

  LZ4 is lossless compression algorithm, providing compression speed at 400 MB/s per core,
  scalable with multi-cores CPU. It features an extremely fast decoder, with speed in
  multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.

  The LZ4 compression library provides in-memory compression and decompression functions.
  Compression can be done in:
    - a single step (described as Simple Functions)
    - a single step, reusing a context (described in Advanced Functions)
    - unbounded multiple steps (described as Streaming compression)

  lz4.h provides block compression functions. It gives full buffer control to user.
  Decompressing an lz4-compressed block also requires metadata (such as compressed size).
  Each application is free to encode such metadata in whichever way it wants.

  An additional format, called LZ4 frame specification (doc/lz4_Frame_format.md),
  take care of encoding standard metadata alongside LZ4-compressed blocks.
  If your application requires interoperability, it's recommended to use it.
  A library is provided to take care of it, see lz4frame.h.

Version



int LZ4_versionNumber (void);  /**< library version number; useful to check dll version */

const char* LZ4_versionString (void);   /**< library version string; unseful to check dll version */

Tuning parameter



#ifndef LZ4_MEMORY_USAGE
# define LZ4_MEMORY_USAGE 14
#endif

Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.) Increasing memory usage improves compression ratio Reduced memory usage may improve speed, thanks to cache effect Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache


Simple Functions



int LZ4_compress_default(const char* src, char* dst, int srcSize, int dstCapacity);

Compresses 'srcSize' bytes from buffer 'src' into already allocated 'dst' buffer of size 'dstCapacity'. Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize). It also runs faster, so it's a recommended setting. If the function cannot compress 'src' into a more limited 'dst' budget, compression stops *immediately*, and the function result is zero. Note : as a consequence, 'dst' content is not valid. Note 2 : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor read outside 'source' buffer). srcSize : max supported value is LZ4_MAX_INPUT_SIZE. dstCapacity : size of buffer 'dst' (which must be already allocated) return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity) or 0 if compression fails


int LZ4_decompress_safe (const char* src, char* dst, int compressedSize, int dstCapacity);

compressedSize : is the exact complete size of the compressed block. dstCapacity : is the size of destination buffer, which must be already allocated. return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity) If destination buffer is not large enough, decoding will stop and output an error code (negative value). If the source stream is detected malformed, the function will stop decoding and return a negative result. This function is protected against malicious data packets.


Advanced Functions



int LZ4_compressBound(int inputSize);

Provides the maximum size that LZ4 compression may output in a "worst case" scenario (input data not compressible) This function is primarily useful for memory allocation purposes (destination buffer size). Macro LZ4_COMPRESSBOUND() is also provided for compilation-time evaluation (stack memory allocation for example). Note that LZ4_compress_default() compresses faster when dstCapacity is >= LZ4_compressBound(srcSize) inputSize : max supported value is LZ4_MAX_INPUT_SIZE return : maximum output size in a "worst case" scenario or 0, if input size is incorrect (too large or negative)


int LZ4_compress_fast (const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);

Same as LZ4_compress_default(), but allows selection of "acceleration" factor. The larger the acceleration value, the faster the algorithm, but also the lesser the compression. It's a trade-off. It can be fine tuned, with each successive value providing roughly +~3% to speed. An acceleration value of "1" is the same as regular LZ4_compress_default() Values <= 0 will be replaced by ACCELERATION_DEFAULT (currently == 1, see lz4.c).


int LZ4_sizeofState(void);
int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);

Same compression function, just using an externally allocated memory space to store compression state. Use LZ4_sizeofState() to know how much memory must be allocated, and allocate it on 8-bytes boundaries (using malloc() typically). Then, provide this buffer as 'void* state' to compression function.


int LZ4_compress_destSize (const char* src, char* dst, int* srcSizePtr, int targetDstSize);

Reverse the logic : compresses as much data as possible from 'src' buffer into already allocated buffer 'dst', of size >= 'targetDestSize'. This function either compresses the entire 'src' content into 'dst' if it's large enough, or fill 'dst' buffer completely with as much data as possible from 'src'. note: acceleration parameter is fixed to "default". *srcSizePtr : will be modified to indicate how many bytes where read from 'src' to fill 'dst'. New value is necessarily <= input value. @return : Nb bytes written into 'dst' (necessarily <= targetDestSize) or 0 if compression fails.


int LZ4_decompress_fast (const char* src, char* dst, int originalSize);

This function used to be a bit faster than LZ4_decompress_safe(), though situation has changed in recent versions, and now `LZ4_decompress_safe()` can be as fast and sometimes faster than `LZ4_decompress_fast()`. Moreover, LZ4_decompress_fast() is not protected vs malformed input, as it doesn't perform full validation of compressed data. As a consequence, this function is no longer recommended, and may be deprecated in future versions. It's only remaining specificity is that it can decompress data without knowing its compressed size. originalSize : is the uncompressed size to regenerate. `dst` must be already allocated, its size must be >= 'originalSize' bytes. @return : number of bytes read from source buffer (== compressed size). If the source stream is detected malformed, the function stops decoding and returns a negative result. note : This function requires uncompressed originalSize to be known in advance. The function never writes past the output buffer. However, since it doesn't know its 'src' size, it may read past the intended input. Also, because match offsets are not validated during decoding, reads from 'src' may underflow. Use this function in trusted environment **only**.


int LZ4_decompress_safe_partial (const char* src, char* dst, int srcSize, int targetOutputSize, int dstCapacity);

Decompress an LZ4 compressed block, of size 'srcSize' at position 'src', into destination buffer 'dst' of size 'dstCapacity'. Up to 'targetOutputSize' bytes will be decoded. The function stops decoding on reaching this objective, which can boost performance when only the beginning of a block is required. @return : the number of bytes decoded in `dst` (necessarily <= dstCapacity) If source stream is detected malformed, function returns a negative result. Note : @return can be < targetOutputSize, if compressed block contains less data. Note 2 : this function features 2 parameters, targetOutputSize and dstCapacity, and expects targetOutputSize <= dstCapacity. It effectively stops decoding on reaching targetOutputSize, so dstCapacity is kind of redundant. This is because in a previous version of this function, decoding operation would not "break" a sequence in the middle. As a consequence, there was no guarantee that decoding would stop at exactly targetOutputSize, it could write more bytes, though only up to dstCapacity. Some "margin" used to be required for this operation to work properly. This is no longer necessary. The function nonetheless keeps its signature, in an effort to not break API.


Streaming Compression Functions



LZ4_stream_t* LZ4_createStream(void);
int           LZ4_freeStream (LZ4_stream_t* streamPtr);

LZ4_createStream() will allocate and initialize an `LZ4_stream_t` structure. LZ4_freeStream() releases its memory.


void LZ4_resetStream (LZ4_stream_t* streamPtr);

An LZ4_stream_t structure can be allocated once and re-used multiple times. Use this function to start compressing a new stream.


int LZ4_loadDict (LZ4_stream_t* streamPtr, const char* dictionary, int dictSize);

Use this function to load a static dictionary into LZ4_stream_t. Any previous data will be forgotten, only 'dictionary' will remain in memory. Loading a size of 0 is allowed, and is the same as reset. @return : dictionary size, in bytes (necessarily <= 64 KB)


int LZ4_compress_fast_continue (LZ4_stream_t* streamPtr, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);

Compress 'src' content using data from previously compressed blocks, for better compression ratio. 'dst' buffer must be already allocated. If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster. @return : size of compressed block or 0 if there is an error (typically, cannot fit into 'dst'). Note 1 : Each invocation to LZ4_compress_fast_continue() generates a new block. Each block has precise boundaries. It's not possible to append blocks together and expect a single invocation of LZ4_decompress_*() to decompress them together. Each block must be decompressed separately, calling LZ4_decompress_*() with associated metadata. Note 2 : The previous 64KB of source data is __assumed__ to remain present, unmodified, at same address in memory! Note 3 : When input is structured as a double-buffer, each buffer can have any size, including < 64 KB. Make sure that buffers are separated, by at least one byte. This construction ensures that each block only depends on previous block. Note 4 : If input buffer is a ring-buffer, it can have any size, including < 64 KB. Note 5 : After an error, the stream status is invalid, it can only be reset or freed.


int LZ4_saveDict (LZ4_stream_t* streamPtr, char* safeBuffer, int maxDictSize);

If last 64KB data cannot be guaranteed to remain available at its current memory location, save it into a safer place (char* safeBuffer). This is schematically equivalent to a memcpy() followed by LZ4_loadDict(), but is much faster, because LZ4_saveDict() doesn't need to rebuild tables. @return : saved dictionary size in bytes (necessarily <= maxDictSize), or 0 if error.


Streaming Decompression Functions

  Bufferless synchronous API

LZ4_streamDecode_t* LZ4_createStreamDecode(void);
int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);

creation / destruction of streaming decompression tracking context. A tracking context can be re-used multiple times.


int LZ4_setStreamDecode (LZ4_streamDecode_t* LZ4_streamDecode, const char* dictionary, int dictSize);

An LZ4_streamDecode_t context can be allocated once and re-used multiple times. Use this function to start decompression of a new stream of blocks. A dictionary can optionally be set. Use NULL or size 0 for a reset order. Dictionary is presumed stable : it must remain accessible and unmodified during next decompression. @return : 1 if OK, 0 if error


int LZ4_decoderRingBufferSize(int maxBlockSize);
#define LZ4_DECODER_RING_BUFFER_SIZE(mbs) (65536 + 14 + (mbs))  /* for static allocation; mbs presumed valid */

Note : in a ring buffer scenario (optional), blocks are presumed decompressed next to each other up to the moment there is not enough remaining space for next block (remainingSize < maxBlockSize), at which stage it resumes from beginning of ring buffer. When setting such a ring buffer for streaming decompression, provides the minimum size of this ring buffer to be compatible with any source respecting maxBlockSize condition. @return : minimum ring buffer size, or 0 if there is an error (invalid maxBlockSize).


int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity);
int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize);

These decoding functions allow decompression of consecutive blocks in "streaming" mode. A block is an unsplittable entity, it must be presented entirely to a decompression function. Decompression functions only accepts one block at a time. The last 64KB of previously decoded data *must* remain available and unmodified at the memory position where they were decoded. If less than 64KB of data has been decoded, all the data must be present. Special : if decompression side sets a ring buffer, it must respect one of the following conditions : - Decompression buffer size is _at least_ LZ4_decoderRingBufferSize(maxBlockSize). maxBlockSize is the maximum size of any single block. It can have any value > 16 bytes. In which case, encoding and decoding buffers do not need to be synchronized. Actually, data can be produced by any source compliant with LZ4 format specification, and respecting maxBlockSize. - Synchronized mode : Decompression buffer size is _exactly_ the same as compression buffer size, and follows exactly same update rule (block boundaries at same positions), and decoding function is provided with exact decompressed size of each block (exception for last block of the stream), _then_ decoding & encoding ring buffer can have any size, including small ones ( < 64 KB). - Decompression buffer is larger than encoding buffer, by a minimum of maxBlockSize more bytes. In which case, encoding and decoding buffers do not need to be synchronized, and encoding ring buffer can have any size, including small ones ( < 64 KB). Whenever these conditions are not possible, save the last 64KB of decoded data into a safe buffer where it can't be modified during decompression, then indicate where this data is saved using LZ4_setStreamDecode(), before decompressing next block.


int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapcity, const char* dictStart, int dictSize);
int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, const char* dictStart, int dictSize);

These decoding functions work the same as a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue() They are stand-alone, and don't need an LZ4_streamDecode_t structure. Dictionary is presumed stable : it must remain accessible and unmodified during next decompression.


Unstable declarations

 Declarations in this section should be considered unstable.
 Use at your own peril, etc., etc.
 They may be removed in the future.
 Their signatures may change.

void LZ4_resetStream_fast (LZ4_stream_t* streamPtr);

Use this, like LZ4_resetStream(), to prepare a context for a new chain of calls to a streaming API (e.g., LZ4_compress_fast_continue()). Note: Using this in advance of a non- streaming-compression function is redundant, and potentially bad for performance, since they all perform their own custom reset internally. Differences from LZ4_resetStream(): When an LZ4_stream_t is known to be in a internally coherent state, it can often be prepared for a new compression with almost no work, only sometimes falling back to the full, expensive reset that is always required when the stream is in an indeterminate state (i.e., the reset performed by LZ4_resetStream()). LZ4_streams are guaranteed to be in a valid state when: - returned from LZ4_createStream() - reset by LZ4_resetStream() - memset(stream, 0, sizeof(LZ4_stream_t)), though this is discouraged - the stream was in a valid state and was reset by LZ4_resetStream_fast() - the stream was in a valid state and was then used in any compression call that returned success - the stream was in an indeterminate state and was used in a compression call that fully reset the state (e.g., LZ4_compress_fast_extState()) and that returned success When a stream isn't known to be in a valid state, it is not safe to pass to any fastReset or streaming function. It must first be cleansed by the full LZ4_resetStream().


int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);

A variant of LZ4_compress_fast_extState(). Using this variant avoids an expensive initialization step. It is only safe to call if the state buffer is known to be correctly initialized already (see above comment on LZ4_resetStream_fast() for a definition of "correctly initialized"). From a high level, the difference is that this function initializes the provided state with a call to something like LZ4_resetStream_fast() while LZ4_compress_fast_extState() starts with a call to LZ4_resetStream().


void LZ4_attach_dictionary(LZ4_stream_t *working_stream, const LZ4_stream_t *dictionary_stream);

This is an experimental API that allows for the efficient use of a static dictionary many times. Rather than re-loading the dictionary buffer into a working context before each compression, or copying a pre-loaded dictionary's LZ4_stream_t into a working LZ4_stream_t, this function introduces a no-copy setup mechanism, in which the working stream references the dictionary stream in-place. Several assumptions are made about the state of the dictionary stream. Currently, only streams which have been prepared by LZ4_loadDict() should be expected to work. Alternatively, the provided dictionary stream pointer may be NULL, in which case any existing dictionary stream is unset. If a dictionary is provided, it replaces any pre-existing stream history. The dictionary contents are the only history that can be referenced and logically immediately precede the data compressed in the first subsequent compression call. The dictionary will only remain attached to the working stream through the first compression call, at the end of which it is cleared. The dictionary stream (and source buffer) must remain in-place / accessible / unchanged through the completion of the first compression call on the stream.


Private definitions

 Do not use these definitions.
 They are exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
 Using these definitions will expose code to API and/or ABI break in future versions of the library.

typedef struct {
    const uint8_t* externalDict;
    size_t extDictSize;
    const uint8_t* prefixEnd;
    size_t prefixSize;
} LZ4_streamDecode_t_internal;

typedef struct {
    const unsigned char* externalDict;
    size_t extDictSize;
    const unsigned char* prefixEnd;
    size_t prefixSize;
} LZ4_streamDecode_t_internal;

#define LZ4_STREAMSIZE_U64 ((1 << (LZ4_MEMORY_USAGE-3)) + 4)
#define LZ4_STREAMSIZE     (LZ4_STREAMSIZE_U64 * sizeof(unsigned long long))
union LZ4_stream_u {
    unsigned long long table[LZ4_STREAMSIZE_U64];
    LZ4_stream_t_internal internal_donotuse;
} ;  /* previously typedef'd to LZ4_stream_t */

information structure to track an LZ4 stream. init this structure before first use. note : only use in association with static linking ! this definition is not API/ABI safe, it may change in a future version !


#define LZ4_STREAMDECODESIZE_U64  4
#define LZ4_STREAMDECODESIZE     (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
union LZ4_streamDecode_u {
    unsigned long long table[LZ4_STREAMDECODESIZE_U64];
    LZ4_streamDecode_t_internal internal_donotuse;
} ;   /* previously typedef'd to LZ4_streamDecode_t */

information structure to track an LZ4 stream during decompression. init this structure using LZ4_setStreamDecode (or memset()) before first use note : only use in association with static linking ! this definition is not API/ABI safe, and may change in a future version !


Obsolete Functions



#ifdef LZ4_DISABLE_DEPRECATE_WARNINGS
#  define LZ4_DEPRECATED(message)   /* disable deprecation warnings */
#else
#  define LZ4_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#  if defined (__cplusplus) && (__cplusplus >= 201402) /* C++14 or greater */
#    define LZ4_DEPRECATED(message) [[deprecated(message)]]
#  elif (LZ4_GCC_VERSION >= 405) || defined(__clang__)
#    define LZ4_DEPRECATED(message) __attribute__((deprecated(message)))
#  elif (LZ4_GCC_VERSION >= 301)
#    define LZ4_DEPRECATED(message) __attribute__((deprecated))
#  elif defined(_MSC_VER)
#    define LZ4_DEPRECATED(message) __declspec(deprecated(message))
#  else
#    pragma message("WARNING: You need to implement LZ4_DEPRECATED for this compiler")
#    define LZ4_DEPRECATED(message)
#  endif
#endif /* LZ4_DISABLE_DEPRECATE_WARNINGS */

Should deprecation warnings be a problem, it is generally possible to disable them, typically with -Wno-deprecated-declarations for gcc or _CRT_SECURE_NO_WARNINGS in Visual. Otherwise, it's also possible to define LZ4_DISABLE_DEPRECATE_WARNINGS