4 files changed, 337 insertions, 175 deletions
diff --git a/doc/lz4_Block_format.md b/doc/lz4_Block_format.md
index 4344e9b..9e80227 100644
--- a/doc/lz4_Block_format.md
+++ b/doc/lz4_Block_format.md
@@ -1,24 +1,22 @@
 LZ4 Block Format Description
 ============================
-Last revised: 2019-03-30.
+Last revised: 2022-07-31.
 Author : Yann Collet
 
 
-This specification is intended for developers
-willing to produce LZ4-compatible compressed data blocks
-using any programming language.
+This specification is intended for developers willing to
+produce or read LZ4 compressed data blocks
+using any programming language of their choice.
 
-LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
+LZ4 is an LZ77-type compressor with a fixed byte-oriented encoding format.
 There is no entropy encoder back-end nor framing layer.
 The latter is assumed to be handled by other parts of the system
 (see [LZ4 Frame format]).
 This design is assumed to favor simplicity and speed.
-It helps later on for optimizations, compactness, and features.
 
-This document describes only the block format,
+This document describes only the Block Format,
 not how the compressor nor decompressor actually work.
-The correctness of the decompressor should not depend
-on implementation details of the compressor, and vice versa.
+For more details on such topics, see later section "Implementation Notes".
 
 [LZ4 Frame format]: lz4_Frame_format.md
 
@@ -28,7 +26,7 @@ Compressed block format
 -----------------------
 An LZ4 compressed block is composed of sequences.
 A sequence is a suite of literals (not-compressed bytes),
-followed by a match copy.
+followed by a match copy operation.
 
 Each sequence starts with a `token`.
 The `token` is a one byte value, separated into two 4-bits fields.
@@ -38,13 +36,20 @@ Therefore each field ranges from 0 to 15.
 The first field uses the 4 high-bits of the token.
 It provides the length of literals to follow.
 
-If the field value is 0, then there is no literal.
-If it is 15, then we need to add some more bytes to indicate the full length.
-Each additional byte then represent a value from 0 to 255,
+If the field value is smaller than 15,
+then it represents the total number of literals present in the sequence,
+including 0, in which case there is no literal.
+
+The value 15 is a special case: more bytes are required to indicate the full length.
+Each additional byte then represents a value from 0 to 255,
 which is added to the previous value to produce a total length.
-When the byte value is 255, another byte is output.
-There can be any number of bytes following `token`. There is no "size limit".
-(Side note : this is why a not-compressible input block is expanded by 0.4%).
+When the byte value is 255, another byte must be read and added, and so on.
+There can be any number of bytes of value `255` following `token`.
+The Block Format does not define any "size limit",
+though real implementations may feature some practical limits
+(see more details in later chapter "Implementation Notes").
+
+Note : this format explains why a non-compressible input block is expanded by 0.4%.
 
 Example 1 : A literal length of 48 will be represented as :
 
@@ -55,7 +60,7 @@ Example 2 : A literal length of 280 will be represented as :
 
   - 15  : value for the 4-bits High field
   - 255 : following byte is maxed, since 280-15 >= 255
-  - 10  : (=280 - 15 - 255) ) remaining length to reach 280
+  - 10  : (=280 - 15 - 255) remaining length to reach 280
 
 Example 3 : A literal length of 15 will be represented as :
 
@@ -63,94 +68,177 @@ Example 3 : A literal length of 15 will be represented as :
   - 0  : (=15-15) yes, the zero must be output
 
 Following `token` and optional length bytes, are the literals themselves.
-They are exactly as numerous as previously decoded (length of literals).
-It's possible that there are zero literal.
+They are exactly as numerous as the literal length just decoded.
+Reminder: it's possible that there are zero literals.
 
 
 Following the literals is the match copy operation.
 
-It starts by the `offset`.
+It starts with the `offset` value.
 This is a 2 bytes value, in little endian format
 (the 1st byte is the "low" byte, the 2nd one is the "high" byte).
 
-The `offset` represents the position of the match to be copied from.
-1 means "current position - 1 byte".
-The maximum `offset` value is 65535, 65536 cannot be coded.
-Note that 0 is an invalid value, not used.
+The `offset` represents the position of the match to be copied from the past.
+For example, 1 means "current position - 1 byte".
+The maximum `offset` value is 65535. 65536 and beyond cannot be coded.
+Note that 0 is an invalid `offset` value.
+The presence of a 0 `offset` value denotes an invalid (corrupted) block.
 
-Then we need to extract the `matchlength`.
-For this, we use the second token field, the low 4-bits.
-Value, obviously, ranges from 0 to 15.
-However here, 0 means that the copy operation will be minimal.
+Then the `matchlength` can be extracted.
+For this, we use the second `token` field, the low 4-bits.
+Such a value, obviously, ranges from 0 to 15.
+However here, 0 means that the copy operation is minimal.
 The minimum length of a match, called `minmatch`, is 4.
-As a consequence, a 0 value means 4 bytes, and a value of 15 means 19+ bytes.
-Similar to literal length, on reaching the highest possible value (15),
-we output additional bytes, one at a time, with values ranging from 0 to 255.
+As a consequence, a 0 value means 4 bytes.
+Similarly to literal length, any value smaller than 15 represents a length,
+to which 4 (`minmatch`) must be added, thus ranging from 4 to 18.
+A value of 15 is special, meaning 19+ bytes:
+additional bytes must then be read, one at a time,
+with each byte value ranging from 0 to 255.
 They are added to total to provide the final match length.
 A 255 value means there is another byte to read and add.
-There is no limit to the number of optional bytes that can be output this way.
-(This points towards a maximum achievable compression ratio of about 250).
+There is no limit to the number of optional `255` bytes that can be present,
+and therefore no limit to representable match length,
+though real-life implementations are likely going to enforce limits for practical reasons (see more details in "Implementation Notes" section below).
+
+Note: this format has a maximum achievable compression ratio of about 250.
 
 Decoding the `matchlength` reaches the end of current sequence.
-Next byte will be the start of another sequence.
-But before moving to next sequence,
-it's time to use the decoded match position and length.
-The decoder copies `matchlength` bytes from match position to current position.
-
-In some cases, `matchlength` is larger than `offset`.
-Therefore, `match_pos + matchlength > current_pos`,
-which means that later bytes to copy are not yet decoded.
-This is called an "overlap match", and must be handled with special care.
-A common case is an offset of 1,
-meaning the last byte is repeated `matchlength` times.
+Next byte will be the start of another sequence, and therefore a new `token`.
 
 
-End of block restrictions
------------------------
-There are specific rules required to terminate a block.
+End of block conditions
+-------------------------
+There are specific restrictions required to terminate an LZ4 block.
 
 1. The last sequence contains only literals.
-   The block ends right after them.
+   The block ends right after the literals (no `offset` field).
 2. The last 5 bytes of input are always literals.
    Therefore, the last sequence contains at least 5 bytes.
    - Special : if input is smaller than 5 bytes,
      there is only one sequence, it contains the whole input as literals.
-     Empty input can be represented with a zero byte,
+     Even empty input can be represented, using a zero byte,
      interpreted as a final token without literal and without a match.
 3. The last match must start at least 12 bytes before the end of block.
-   The last match is part of the penultimate sequence.
-   It is followed by the last sequence, which contains only literals.
+   The last match is part of the _penultimate_ sequence.
+   It is followed by the last sequence, which contains _only_ literals.
    - Note that, as a consequence,
-     an independent block < 13 bytes cannot be compressed,
-     because the match must copy "something",
-     so it needs at least one prior byte.
-   - When a block can reference data from another block,
-     it can start immediately with a match and no literal,
-     so a block of 12 bytes can be compressed.
+     blocks < 12 bytes cannot be compressed.
+     And as an extension, _independent_ blocks < 13 bytes cannot be compressed,
+     because they must start with at least one literal,
+     that the match can then copy afterwards.
 
 When a block does not respect these end conditions,
 a conformant decoder is allowed to reject the block as incorrect.
 
-These rules are in place to ensure that a conformant decoder
-can be designed for speed, issuing speculatively instructions,
-while never reading nor writing beyond provided I/O buffers.
+These rules are in place to ensure compatibility with
+a wide range of historical decoders
+which rely on these conditions for their speed-oriented design.
 
-
-Additional notes
+Implementation notes
 -----------------------
-If the decoder will decompress data from an external source,
-it is recommended to ensure that the decoder will not be vulnerable to
-buffer overflow manipulations.
+The LZ4 Block Format only defines the compressed format;
+it does not tell how to create a decoder or an encoder,
+whose design is left to the implementer.
+
+However, thanks to experience, there are a number of typical topics that
+most implementations will have to consider.
+This section tries to provide a few guidelines.
+
+#### Metadata
+
+An LZ4-compressed Block requires additional metadata for proper decoding.
+Typically, a decoder will require the compressed block's size,
+and an upper bound of decompressed size.
+Other variants exist, such as knowing the decompressed size,
+and having an upper bound of the input size.
+The Block Format does not specify how to transmit such information,
+which is considered an out-of-band information channel.
+That's because in many cases, the information is present in the environment.
+For example, databases must store the size of their compressed block for indexing,
+and know that their decompressed block can't be larger than a certain threshold.
+
+If you need a format which is "self-contained",
+and also transports the necessary metadata for proper decoding on any platform,
+consider employing the [LZ4 Frame format] instead.
+
+#### Large lengths
+
+While the Block Format does not define any maximum value for length fields,
+in practice, most implementations will feature some form of limit,
+since it's expected for such values to be stored into registers of fixed bit width.
+
+If length fields use 64-bit registers,
+then it can be assumed that there is no practical limit,
+as it would require a single continuous block of multiple petabytes to reach it,
+which is unreasonable by today's standard.
+
+If length fields use 32-bit registers, then they can overflow,
+but this requires a compressed block of size > 16 MB.
+Therefore, implementations that do not deal with compressed blocks > 16 MB are safe.
+However, if such a case is allowed,
+then it's recommended to check that no large length overflows the register.
+
+If length fields use 16-bit registers,
+then it's definitely possible to overflow such register,
+with less than 300 bytes of compressed data.
+
+A conformant decoder should be able to detect length overflows when it's possible,
+and simply error out when that happens.
+The input block might not be invalid,
+it's just not decodable by the local decoder implementation.
+
+Note that, in order to be compatible with the larger LZ4 ecosystem,
+it's recommended to be able to read and represent lengths of up to 4 MB,
+and to accept blocks of size up to 4 MB.
+Such limits are compatible with 32-bit length registers,
+and prevent overflow of 32-bit registers.
+
+#### Safe decoding
+
+If a decoder receives compressed data from any external source,
+it is recommended to ensure that the decoder is resilient to corrupted input,
+and made safe from buffer overflow manipulations.
 Always ensure that read and write operations
 remain within the limits of provided buffers.
-Test the decoder with fuzzers
-to ensure it's resilient to improbable combinations.
 
-The format makes no assumption nor limits to the way the compressor
+Of particular importance, ensure that the number of bytes instructed to copy
+overflows neither the input nor the output buffer.
+Ensure also, when reading an offset value, that the resulting position to copy
+does not reach beyond the beginning of the buffer.
+Such a situation can happen during the first 64 KB of decoded data.
+
+For more safety, test the decoder with fuzzers
+to ensure it's resilient to improbable sequences of conditions.
+Combine them with sanitizers, in order to catch overflows (asan)
+or initialization issues (msan).
+
+Pay attention to the offset 0 scenario, which is invalid,
+and therefore must not be blindly decoded:
+a naive implementation could preserve destination buffer content,
+which could then result in information disclosure
+if such buffer was uninitialized and still containing private data.
+For reference, in such a scenario, the reference LZ4 decoder
+clears the match segment with `0` bytes,
+though other solutions are certainly possible.
+
+Finally, pay attention to the "overlap match" scenario,
+when `matchlength` is larger than `offset`.
+In this case, since `match_pos + matchlength > current_pos`,
+some of the later bytes to copy do not exist yet,
+and will be generated during the early stage of match copy operation.
+Such a scenario must be handled with special care.
+A common case is an offset of 1,
+meaning the last byte is repeated `matchlength` times.
+
+#### Compression techniques
+
+The core of an LZ4 compressor is to detect duplicated data across the past 64 KB.
+The format makes no assumption about, nor sets any limit on, the way a compressor
 searches and selects matches within the source data block.
-Multiple techniques can be considered,
-featuring distinct time / performance trade offs.
-As long as the format is respected,
-the result will be compatible and decodable by any compliant decoder.
-An upper compression limit can be reached,
-using a technique called "full optimal parsing", at high cpu cost.
+For example, an upper compression limit can be reached,
+using a technique called "full optimal parsing", at high cpu and memory cost.
+But multiple other techniques can be considered,
+featuring distinct time / performance trade-offs.
+As long as the specified format is respected,
+the result will be compatible with and decodable by any compliant decoder.
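
The decoding rules and the "Safe decoding" checks described in the Block Format changes above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the reference LZ4 decoder: the names `lz4_block_decode_sketch` and `read_extra_length` are hypothetical, the block is assumed to be independent (no dictionary), lengths are held in `size_t`, and the end-of-block restrictions on the last 5 and 12 bytes are not enforced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads the optional length bytes that follow a
 * 4-bit field of value 15, adding each one to *len.
 * Returns 0 on success, -1 on truncated input or register overflow. */
static int read_extra_length(const uint8_t **ip, const uint8_t *iend, size_t *len)
{
    uint8_t b;
    do {
        if (*ip >= iend) return -1;           /* truncated input */
        b = *(*ip)++;
        if (*len > SIZE_MAX - b) return -1;   /* length does not fit */
        *len += b;
    } while (b == 255);                       /* 255 means "read one more" */
    return 0;
}

/* Hypothetical name: decodes one independent LZ4 block.
 * Returns the decoded size, or -1 if the block is rejected. */
static long lz4_block_decode_sketch(const uint8_t *src, size_t srcSize,
                                    uint8_t *dst, size_t dstCapacity)
{
    const uint8_t *ip = src, *iend = src + srcSize;
    uint8_t *op = dst, *oend = dst + dstCapacity;
    assert(src != NULL && dst != NULL);

    while (ip < iend) {
        unsigned token = *ip++;

        /* High 4 bits: literal length, then the literals themselves. */
        size_t litLen = token >> 4;
        if (litLen == 15 && read_extra_length(&ip, iend, &litLen)) return -1;
        if (litLen > (size_t)(iend - ip)) return -1;   /* input overflow  */
        if (litLen > (size_t)(oend - op)) return -1;   /* output overflow */
        memcpy(op, ip, litLen);
        ip += litLen;
        op += litLen;

        /* The last sequence stops right after its literals. */
        if (ip == iend) return (long)(op - dst);

        /* 2-byte little-endian offset; 0 is invalid, and the match
         * must not start before the beginning of the output buffer. */
        if (iend - ip < 2) return -1;
        size_t offset = (size_t)ip[0] | ((size_t)ip[1] << 8);
        ip += 2;
        if (offset == 0 || offset > (size_t)(op - dst)) return -1;

        /* Low 4 bits: match length, to which minmatch (4) is added. */
        size_t matchLen = token & 15;
        if (matchLen == 15 && read_extra_length(&ip, iend, &matchLen)) return -1;
        if (matchLen > SIZE_MAX - 4) return -1;
        matchLen += 4;
        if (matchLen > (size_t)(oend - op)) return -1;

        /* Forward byte-by-byte copy: an overlap match (matchLen > offset)
         * reads bytes produced earlier in this same loop. */
        const uint8_t *match = op - offset;
        for (size_t i = 0; i < matchLen; i++) *op++ = *match++;
    }
    return -1;   /* input ended without a final literals-only sequence */
}
```

As a usage example, the 11-byte block `1F 61 01 00 00 50 62 63 64 65 66` decodes to 25 bytes: one literal `a`, an overlap match of 19 bytes at offset 1 that repeats it, then the five final literals `bcdef`. The byte-by-byte copy is chosen for clarity; faster decoders copy in wider chunks and need extra care for the overlap case.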
diff --git a/doc/lz4_Frame_format.md b/doc/lz4_Frame_format.md
index 7e08841..97a2cbe 100644
--- a/doc/lz4_Frame_format.md
+++ b/doc/lz4_Frame_format.md
@@ -3,11 +3,11 @@ LZ4 Frame Format Description
 
 ### Notices
 
-Copyright (c) 2013-2015 Yann Collet
+Copyright (c) 2013-2020 Yann Collet
 
 Permission is granted to copy and distribute this document
-for any  purpose and without charge,
-including translations into other  languages
+for any purpose and without charge,
+including translations into other languages
 and incorporation into compilations,
 provided that the copyright notice and this notice are preserved,
 and that any substantive changes or deletions from the original
@@ -47,7 +47,7 @@ at the level of bits and other primitive data representations.
 Unless otherwise indicated below,
 a compliant compressor must produce data sets
 that conform to the specifications presented here.
-It doesn’t need to support all options though.
+It doesn't need to support all options though.
 
 A compliant decompressor must be able to decompress
 at least one working set of parameters
@@ -244,8 +244,7 @@ One-byte checksum of combined descriptor fields, including optional ones.
 The value is the second byte of `xxh32()` : ` (xxh32()>>8) & 0xFF `
 using zero as a seed, and the full Frame Descriptor as an input
 (including optional fields when they are present).
-A wrong checksum indicates an error in the descriptor.
-Header checksum is informational and can be skipped.
+A wrong checksum indicates that the descriptor is erroneous.
 
 
 Data Blocks
@@ -385,7 +384,7 @@ __EndMark__
 
 End of legacy frame is implicit only.
 It must be followed by a standard EOF (End Of File) signal,
-wether it is a file or a stream.
+whether it is a file or a stream.
 
 Alternatively, if the frame is followed by a valid Frame Magic Number,
 it is considered completed.
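
The header checksum rule touched by this file's diff, `(xxh32()>>8) & 0xFF` with a zero seed over the Frame Descriptor, can be illustrated with the C sketch below. It is an assumption-laden illustration rather than the library's code: `xxh32_short` and `lz4f_header_checksum` are hypothetical names, and the hash routine only implements the XXH32 path for inputs shorter than 16 bytes, which is sufficient here because a Frame Descriptor without its checksum byte holds at most 14 bytes (FLG, BD, optional 8-byte content size, optional 4-byte dictionary ID).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P32_1 0x9E3779B1U
#define P32_2 0x85EBCA77U
#define P32_3 0xC2B2AE3DU
#define P32_4 0x27D4EB2FU
#define P32_5 0x165667B1U

static uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Hypothetical helper: XXH32 limited to inputs shorter than 16 bytes
 * (the 4-lane accumulator stage used for longer inputs is omitted). */
static uint32_t xxh32_short(const uint8_t *p, size_t len, uint32_t seed)
{
    assert(p != NULL && len < 16);
    const uint8_t *end = p + len;
    uint32_t h = seed + P32_5 + (uint32_t)len;

    while (end - p >= 4) {                    /* 4-byte little-endian words */
        uint32_t w = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        h += w * P32_3;
        h = rotl32(h, 17) * P32_4;
        p += 4;
    }
    while (p < end) {                         /* remaining single bytes */
        h += (uint32_t)(*p++) * P32_5;
        h = rotl32(h, 11) * P32_1;
    }
    h ^= h >> 15; h *= P32_2;                 /* final avalanche */
    h ^= h >> 13; h *= P32_3;
    h ^= h >> 16;
    return h;
}

/* Hypothetical name: the HC byte is the second byte of the hash of the
 * descriptor (magic number and HC byte excluded), using seed 0. */
static uint8_t lz4f_header_checksum(const uint8_t *descriptor, size_t size)
{
    return (uint8_t)((xxh32_short(descriptor, size, 0) >> 8) & 0xFF);
}
```

For a descriptor made of FLG `0x64` and BD `0x40` only, the hash is `0x95C0A77C`, so the checksum byte is `0xA7`. A decoder that enforces the rule compares this byte with the one stored after the descriptor and rejects the frame on mismatch.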
diff --git a/doc/lz4_manual.html b/doc/lz4_manual.html
index 47fe18d..6fafb21 100644
--- a/doc/lz4_manual.html
+++ b/doc/lz4_manual.html
@@ -1,10 +1,10 @@
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-<title>1.9.3 Manual</title>
+<title>1.9.4 Manual</title>
 </head>
 <body>
-<h1>1.9.3 Manual</h1>
+<h1>1.9.4 Manual</h1>
 <hr>
 <a name="Contents"></a><h2>Contents</h2>
 <ol>
@@ -48,20 +48,49 @@
   The `lz4` CLI can only manage frames.
 <BR></pre>
 
+<pre><b>#if defined(LZ4_FREESTANDING) && (LZ4_FREESTANDING == 1)
+#  define LZ4_HEAPMODE 0
+#  define LZ4HC_HEAPMODE 0
+#  define LZ4_STATIC_LINKING_ONLY_DISABLE_MEMORY_ALLOCATION 1
+#  if !defined(LZ4_memcpy)
+#    error "LZ4_FREESTANDING requires macro 'LZ4_memcpy'."
+#  endif
+#  if !defined(LZ4_memset)
+#    error "LZ4_FREESTANDING requires macro 'LZ4_memset'."
+#  endif
+#  if !defined(LZ4_memmove)
+#    error "LZ4_FREESTANDING requires macro 'LZ4_memmove'."
+#  endif
+#elif ! defined(LZ4_FREESTANDING)
+#  define LZ4_FREESTANDING 0
+#endif
+</b><p>  When this macro is set to 1, it enables "freestanding mode", which is
+  suitable for typical freestanding environments that don't provide
+  the standard C library.
+
+  - LZ4_FREESTANDING is a compile-time switch.
+  - It requires the following macros to be defined:
+    LZ4_memcpy, LZ4_memmove, LZ4_memset.
+  - It only enables LZ4/HC functions which don't use heap.
+    None of the LZ4F_* functions are supported.
+  - See tests/freestanding.c to check its basic setup.
+ 
+</p></pre><BR>
+
 <a name="Chapter2"></a><h2>Version</h2><pre></pre>
 
-<pre><b>int LZ4_versionNumber (void);  </b>/**< library version number; useful to check dll version */<b>
+<pre><b>int LZ4_versionNumber (void);  </b>/**< library version number; useful to check dll version; requires v1.3.0+ */<b>
 </b></pre><BR>
-<pre><b>const char* LZ4_versionString (void);   </b>/**< library version string; useful to check dll version */<b>
+<pre><b>const char* LZ4_versionString (void);   </b>/**< library version string; useful to check dll version; requires v1.7.5+ */<b>
 </b></pre><BR>
 <a name="Chapter3"></a><h2>Tuning parameter</h2><pre></pre>
 
 <pre><b>#ifndef LZ4_MEMORY_USAGE
-# define LZ4_MEMORY_USAGE 14
+# define LZ4_MEMORY_USAGE LZ4_MEMORY_USAGE_DEFAULT
 #endif
-</b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
- Increasing memory usage improves compression ratio.
- Reduced memory usage may improve speed, thanks to better cache locality.
+</b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB)
+ Increasing memory usage improves compression ratio, at the cost of speed.
+ Reduced memory usage may improve speed at the cost of ratio, thanks to better cache locality.
  Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache
  
 </p></pre><BR>
@@ -267,8 +296,10 @@ int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int src
 <a name="Chapter7"></a><h2>Streaming Decompression Functions</h2><pre>  Bufferless synchronous API
 <BR></pre>
 
-<pre><b>LZ4_streamDecode_t* LZ4_createStreamDecode(void);
+<pre><b>#if !defined(LZ4_STATIC_LINKING_ONLY_DISABLE_MEMORY_ALLOCATION)
+LZ4_streamDecode_t* LZ4_createStreamDecode(void);
 int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
+#endif </b>/* !defined(LZ4_STATIC_LINKING_ONLY_DISABLE_MEMORY_ALLOCATION) */<b>
 </b><p>  creation / destruction of streaming decompression tracking context.
   A tracking context can be re-used multiple times.
  
@@ -297,7 +328,10 @@ int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
  
 </p></pre><BR>
 
-<pre><b>int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity);
+<pre><b>int
+LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode,
+                        const char* src, char* dst,
+                        int srcSize, int dstCapacity);
 </b><p>  These decoding functions allow decompression of consecutive blocks in "streaming" mode.
   A block is an unsplittable entity, it must be presented entirely to a decompression function.
   Decompression functions only accepts one block at a time.
@@ -323,7 +357,10 @@ int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
   then indicate where this data is saved using LZ4_setStreamDecode(), before decompressing next block.
 </p></pre><BR>
 
-<pre><b>int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapcity, const char* dictStart, int dictSize);
+<pre><b>int
+LZ4_decompress_safe_usingDict(const char* src, char* dst,
+                              int srcSize, int dstCapacity,
+                              const char* dictStart, int dictSize);
 </b><p>  These decoding functions work the same as
   a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue()
   They are stand-alone, and don't need an LZ4_streamDecode_t structure.
@@ -363,7 +400,9 @@ int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
  
 </p></pre><BR>
 
-<pre><b>LZ4LIB_STATIC_API void LZ4_attach_dictionary(LZ4_stream_t* workingStream, const LZ4_stream_t* dictionaryStream);
+<pre><b>LZ4LIB_STATIC_API void
+LZ4_attach_dictionary(LZ4_stream_t* workingStream,
+                const LZ4_stream_t* dictionaryStream);
 </b><p>  This is an experimental API that allows
   efficient use of a static dictionary many times.
 
@@ -393,7 +432,7 @@ int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
 
 <pre><b></b><p>
  It's possible to have input and output sharing the same buffer,
- for highly contrained memory environments.
+ for highly constrained memory environments.
  In both cases, it requires input to lay at the end of the buffer,
  and decompression to start at beginning of the buffer.
  Buffer size must feature some margin, hence be larger than final size.
@@ -452,28 +491,9 @@ int                 LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
  Accessing members will expose user code to API and/or ABI break in future versions of the library.
 <BR></pre>
 
-<pre><b>typedef struct {
-    const LZ4_byte* externalDict;
-    size_t extDictSize;
-    const LZ4_byte* prefixEnd;
-    size_t prefixSize;
-} LZ4_streamDecode_t_internal;
-</b></pre><BR>
-<pre><b>#define LZ4_STREAMSIZE       16416  </b>/* static size, for inter-version compatibility */<b>
-#define LZ4_STREAMSIZE_VOIDP (LZ4_STREAMSIZE / sizeof(void*))
-union LZ4_stream_u {
-    void* table[LZ4_STREAMSIZE_VOIDP];
-    LZ4_stream_t_internal internal_donotuse;
-}; </b>/* previously typedef'd to LZ4_stream_t */<b>
-</b><p>  Do not use below internal definitions directly !
-  Declare or allocate an LZ4_stream_t instead.
-  LZ4_stream_t can also be created using LZ4_createStream(), which is recommended.
-  The structure definition can be convenient for static allocation
-  (on stack, or as part of larger structure).
-  Init this structure with LZ4_initStream() before first use.
-  note : only use this definition in association with static linking !
-  this definition is not API/ABI safe, and may change in future versions.
- 
+<pre><b></b><p>  Never use the internal definitions below directly!
+  These definitions are not API/ABI safe, and may change in future versions.
+  If you need static allocation, declare or allocate an LZ4_stream_t object.
 </p></pre><BR>
 
 <pre><b>LZ4_stream_t* LZ4_initStream (void* buffer, size_t size);
@@ -489,21 +509,17 @@ union LZ4_stream_u {
          In which case, the function will @return NULL.
   Note2: An LZ4_stream_t structure guarantees correct alignment and size.
   Note3: Before v1.9.0, use LZ4_resetStream() instead
- 
 </p></pre><BR>
 
-<pre><b>#define LZ4_STREAMDECODESIZE_U64 (4 + ((sizeof(void*)==16) ? 2 : 0) </b>/*AS-400*/ )<b>
-#define LZ4_STREAMDECODESIZE     (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
-union LZ4_streamDecode_u {
-    unsigned long long table[LZ4_STREAMDECODESIZE_U64];
-    LZ4_streamDecode_t_internal internal_donotuse;
-} ;   </b>/* previously typedef'd to LZ4_streamDecode_t */<b>
-</b><p>  information structure to track an LZ4 stream during decompression.
-  init this structure  using LZ4_setStreamDecode() before first use.
-  note : only use in association with static linking !
-         this definition is not API/ABI safe,
-         and may change in a future version !
- 
+<pre><b>typedef struct {
+    const LZ4_byte* externalDict;
+    const LZ4_byte* prefixEnd;
+    size_t extDictSize;
+    size_t prefixSize;
+} LZ4_streamDecode_t_internal;
+</b><p>  Never use the internal definitions below directly!
+  These definitions are not API/ABI safe, and may change in future versions.
+  If you need static allocation, declare or allocate an LZ4_streamDecode_t object.
 </p></pre><BR>
 
 <a name="Chapter10"></a><h2>Obsolete Functions</h2><pre></pre>
diff --git a/doc/lz4frame_manual.html b/doc/lz4frame_manual.html
index 2758306..cfb437e 100644
--- a/doc/lz4frame_manual.html
+++ b/doc/lz4frame_manual.html
@@ -1,10 +1,10 @@
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-<title>1.9.3 Manual</title>
+<title>1.9.4 Manual</title>
 </head>
 <body>
-<h1>1.9.3 Manual</h1>
+<h1>1.9.4 Manual</h1>
 <hr>
 <a name="Contents"></a><h2>Contents</h2>
 <ol>
@@ -22,9 +22,9 @@
 </ol>
 <hr>
 <a name="Chapter1"></a><h2>Introduction</h2><pre>
-  lz4frame.h implements LZ4 frame specification (doc/lz4_Frame_format.md).
-  lz4frame.h provides frame compression functions that take care
-  of encoding standard metadata alongside LZ4-compressed blocks.
+ lz4frame.h implements LZ4 frame specification: see doc/lz4_Frame_format.md .
+ LZ4 Frames are compatible with `lz4` CLI,
+ and designed to be interoperable with any system.
 <BR></pre>
 
 <a name="Chapter2"></a><h2>Compiler specifics</h2><pre></pre>
@@ -35,7 +35,8 @@
 </b></pre><BR>
 <pre><b>const char* LZ4F_getErrorName(LZ4F_errorCode_t code);   </b>/**< return error code string; for debugging */<b>
 </b></pre><BR>
-<a name="Chapter4"></a><h2>Frame compression types</h2><pre></pre>
+<a name="Chapter4"></a><h2>Frame compression types</h2><pre> 
+<BR></pre>
 
 <pre><b>typedef enum {
     LZ4F_default=0,
@@ -108,7 +109,7 @@
 </b><p>  Returns the maximum possible compressed size with LZ4F_compressFrame() given srcSize and preferences.
  `preferencesPtr` is optional. It can be replaced by NULL, in which case, the function will assume default preferences.
   Note : this result is only usable with LZ4F_compressFrame().
-         It may also be used with LZ4F_compressUpdate() _if no flush() operation_ is performed.
+         It may also be relevant to LZ4F_compressUpdate() _only if_ no flush() operation is ever performed.
  
 </p></pre><BR>
 
@@ -134,13 +135,19 @@
 
 <pre><b>LZ4F_errorCode_t LZ4F_createCompressionContext(LZ4F_cctx** cctxPtr, unsigned version);
 LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
-</b><p> The first thing to do is to create a compressionContext object, which will be used in all compression operations.
- This is achieved using LZ4F_createCompressionContext(), which takes as argument a version.
- The version provided MUST be LZ4F_VERSION. It is intended to track potential version mismatch, notably when using DLL.
- The function will provide a pointer to a fully allocated LZ4F_cctx object.
- If @return != zero, there was an error during context creation.
- Object can release its memory using LZ4F_freeCompressionContext();
- 
+</b><p>  The first thing to do is to create a compressionContext object,
+  which will keep track of operation state during streaming compression.
+  This is achieved using LZ4F_createCompressionContext(), which takes as argument a version,
+  and a pointer to LZ4F_cctx*, to write the resulting pointer into.
+  @version provided MUST be LZ4F_VERSION. It is intended to track potential version mismatch, notably when using DLL.
+  The function provides a pointer to a fully allocated LZ4F_cctx object.
+  @cctxPtr MUST be != NULL.
+  If @return != zero, context creation failed.
+  A created compression context can be employed multiple times for consecutive streaming operations.
+  Once all streaming compression jobs are completed,
+  the state object can be released using LZ4F_freeCompressionContext().
+  Note1 : LZ4F_freeCompressionContext() is always successful. Its return value can be ignored.
+  Note2 : LZ4F_freeCompressionContext() works fine with NULL input pointers (does nothing).
 </p></pre><BR>
 
 <a name="Chapter8"></a><h2>Compression</h2><pre></pre>
@@ -181,8 +188,9 @@ LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
   Important rule: dstCapacity MUST be large enough to ensure operation success even in worst case situations.
   This value is provided by LZ4F_compressBound().
   If this condition is not respected, LZ4F_compress() will fail (result is an errorCode).
-  LZ4F_compressUpdate() doesn't guarantee error recovery.
-  When an error occurs, compression context must be freed or resized.
+  After an error, the state is undefined, and must be re-initialized or freed.
+  If an uncompressed block was previously written, buffered data is flushed
+  before compressed data is appended again.
  `cOptPtr` is optional : NULL can be provided, in which case all options are set to default.
  @return : number of bytes written into `dstBuffer` (it can be zero, meaning input data was just buffered).
            or an error code if it fails (which can be tested using LZ4F_isError())
@@ -219,16 +227,21 @@ LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
 <a name="Chapter9"></a><h2>Decompression functions</h2><pre></pre>
 
 <pre><b>typedef struct {
-  unsigned stableDst;    </b>/* pledges that last 64KB decompressed data will remain available unmodified. This optimization skips storage operations in tmp buffers. */<b>
-  unsigned reserved[3];  </b>/* must be set to zero for forward compatibility */<b>
+  unsigned stableDst;     /* pledges that last 64KB decompressed data will remain available unmodified between invocations.
+                           * This optimization skips storage operations in tmp buffers. */
+  unsigned skipChecksums; /* disable checksum calculation and verification, even when one is present in frame, to save CPU time.
+                           * Setting this option to 1 once disables all checksums for the rest of the frame. */
+  unsigned reserved1;     </b>/* must be set to zero for forward compatibility */<b>
+  unsigned reserved0;     </b>/* idem */<b>
 } LZ4F_decompressOptions_t;
 </b></pre><BR>
 <pre><b>LZ4F_errorCode_t LZ4F_createDecompressionContext(LZ4F_dctx** dctxPtr, unsigned version);
 LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
 </b><p>  Create an LZ4F_dctx object, to track all decompression operations.
-  The version provided MUST be LZ4F_VERSION.
-  The function provides a pointer to an allocated and initialized LZ4F_dctx object.
-  The result is an errorCode, which can be tested using LZ4F_isError().
+  @version provided MUST be LZ4F_VERSION.
+  @dctxPtr MUST be valid.
+  The function fills @dctxPtr with the value of a pointer to an allocated and initialized LZ4F_dctx object.
+  The @return is an errorCode, which can be tested using LZ4F_isError().
   dctx memory can be released using LZ4F_freeDecompressionContext();
   Result of LZ4F_freeDecompressionContext() indicates current state of decompressionContext when being released.
   That is, it should be == 0 if decompression has been completed fully and correctly.
@@ -248,11 +261,12 @@ LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
  
 </p></pre><BR>
 
-<pre><b>size_t LZ4F_getFrameInfo(LZ4F_dctx* dctx,
-                                     LZ4F_frameInfo_t* frameInfoPtr,
-                                     const void* srcBuffer, size_t* srcSizePtr);
+<pre><b>size_t
+LZ4F_getFrameInfo(LZ4F_dctx* dctx,
+                  LZ4F_frameInfo_t* frameInfoPtr,
+            const void* srcBuffer, size_t* srcSizePtr);
 </b><p>  This function extracts frame parameters (max blockSize, dictID, etc.).
-  Its usage is optional: user can call LZ4F_decompress() directly.
+  Its usage is optional: user can also invoke LZ4F_decompress() directly.
 
   Extracted information will fill an existing LZ4F_frameInfo_t structure.
   This can be useful for allocation and dictionary identification purposes.
@@ -295,10 +309,11 @@ LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
  
 </p></pre><BR>
 
-<pre><b>size_t LZ4F_decompress(LZ4F_dctx* dctx,
-                                   void* dstBuffer, size_t* dstSizePtr,
-                                   const void* srcBuffer, size_t* srcSizePtr,
-                                   const LZ4F_decompressOptions_t* dOptPtr);
+<pre><b>size_t
+LZ4F_decompress(LZ4F_dctx* dctx,
+                void* dstBuffer, size_t* dstSizePtr,
+          const void* srcBuffer, size_t* srcSizePtr,
+          const LZ4F_decompressOptions_t* dOptPtr);
 </b><p>  Call this function repetitively to regenerate data compressed in `srcBuffer`.
 
   The function requires a valid dctx state.
@@ -341,6 +356,30 @@ LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
 <pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM)
               _LZ4F_dummy_error_enum_for_c89_never_used } LZ4F_errorCodes;
 </b></pre><BR>
+<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_getBlockSize(LZ4F_blockSizeID_t blockSizeID);
+</b><p>  Return, in scalar format (size_t),
+  the maximum block size associated with blockSizeID.
+</p></pre><BR>
+
+<pre><b>LZ4FLIB_STATIC_API size_t
+LZ4F_uncompressedUpdate(LZ4F_cctx* cctx,
+                        void* dstBuffer, size_t dstCapacity,
+                  const void* srcBuffer, size_t srcSize,
+                  const LZ4F_compressOptions_t* cOptPtr);
+</b><p>  LZ4F_uncompressedUpdate() can be called repetitively to add as much uncompressed data as necessary.
+  Important rule: dstCapacity MUST be large enough to store the entire source buffer,
+  since no compression is done for this operation.
+  If this condition is not respected, LZ4F_uncompressedUpdate() will fail (result is an errorCode).
+  After an error, the state is left undefined, and must be re-initialized or freed.
+  If a compressed block was written previously, buffered data is flushed
+  before uncompressed data is appended.
+  This is only supported when LZ4F_blockIndependent is used.
+ `cOptPtr` is optional : NULL can be provided, in which case all options are set to default.
+ @return : number of bytes written into `dstBuffer` (it can be zero, meaning input data was just buffered).
+           or an error code if it fails (which can be tested using LZ4F_isError())
+ 
+</p></pre><BR>
+
 <a name="Chapter11"></a><h2>Bulk processing dictionary API</h2><pre></pre>
 
 <pre><b>LZ4FLIB_STATIC_API LZ4F_CDict* LZ4F_createCDict(const void* dictBuffer, size_t dictSize);
@@ -351,12 +390,12 @@ LZ4FLIB_STATIC_API void        LZ4F_freeCDict(LZ4F_CDict* CDict);
  `dictBuffer` can be released after LZ4_CDict creation, since its content is copied within CDict 
 </p></pre><BR>
 
-<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressFrame_usingCDict(
-    LZ4F_cctx* cctx,
-    void* dst, size_t dstCapacity,
-    const void* src, size_t srcSize,
-    const LZ4F_CDict* cdict,
-    const LZ4F_preferences_t* preferencesPtr);
+<pre><b>LZ4FLIB_STATIC_API size_t
+LZ4F_compressFrame_usingCDict(LZ4F_cctx* cctx,
+                              void* dst, size_t dstCapacity,
+                        const void* src, size_t srcSize,
+                        const LZ4F_CDict* cdict,
+                        const LZ4F_preferences_t* preferencesPtr);
 </b><p>  Compress an entire srcBuffer into a valid LZ4 frame using a digested Dictionary.
   cctx must point to a context created by LZ4F_createCompressionContext().
   If cdict==NULL, compress without a dictionary.
@@ -368,11 +407,11 @@ LZ4FLIB_STATIC_API void        LZ4F_freeCDict(LZ4F_CDict* CDict);
            or an error code if it fails (can be tested using LZ4F_isError()) 
 </p></pre><BR>
 
-<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressBegin_usingCDict(
-    LZ4F_cctx* cctx,
-    void* dstBuffer, size_t dstCapacity,
-    const LZ4F_CDict* cdict,
-    const LZ4F_preferences_t* prefsPtr);
+<pre><b>LZ4FLIB_STATIC_API size_t
+LZ4F_compressBegin_usingCDict(LZ4F_cctx* cctx,
+                              void* dstBuffer, size_t dstCapacity,
+                        const LZ4F_CDict* cdict,
+                        const LZ4F_preferences_t* prefsPtr);
 </b><p>  Inits streaming dictionary compression, and writes the frame header into dstBuffer.
   dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
  `prefsPtr` is optional : you may provide NULL as argument,
@@ -381,16 +420,36 @@ LZ4FLIB_STATIC_API void        LZ4F_freeCDict(LZ4F_CDict* CDict);
            or an error code (which can be tested using LZ4F_isError()) 
 </p></pre><BR>
 
-<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_decompress_usingDict(
-    LZ4F_dctx* dctxPtr,
-    void* dstBuffer, size_t* dstSizePtr,
-    const void* srcBuffer, size_t* srcSizePtr,
-    const void* dict, size_t dictSize,
-    const LZ4F_decompressOptions_t* decompressOptionsPtr);
+<pre><b>LZ4FLIB_STATIC_API size_t
+LZ4F_decompress_usingDict(LZ4F_dctx* dctxPtr,
+                          void* dstBuffer, size_t* dstSizePtr,
+                    const void* srcBuffer, size_t* srcSizePtr,
+                    const void* dict, size_t dictSize,
+                    const LZ4F_decompressOptions_t* decompressOptionsPtr);
 </b><p>  Same as LZ4F_decompress(), using a predefined dictionary.
   Dictionary is used "in place", without any preprocessing.
   It must remain accessible throughout the entire frame decoding. 
 </p></pre><BR>
 
+<pre><b>typedef void* (*LZ4F_AllocFunction) (void* opaqueState, size_t size);
+typedef void* (*LZ4F_CallocFunction) (void* opaqueState, size_t size);
+typedef void  (*LZ4F_FreeFunction) (void* opaqueState, void* address);
+typedef struct {
+    LZ4F_AllocFunction customAlloc;
+    LZ4F_CallocFunction customCalloc; </b>/* optional; when not defined, uses customAlloc + memset */<b>
+    LZ4F_FreeFunction customFree;
+    void* opaqueState;
+} LZ4F_CustomMem;
+static
+#ifdef __GNUC__
+__attribute__((__unused__))
+#endif
+LZ4F_CustomMem const LZ4F_defaultCMem = { NULL, NULL, NULL, NULL };  </b>/**< this constant defers to stdlib's functions */<b>
+</b><p>  These prototypes make it possible to pass custom allocation/free functions.
+  LZ4F_CustomMem is provided at state creation time, using LZ4F_create*_advanced() listed below.
+  All allocation/free operations will be completed using these custom variants instead of regular <stdlib.h> ones.
+ 
+</p></pre><BR>
+
 </html>
 </body>