path: root/lib/lz4opt.h
Commit message [Author, Age, Files, Lines]
* merge lz4opt.h into lz4hc.c [Yann Collet, 2018-02-25, 1 file, -356/+0]
|
| Having a dedicated file for the optimal parser made sense during its creation: it allowed Przemyslaw to work more freely on lz4opt, with less dependency on lz4hc. Moreover, the optimal parser was more complex, with its own search functions.
|
| Since the optimal parser was rewritten last year, it's now a lot lighter. It makes more sense now to integrate it directly inside lz4hc.c, making it easier to edit (editors are a bit "lost" inside a `*.h` dependent on its #include position).
|
| It also reduces the number of files in the project, which fits pretty well with lz4 objectives (adding lz4hc requires "just" lz4hc.h and lz4hc.c).
* edge case : compress up to end-mflimit (12 bytes) [Yann Collet, 2018-02-24, 1 file, -3/+3]
|
| The LZ4 block format specification states that the last match must start at a minimum distance of 12 bytes from the end of the block.
| However, out of an abundance of caution, the reference implementation would actually stop searching for matches at 13 bytes from the end of the block.
| This patch fixes this small detail. The new version is now able to properly compress a limit case such as `aaaaaaaabaaa\n`, as reported by Gao Xiang (@hsiangkao).
|
| Obviously, it doesn't change a lot of things. This is just one additional match candidate per block, with a maximum match length of 7 (since the last 5 bytes must remain literals).
|
| With default policy, blocks are 4 MB long, so it doesn't happen too often.
| Compressing silesia.tar at default level 1 saves 5 bytes (100930101 -> 100930096).
| At max level 12, it saves a grand 16 bytes (77389871 -> 77389855).
|
| The impact is a bit more visible when blocks are smaller, hence more numerous. For example, compressing silesia with blocks of 64 KB (using -12 -B4D) saves 543 bytes (77304583 -> 77304040).
| So the smaller the packet size, the more visible the impact. And it happens we have a ton of scenarios with little blocks using LZ4 compression ...
|
| And a useless "hooray" sidenote : the patch improves the LZ4 compression record of silesia (using -12 -B7D --no-frame-crc) by 16 bytes (77270672 -> 77270656) and the record on enwik9 by 44 bytes (371680396 -> 371680352) (previously claimed by [smallz4](http://create.stephan-brumme.com/smallz4/)).
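The two end-of-block limits quoted in this commit message (12 bytes for the last match start, 5 trailing literal bytes) can be turned into a small calculation. The following is only an illustrative sketch: the helper name `max_match_length_at` is hypothetical, and the constant names are chosen here to mirror the values in the message rather than copied from this file.

```c
#include <assert.h>
#include <stddef.h>

#define MFLIMIT      12  /* a match must start at least 12 bytes before block end */
#define LASTLITERALS  5  /* the last 5 bytes of a block must remain literals */

/* Hypothetical helper: longest match allowed to start at `pos` in a block
 * of `blockSize` bytes. Returns 0 when no match may start at that position. */
size_t max_match_length_at(size_t blockSize, size_t pos)
{
    assert(pos <= blockSize);
    if (blockSize < MFLIMIT) return 0;        /* block too small for any match */
    if (pos > blockSize - MFLIMIT) return 0;  /* too close to the end of the block */
    return blockSize - LASTLITERALS - pos;    /* match must stop before the last literals */
}
```

For the 13-byte input `aaaaaaaabaaa\n`, position 1 is the last allowed match start and the helper returns 7, which agrees with the "maximum match length of 7" stated above. A search that stops 13 bytes from the end would only consider position 0, where no earlier data exists to match against.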
* Merge pull request #434 from lz4/pattern [Yann Collet, 2018-01-06, 1 file, -1/+3]
|\
| | conditional pattern analysis
| * conditional pattern analysis [Yann Collet, 2017-12-22, 1 file, -1/+3]
| |
| | Pattern analysis (currently limited to long ranges of identical bytes) is actually detrimental to performance when `nbSearches` is low.
| |
| | The reason is that `nbSearches` already provides built-in protection for these cases.
| | The problem with patterns is that they dramatically increase the number of candidates to visit.
| | But with a low `nbSearches`, the match finder just aborts early.
| | In such cases, pattern analysis adds some complexity without reducing the total number of candidates.
| |
| | It actually increases compression ratio a little bit, by filtering only "good" candidates, but at a measurable speed cost, so it's not a good trade-off.
| |
| | This patch makes pattern analysis optional. It's enabled for levels 8+ only.
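As a rough illustration of the two ideas in this message, the sketch below measures a range of identical bytes and gates the analysis on the compression level. Both function names are hypothetical and this is not the code from the patch; only the "levels 8+" threshold is taken from the message.

```c
#include <stddef.h>

/* Hypothetical helper: number of consecutive bytes, starting at p,
 * that are equal to the first byte p[0]. Returns 0 for an empty range. */
size_t identical_run_length(const unsigned char* p, const unsigned char* end)
{
    const unsigned char* q = p;
    if (p >= end) return 0;
    while (q < end && *q == *p) q++;
    return (size_t)(q - p);
}

/* Gate described in the commit message: pattern analysis is only
 * worth its cost at levels 8 and above. */
int pattern_analysis_enabled(int compressionLevel)
{
    return compressionLevel >= 8;
}
```

A match finder could consult such a gate once per call and skip the run measurement entirely at lower levels, where the search budget already bounds the number of candidates visited.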
* | lz4opt supports _destSize [Yann Collet, 2017-12-22, 1 file, -18/+43]
|/
| no longer limited to level 9
* added code comments [Yann Collet, 2017-11-09, 1 file, -1/+6]
|
* added constant TRAILING_LITERALS [Yann Collet, 2017-11-09, 1 file, -5/+6]
|
| which is more explicit than its value `3`.
| reported by @terrelln
* lz4opt: simplified match finder invocation to LZ4HC_FindLongerMatch() [Yann Collet, 2017-11-09, 1 file, -20/+11]
|
* removed the ip++ at the beginning of block [Yann Collet, 2017-11-08, 1 file, -1/+0]
|
| The first byte used to be skipped to avoid an infinite self-comparison.
| This is no longer necessary, since init() ensures that index starts at 64K.
|
| The first byte is also useless to search when each block is independent, but it's no longer the case when blocks are linked.
|
| Removing the first-byte-skip saves about 10 bytes / MB on files compressed with -BD4 (linked blocks of 64 KB), which feels correct as each MB has 16 blocks of 64 KB.
* minor comment edit [Yann Collet, 2017-11-03, 1 file, -7/+6]
|
* moved ctx->end handling from parsers [Yann Collet, 2017-11-03, 1 file, -1/+0]
|
| responsibility better handled one layer above (LZ4HC_compress_generic())
* removed ctx->searchNum [Yann Collet, 2017-11-03, 1 file, -6/+8]
|
| nbSearches now transmitted directly as function parameter
| easier to track and debug
* LZ4_compress_HC_continue_destSize() now compatible with optimal parser [Yann Collet, 2017-11-03, 1 file, -5/+5]
|
| levels 11+
* removes matches[] table [Yann Collet, 2017-11-03, 1 file, -73/+67]
|
| saves stack space
| clearer match finder interface (no more table to fill)
* removed useless parameter from hash chain matchfinder [Yann Collet, 2017-11-03, 1 file, -5/+4]
|
| used to be present for compatibility with binary tree matchfinder
* removed code and reference to binary tree match finder [Yann Collet, 2017-11-03, 1 file, -122/+2]
|
| reduced size of LZ4HC state
* improved level 11 speed [Yann Collet, 2017-11-03, 1 file, -2/+4]
|
* optimized skip strategy for level 12 [Yann Collet, 2017-11-03, 1 file, -3/+6]
|
* more generic skip formula [Yann Collet, 2017-11-03, 1 file, -13/+4]
|
| improving speed
* small adaptations for intermediate level 11 [Yann Collet, 2017-11-02, 1 file, -6/+5]
|
* partial search, while preserving compression ratio [Yann Collet, 2017-11-02, 1 file, -0/+14]
|
| tag interesting places
* searching match leading strictly farther does not work [Yann Collet, 2017-11-02, 1 file, -1/+1]
|
| sometimes, it's better to re-use the same match but start it later,
| in order to get a shorter matchlength code
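The remark about a "shorter matchlength code" follows from how the LZ4 block format stores match lengths: the token holds `matchLength - 4` in a 4-bit field, and any value of 15 or more spills into additional bytes of up to 255 each. The sketch below counts those additional bytes; the helper name `match_length_extra_bytes` is hypothetical and this is not the price function from this file.

```c
#include <stddef.h>

#define MINMATCH 4   /* shortest match the LZ4 block format can encode */
#define ML_MASK  15  /* largest value of the 4-bit match-length field */

/* Hypothetical helper: number of bytes needed after the token to encode
 * a match of `matchLength` bytes. Lengths below MINMATCH cannot be
 * encoded as a match; this sketch simply returns 0 for them. */
size_t match_length_extra_bytes(size_t matchLength)
{
    size_t code;
    if (matchLength < MINMATCH) return 0;
    code = matchLength - MINMATCH;
    if (code < ML_MASK) return 0;        /* fits entirely in the token */
    return 1 + (code - ML_MASK) / 255;   /* one byte per 255, plus the final byte */
}
```

Under this encoding an 18-byte match needs no extra byte while a 19-byte match needs one, so starting the same match one position later can leave the total cost unchanged or lower.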
* fixed last lost bytes in maximal mode [Yann Collet, 2017-11-02, 1 file, -3/+4]
|
| even gained 2 bytes on calgary.tar...
| added conditional traces `g_debuglog_enable`
* changed strategy : opt[] path is complete after each match [Yann Collet, 2017-11-02, 1 file, -33/+57]
|
| previous strategy would leave a few "bad choices" on the ground
| they would be fixed later, but that requires passing through each position to make the fix
| and cannot give the end position of the last useful match.
* fixed minor overflow mistake in optimal parser [Yann Collet, 2017-10-31, 1 file, -1/+5]
|
| saving 20 bytes on calgary.tar
* fixed minor initialization warning [Yann Collet, 2017-10-30, 1 file, -1/+1]
|
* added hash chain with conditional length [Yann Collet, 2017-10-25, 1 file, -1/+2]
|
| not a success yet
* lz4opt: added hash chain search [Yann Collet, 2017-10-21, 1 file, -14/+44]
|
* switched many types to int [Yann Collet, 2017-10-20, 1 file, -38/+37]
|
* removed SET_PRICE macro [Yann Collet, 2017-10-20, 1 file, -17/+14]
|
* removed one macro usage [Yann Collet, 2017-10-20, 1 file, -4/+11]
|
* minor refactor [Yann Collet, 2017-10-20, 1 file, -28/+35]
|
| reduce variable scope
| remove one macro usage
* lz4opt: refactor sequence reverse traversal [Yann Collet, 2017-10-20, 1 file, -10/+20]
|
* refactor variable matchnum [Yann Collet, 2017-10-20, 1 file, -14/+14]
|
| separate initial and iterative search
| renamed nb_matches
* simplified initial cost conditions [Yann Collet, 2017-10-20, 1 file, -10/+15]
|
| llen integrated in opt[]
* added assert [Yann Collet, 2017-10-19, 1 file, -1/+1]
|
* renamed last_pos into last_match_pos [Yann Collet, 2017-10-19, 1 file, -15/+15]
|
* simplified early exit when single solution [Yann Collet, 2017-10-19, 1 file, -5/+5]
|
* FIX: added prefix to FORCE_INLINE to prevent redefinition error during compilation when used with other libraries that define FORCE_INLINE [tcpan, 2017-08-24, 1 file, -5/+5]
|
* fix #369 [Yann Collet, 2017-06-26, 1 file, -0/+5]
|
| The bug would make the bt search read one byte in an invalid memory region, and make a branch decision based on its value.
| Impact was small (missed compression opportunity).
| It only happens in -BD mode, with extDict-prefix overlapping matches.
|
| The bt match search is supposed to work also in extDict mode, in which case the match ptr can point into Dict.
| When the match was overlapping Dict<->Prefix, match[matchLength] would end up outside of Dict, in an invalid memory area.
| The correction ensures that in such a case, match[matchLength] ends up at the intended location, inside prefix.
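The address problem described here can be pictured with a candidate that begins in the dictionary segment and logically continues at the start of the prefix. The sketch below is an assumption-laden illustration, not the actual correction: `candidate_byte` and its parameter names are hypothetical, and it only shows how a logical index past the end of the dictionary has to be redirected into the prefix instead of being read at `match[i]`.

```c
#include <stddef.h>

/* Hypothetical helper: byte at logical index `i` of a candidate match that
 * starts at `match` inside the dictionary (which ends at `dictEnd`) and
 * continues at `prefixStart` once the dictionary is exhausted. */
unsigned char candidate_byte(const unsigned char* match,
                             const unsigned char* dictEnd,
                             const unsigned char* prefixStart,
                             size_t i)
{
    size_t const bytesInDict = (size_t)(dictEnd - match);
    /* Reading match[i] with i >= bytesInDict would land outside the dictionary. */
    return (i < bytesInDict) ? match[i] : prefixStart[i - bytesInDict];
}
```

With a 4-byte dictionary `abcd`, a prefix `efgh`, and a candidate starting at `c`, logical indices 0 and 1 read from the dictionary while index 2 reads `e` from the prefix.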
* changed macro HEAPMODE into LZ4_HEAPMODE [Yann Collet, 2017-05-02, 1 file, -6/+7]
|
| This macro is likely to be set from the user side, typically through a compiler flag (-DLZ4_HEAPMODE=1).
| In that case, it makes sense to prefix the macro, since we want to reduce potential side effects on the namespace.
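A build-time switch of this kind is usually given a default and then tested with the preprocessor. The sketch below shows that pattern under stated assumptions: only the macro name `LZ4_HEAPMODE` and the `-DLZ4_HEAPMODE=1` flag come from the message, while `demo_state_t` and `demo_use_state` are hypothetical stand-ins for a large working state and the function that needs it.

```c
#include <stdlib.h>
#include <string.h>

#ifndef LZ4_HEAPMODE
#  define LZ4_HEAPMODE 0   /* default: state on the stack; override with -DLZ4_HEAPMODE=1 */
#endif

/* Hypothetical stand-in for a large compression state. */
typedef struct { unsigned table[1024]; } demo_state_t;

/* Returns 0 on success, -1 if the heap allocation fails. */
int demo_use_state(void)
{
#if LZ4_HEAPMODE
    demo_state_t* const state = (demo_state_t*)malloc(sizeof(*state));
    if (state == NULL) return -1;
#else
    demo_state_t stackState;
    demo_state_t* const state = &stackState;
#endif
    memset(state, 0, sizeof(*state));
    /* ... real work on *state would happen here ... */
    {   int const result = (state->table[0] == 0) ? 0 : -1;
#if LZ4_HEAPMODE
        free(state);   /* only the heap variant owns memory to release */
#endif
        return result;
    }
}
```

Because the name carries the `LZ4_` prefix, a project that defines its own unrelated `HEAPMODE` macro does not silently change where this state is allocated.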
* Merge branch 'optlz4opt' of github.com:Cyan4973/lz4 into optlz4opt [Yann Collet, 2017-03-20, 1 file, -1/+0]
|\
| * slight btopt speed improvement [Yann Collet, 2017-03-18, 1 file, -2/+2]
| |
| | removing a useless test
* | minor refactor [Yann Collet, 2017-03-20, 1 file, -72/+71]
| |
* | slight btopt speed improvement [Yann Collet, 2017-03-20, 1 file, -3/+4]
|/
| removing a useless test
* made SET_PRICE macro more usable [Yann Collet, 2017-03-18, 1 file, -4/+4]
|
| previous version would use argument to also change target member.
| Now, only values are transferred
* improved lz4opt speed (~4%) [Yann Collet, 2017-03-17, 1 file, -12/+12]
|
* minor price function optimization [Yann Collet, 2017-03-17, 1 file, -8/+6]
|
* LZ4_compress_HC_destSize() uses LZ4HC_compress_generic() code path [Yann Collet, 2017-03-16, 1 file, -1/+1]
|
| Limits compression level to 10, to remain compatible with Hash Chain.
* removed nextToUpdateBT [Przemyslaw Skibinski, 2016-12-28, 1 file, -3/+3]
|