diff options
author | William Joye <wjoye@cfa.harvard.edu> | 2016-10-25 20:57:49 (GMT) |
---|---|---|
committer | William Joye <wjoye@cfa.harvard.edu> | 2016-10-25 20:57:49 (GMT) |
commit | d1c4bf158203c4e8ec29fdeb83fd311e36320885 (patch) | |
tree | 15874534e282f67505ce4af5ba805a1ff70ec43e /funtools/util/zlib-1.2.3/contrib/asm586/README.586 | |
parent | e19a18e035dc4d0e8e215f9b452bb9ef6f58b9d7 (diff) | |
parent | 339420dd5dd874c41f6bab5808291fb4036dd022 (diff) | |
download | blt-d1c4bf158203c4e8ec29fdeb83fd311e36320885.zip blt-d1c4bf158203c4e8ec29fdeb83fd311e36320885.tar.gz blt-d1c4bf158203c4e8ec29fdeb83fd311e36320885.tar.bz2 |
Merge commit '339420dd5dd874c41f6bab5808291fb4036dd022' as 'funtools'
Diffstat (limited to 'funtools/util/zlib-1.2.3/contrib/asm586/README.586')
-rw-r--r-- | funtools/util/zlib-1.2.3/contrib/asm586/README.586 | 43 |
1 files changed, 43 insertions, 0 deletions
diff --git a/funtools/util/zlib-1.2.3/contrib/asm586/README.586 b/funtools/util/zlib-1.2.3/contrib/asm586/README.586 new file mode 100644 index 0000000..6bb78f3 --- /dev/null +++ b/funtools/util/zlib-1.2.3/contrib/asm586/README.586 @@ -0,0 +1,43 @@ +This is a patched version of zlib modified to use +Pentium-optimized assembly code in the deflation algorithm. The files +changed/added by this patch are: + +README.586 +match.S + +The effectiveness of these modifications is a bit marginal, as the the +program's bottleneck seems to be mostly L1-cache contention, for which +there is no real way to work around without rewriting the basic +algorithm. The speedup on average is around 5-10% (which is generally +less than the amount of variance between subsequent executions). +However, when used at level 9 compression, the cache contention can +drop enough for the assembly version to achieve 10-20% speedup (and +sometimes more, depending on the amount of overall redundancy in the +files). Even here, though, cache contention can still be the limiting +factor, depending on the nature of the program using the zlib library. +This may also mean that better improvements will be seen on a Pentium +with MMX, which suffers much less from L1-cache contention, but I have +not yet verified this. + +Note that this code has been tailored for the Pentium in particular, +and will not perform well on the Pentium Pro (due to the use of a +partial register in the inner loop). + +If you are using an assembler other than GNU as, you will have to +translate match.S to use your assembler's syntax. (Have fun.) + +Brian Raiter +breadbox@muppetlabs.com +April, 1998 + + +Added for zlib 1.1.3: + +The patches come from +http://www.muppetlabs.com/~breadbox/software/assembly.html + +To compile zlib with this asm file, copy match.S to the zlib directory +then do: + +CFLAGS="-O3 -DASMV" ./configure +make OBJA=match.o |