path: root/funtools/util/zlib-1.2.3/contrib/asm586/README.586
author William Joye <wjoye@cfa.harvard.edu> 2016-10-25 20:57:49 (GMT)
committer William Joye <wjoye@cfa.harvard.edu> 2016-10-25 20:57:49 (GMT)
commit d1c4bf158203c4e8ec29fdeb83fd311e36320885
tree 15874534e282f67505ce4af5ba805a1ff70ec43e /funtools/util/zlib-1.2.3/contrib/asm586/README.586
parent e19a18e035dc4d0e8e215f9b452bb9ef6f58b9d7
parent 339420dd5dd874c41f6bab5808291fb4036dd022
Merge commit '339420dd5dd874c41f6bab5808291fb4036dd022' as 'funtools'
Diffstat (limited to 'funtools/util/zlib-1.2.3/contrib/asm586/README.586')
-rw-r--r-- funtools/util/zlib-1.2.3/contrib/asm586/README.586 43
1 files changed, 43 insertions, 0 deletions
diff --git a/funtools/util/zlib-1.2.3/contrib/asm586/README.586 b/funtools/util/zlib-1.2.3/contrib/asm586/README.586
new file mode 100644
index 0000000..6bb78f3
--- /dev/null
+++ b/funtools/util/zlib-1.2.3/contrib/asm586/README.586
@@ -0,0 +1,43 @@
+This is a patched version of zlib modified to use
+Pentium-optimized assembly code in the deflation algorithm. The files
+changed/added by this patch are:
+
+README.586
+match.S
+
+The effectiveness of these modifications is a bit marginal, as the
+program's bottleneck seems to be mostly L1-cache contention, which
+cannot really be worked around without rewriting the basic
+algorithm. The speedup on average is around 5-10% (which is generally
+less than the amount of variance between subsequent executions).
+However, when used at level 9 compression, the cache contention can
+drop enough for the assembly version to achieve 10-20% speedup (and
+sometimes more, depending on the amount of overall redundancy in the
+files). Even here, though, cache contention can still be the limiting
+factor, depending on the nature of the program using the zlib library.
+This may also mean that larger improvements will be seen on a Pentium
+with MMX, which suffers much less from L1-cache contention, but I have
+not yet verified this.
+
+Note that this code has been tailored for the Pentium in particular,
+and will not perform well on the Pentium Pro (due to the use of a
+partial register in the inner loop).
+
+If you are using an assembler other than GNU as, you will have to
+translate match.S to use your assembler's syntax. (Have fun.)
+
+Brian Raiter
+breadbox@muppetlabs.com
+April, 1998
+
+
+Added for zlib 1.1.3:
+
+The patches come from
+http://www.muppetlabs.com/~breadbox/software/assembly.html
+
+To compile zlib with this asm file, copy match.S to the zlib directory,
+then run:
+
+CFLAGS="-O3 -DASMV" ./configure
+make OBJA=match.o