| Commit message (Collapse) | Author | Age | Files | Lines |
|
Aligned load() and store() have been shown to be faster for the
composition functions. This patch applies this to the solid
SourceOver function.
Reviewed-by: Samuel Rødal
|
Writing the data of memfill() to a cache line is unnecessary because the
data is not reused directly. We can use streaming stores to bypass
the cache completely.
When testing memfill32 separately, the function is twice as fast
on Core2 and Atom.
Reviewed-by: Samuel Rødal
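The streaming-store idea described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the name memfill32Stream is hypothetical, the destination is assumed to be at least 4-byte aligned, and the target is assumed to be x86 with SSE2 available.

```cpp
#include <cstddef>
#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics (assumes an x86 target)

// Hypothetical sketch: fill `count` 32-bit values with non-temporal stores.
void memfill32Stream(uint32_t *dest, uint32_t value, size_t count)
{
    // Scalar prologue: advance until dest sits on a 16-byte boundary,
    // because _mm_stream_si128 requires an aligned address.
    while (count > 0 && (reinterpret_cast<uintptr_t>(dest) & 0xf) != 0) {
        *dest++ = value;
        --count;
    }

    // Main loop: write 4 pixels per iteration, bypassing the cache.
    const __m128i value128 = _mm_set1_epi32(static_cast<int>(value));
    for (; count >= 4; count -= 4, dest += 4)
        _mm_stream_si128(reinterpret_cast<__m128i *>(dest), value128);

    // Make the non-temporal stores globally visible before returning.
    _mm_sfence();

    // Scalar epilogue for the remaining 0 to 3 values.
    while (count > 0) {
        *dest++ = value;
        --count;
    }
}
```

Non-temporal stores only pay off when the filled data is not read again soon; for buffers that are reused immediately, regular stores may be the better choice.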
|
Windows 64 does not support MMX with MSVC. This is a problem
with the way SSE is currently used because it relies on previous
vector instructions being available.
This patch fixes that by using the intended functions for
SSE2 on Windows.
Merge-request: 792
Reviewed-by: Benjamin Poulain <benjamin.poulain@nokia.com>
|
Qt does not build on PowerPC when compiling for both x86 and PPC on Mac.
The compiler is invoked only once for both architectures, so the defines
are there in order to get the optimized path for x86. Those defines
need to be removed from the compilation environment when the target
is set to PPC by GCC.
Reviewed-by: Kent Hansen
|
On the benchmark tst_QPainter::compositionModes(), this patch gives
the following improvements:
-300x300:opaque: 390%
-300x300:!opaque: 1085%
Reviewed-by: Samuel Rødal
|
Replace the code of the SSE prologue by a macro to avoid copying the
prologue everywhere.
Reviewed-by: Andreas Kling
|
On Atom, comp_Source is 280% faster with the SSE2 implementation.
Reviewed-by: Andreas Kling
|
The assert can never be true since const_alpha is unsigned. That
line was triggering a warning on GCC.
Reviewed-by: Olivier Goffart
|
Implement the composition function for CompositionMode_Plus with SSE2.
The macro MIX() can be replaced by a single add-saturate instruction,
which increases the speed a lot (13 times faster on the blend
benchmark).
Reviewed-by: Olivier Goffart
Reviewed-by: Andreas Kling
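To illustrate the add-saturate approach mentioned above, here is a minimal sketch of a Plus composition over ARGB32 pixels. It is not the patch's code: the function name compPlusSketch is hypothetical, unaligned loads are used for simplicity, and an x86 target with SSE2 is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics (assumes an x86 target)

// Hypothetical sketch: dst = saturate(dst + src), channel by channel.
void compPlusSketch(uint32_t *dst, const uint32_t *src, size_t length)
{
    size_t i = 0;

    // Vector loop: _mm_adds_epu8 adds 16 unsigned bytes with saturation,
    // which covers all channels of 4 pixels in one instruction.
    for (; i + 4 <= length; i += 4) {
        const __m128i s = _mm_loadu_si128(reinterpret_cast<const __m128i *>(src + i));
        const __m128i d = _mm_loadu_si128(reinterpret_cast<const __m128i *>(dst + i));
        _mm_storeu_si128(reinterpret_cast<__m128i *>(dst + i), _mm_adds_epu8(s, d));
    }

    // Scalar tail: the same saturating add, one channel at a time.
    for (; i < length; ++i) {
        uint32_t result = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            uint32_t sum = ((dst[i] >> shift) & 0xff) + ((src[i] >> shift) & 0xff);
            if (sum > 0xff)
                sum = 0xff;
            result |= sum << shift;
        }
        dst[i] = result;
    }
}
```

The scalar tail shows what the single vector instruction replaces: four separate add, compare, and clamp steps per pixel.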
|
Aligned loads are faster than unaligned loads. This patch adds a prologue
to the blending function in order to align the destination on 16 bytes
before using SSE2.
Reviewed-by: Kent Hansen
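The length of such a prologue can be computed from the destination address. The helper below is a hypothetical sketch (alignmentPrologueLength is not a name from the patch) and assumes 32-bit pixels stored at 4-byte-aligned addresses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of pixels to process with scalar code
// before dst reaches a 16-byte boundary, capped at the span length.
inline size_t alignmentPrologueLength(const uint32_t *dst, size_t length)
{
    const uintptr_t address = reinterpret_cast<uintptr_t>(dst);
    // Offset of dst inside its 16-byte block, counted in pixels (0 to 3).
    const size_t pixelOffset = (address >> 2) & 0x3;
    // Pixels remaining until the next boundary; 0 if already aligned.
    const size_t pixelsToBoundary = (4 - pixelOffset) & 0x3;
    return std::min(pixelsToBoundary, length);
}
```

A blending function would process that many pixels one at a time, then switch to aligned 128-bit loads and stores on the destination for the rest of the span.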
|
Merge-request: 725
Reviewed-by: Benjamin Poulain <benjamin.poulain@nokia.com>
|
Merge-request: 725
Reviewed-by: Benjamin Poulain <benjamin.poulain@nokia.com>
|
Const_alpha == 0 is a corner case that can happen if the painter draws
with zero opacity or if the multiplication of alphas is below 1.
The assertion was failing for one of the tests of QPainter.
Reviewed-by: Samuel Rødal
|
Swapping the length and color arguments to qt_memfill is a bad idea.
Reviewed-by: Benjamin Poulain
|
I introduced an erroneous cast by mistake; the code should ensure the
pointers are to 32-bit integers.
Reviewed-by: Andreas Kling
Reviewed-by: Samuel Rødal
|
This function is used quite a lot by WebKit animations; the SSE2
implementation is twice as fast in those use cases.
Reviewed-by: Andreas Kling
Reviewed-by: Samuel Rødal
|
Implement a version of comp_func_SourceOver() with SSE2. This gives a
performance boost of 11% on some WebKit animations.
Two new macros were added to simplify the implementation of the
different blending primitives:
BLEND_SOURCE_OVER_ARGB32_SSE2() and
BLEND_SOURCE_OVER_ARGB32_WITH_CONST_ALPHA_SSE2()
Reviewed-by: Samuel Rødal
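For reference, the per-pixel operation that a SourceOver composition performs on premultiplied ARGB32 can be sketched in scalar code. This is an illustration only: byteMul and sourceOverScalar are hypothetical names, and the rounding trick shown is a commonly used approximation of division by 255, not necessarily what the SSE2 macros do internally.

```cpp
#include <cstddef>
#include <cstdint>

// Multiply each 8-bit channel of x by a/255 (a in 0..255), with rounding.
// Two channels are processed at once in each 32-bit intermediate.
static inline uint32_t byteMul(uint32_t x, uint32_t a)
{
    uint32_t t = (x & 0xff00ff) * a;
    t = (t + ((t >> 8) & 0xff00ff) + 0x800080) >> 8;
    t &= 0xff00ff;

    x = ((x >> 8) & 0xff00ff) * a;
    x = x + ((x >> 8) & 0xff00ff) + 0x800080;
    x &= 0xff00ff00;
    return x | t;
}

// Scalar SourceOver on premultiplied pixels: dst = src + dst * (1 - srcAlpha).
void sourceOverScalar(uint32_t *dst, const uint32_t *src, size_t length)
{
    for (size_t i = 0; i < length; ++i) {
        const uint32_t s = src[i];
        const uint32_t alpha = s >> 24;
        if (alpha == 0xff)
            dst[i] = s;                                  // opaque: plain copy
        else if (s != 0)
            dst[i] = s + byteMul(dst[i], 255 - alpha);   // blend
        // fully transparent source: destination is left untouched
    }
}
```

An SSE2 version applies the same arithmetic to 4 pixels per iteration; the early-outs for fully opaque and fully transparent sources are typically tested on a whole group of pixels at once.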
|
Some compilers do not inline the functions, which is a problem
because the number of arguments exceeds the limit for SSE,
and because it is a lot slower for those low-level functions.
|
When available, use SSE2 to blend images. Computation is
done on 4 pixels at a time instead of 1 with MMX.
Performance improvements:
- Blending ARGB32 on RGB32/ARGB32, mostly opaque: 188%
- Blending ARGB32 on RGB32/ARGB32, no opaque pixels: 180%
- Blending ARGB32 on RGB32/ARGB32, with 0.5 opacity: 187%
- Blending RGB32 on RGB32/ARGB32, with 0.5 opacity: 206%
Reviewed-by: Samuel Rødal
|
Reviewed-by: Trust Me
|
Reviewed-by: Trust Me
|
Reviewed-by: Trust Me
|
Reviewed-by: Trust Me
|
Reviewed-by: Trust Me