| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Scaling down, no rotation, with SSE2.
Specialize for ARGB32 not tiled, so the pixelbound can be simplified
Reviewed-by: Samuel
|
|
|
|
|
|
|
|
| |
when scaling down, with no rotation
Process the pixels four at a time and do the interpolation using SSE2
Reviewed-by: Samuel
|
|
|
|
|
|
|
|
| |
Another way to compute the interpolation that does fewer multiplications.
Small impact on benchmark
Made-with: Samuel
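
The commit does not say which formulation it switched to. As a rough illustration only, one common way to save multiplications is to interpolate on the difference between two samples instead of weighting each sample separately. The helper names below are hypothetical, not Qt code, and the sketch works on a single 8-bit channel with a weight t in [0, 256].

```cpp
#include <cstdint>

// Weighted form: two multiplications per channel.
inline uint32_t lerpWeighted(uint32_t a, uint32_t b, uint32_t t)
{
    return (a * (256 - t) + b * t) >> 8;
}

// Difference form: one multiplication per channel.
// Unsigned wrap-around keeps (b - a) * t correct modulo 2^32 even when
// b < a, and the final sum is always in [0, 255 * 256].
inline uint32_t lerpDifference(uint32_t a, uint32_t b, uint32_t t)
{
    return ((a << 8) + (b - a) * t) >> 8;
}

// Bilinear interpolation of a 2x2 neighbourhood built from three lerps:
// three multiplications instead of four per-corner weights.
inline uint32_t bilinear(uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                         uint32_t dx, uint32_t dy)
{
    const uint32_t top = lerpDifference(tl, tr, dx);
    const uint32_t bottom = lerpDifference(bl, br, dx);
    return lerpDifference(top, bottom, dy);
}
```

Both lerp forms return the same value for 8-bit inputs, so the difference form can replace the weighted one without changing results.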
|
|
|
|
|
|
|
|
|
|
|
| |
Windows 64 does not support MMX with MSVC. This is a problem
with the way SSE is currently used because it relies on previous
vector instructions being available.
This patch fixes that by using the intended functions for
SSE2 on Windows.
Merge-request: 792
Reviewed-by: Benjamin Poulain <benjamin.poulain@nokia.com>
|
|
|
|
|
|
| |
move the -1 out of the loop
Reviewed-by: Benjamin Poulain
|
|
|
|
|
|
|
| |
With the recent optimisation in fetchTransformedBilinear, the generic
path is faster than the 'optimized' path
Reviewed-by: Samuel
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This gives a nice speedup for blitting of small and medium sized
images by using preloading and avoiding function call overhead to
memcpy for each scanline. For larger image widths memcpy becomes more
efficient.
Speedups of up to 40 % for 64 pixel wide images were measured.
For image widths between 2 and 16 the speedup ranges between 12 %
and 28 %.
Task-number: QT-3401
Reviewed-by: Benjamin Poulain <benjamin.poulain@nokia.com>
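
A minimal sketch of the idea, assuming 32-bit pixels and strides expressed in pixels: narrow images are copied with an inline loop, wide ones fall back to memcpy per scanline. The function name and the threshold are hypothetical (the commit does not state where memcpy starts to win), and the preloading mentioned above is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical cut-over width, chosen only for illustration.
constexpr int kInlineCopyMaxWidth = 64;

// Copies a w x h block of 32-bit pixels from src to dst.
void blitRect32(uint32_t *dst, int dstStride,
                const uint32_t *src, int srcStride, int w, int h)
{
    if (w <= kInlineCopyMaxWidth) {
        // Narrow image: avoid one memcpy call per scanline.
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x)
                dst[x] = src[x];
            dst += dstStride;
            src += srcStride;
        }
    } else {
        // Wide image: memcpy is more efficient per scanline.
        for (int y = 0; y < h; ++y) {
            std::memcpy(dst, src, std::size_t(w) * sizeof(uint32_t));
            dst += dstStride;
            src += srcStride;
        }
    }
}
```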
|
|
|
|
|
|
|
|
|
|
| |
Qt does not build on PowerPC when compiling for both x86 and PPC on Mac.
The compiler is invoked only once for both architectures, so the defines
are there in order to get the optimized path for x86. Those defines
need to be removed from the compilation environment when the target
is set to PPC by GCC.
Reviewed-by: Kent Hansen
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
master to 4.7
This backports the following commits:
e55b6a3 qdrawhelper: remove code duplication
0d7e683 qdrawhelper: optimize fetchTransformedBilinear
29ef46e Fix compilation with RVCT
6601458 qdrawhelper: Use SSE2 in fetchTransformedBilinear (when scalling up)
398ef0ca Fix nasty copy-paste bug in fetchTransformedBilinear()
d585ece qdrawhelper: fix assert in fetchTransformedBilinear
Reviewed-by: Benjamin Poulain
|
|
|
|
|
|
|
|
|
| |
This reverts commit d2089600ea247b5d6354e8eee4becf40802c4693.
Reverting in order to avoid conflict in the master branch.
We are going to consider backporting the changes to Qt 4.7.1
Reviewed-by: Benjamin Poulain
|
|
|
|
|
|
|
|
|
|
|
| |
The function blend_transformed_bilinear_argb() was checking the blend type
at runtime for each pixel in order to clamp the coordinates. This
code was duplicated in both branches of the function.
This patch factorizes the code by doing the clamping in a template
function.
Reviewed-by: Samuel Rødal
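
The refactoring could look roughly like the sketch below, where the wrap mode is a template parameter so the branch is resolved at compile time rather than per pixel. The enum and function names are hypothetical and the coordinate range is simplified to [0, size).

```cpp
#include <algorithm>

// Hypothetical stand-in for the blend type checked by the original code.
enum class WrapMode { Clamp, Tile };

// Computes the two sample coordinates (v1, v2) used by bilinear filtering
// for an input coordinate v. The mode is known at compile time, so each
// instantiation contains only one of the two branches.
template <WrapMode mode>
inline void samplePair(int size, int v, int &v1, int &v2)
{
    if (mode == WrapMode::Tile) {
        v1 = v % size;
        if (v1 < 0)
            v1 += size;
        v2 = (v1 + 1 == size) ? 0 : v1 + 1;
    } else {
        v1 = std::max(0, std::min(v, size - 1));
        v2 = std::min(v1 + 1, size - 1);
    }
}
```

A caller would pick the instantiation once per span, for example `samplePair<WrapMode::Tile>(width, x, x1, x2)`, instead of testing the blend type for every pixel.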
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces an implementation of qt_memfill32 with the Neon
instruction set from ARMv7.
The loop is unrolled once to get better performance.
This implementation of memfill is 330% faster on the N900.
Reviewed-by: Samuel Rødal
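
The Neon code itself is not shown here. The sketch below is a scalar stand-in that only illustrates the loop shape: a main loop that stores two groups of four pixels per iteration (mirroring a loop over 128-bit vectors unrolled once) followed by a tail loop. The function name is hypothetical.

```cpp
#include <cstdint>

// Scalar illustration of a fill loop unrolled once over 4-pixel groups.
void memfill32Unrolled(uint32_t *dest, uint32_t value, int count)
{
    int i = 0;
    // Main loop: 2 x 4 pixels per iteration.
    for (; i + 8 <= count; i += 8) {
        dest[i + 0] = value; dest[i + 1] = value;
        dest[i + 2] = value; dest[i + 3] = value;
        dest[i + 4] = value; dest[i + 5] = value;
        dest[i + 6] = value; dest[i + 7] = value;
    }
    // Tail: remaining 0..7 pixels.
    for (; i < count; ++i)
        dest[i] = value;
}
```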
|
|
|
|
|
|
|
|
|
| |
On the benchmark tst_QPainter::compositionModes(), this patch gives
the following improvements:
-300x300:opaque: 390%
-300x300:!opaque: 1085%
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SSSE3 provides two tools to improve the blending speed over SSE2:
-palignr
-byte permutation
The alignment is enforced on src and dst with palignr so that memory
accesses are always aligned.
The extraction of the alpha mask is done with a byte permutation in
order to save two instructions per cycle.
On Atom, this patch gives between 0% (aligned src) and 10%
improvement (unaligned by 4 and 12 bytes).
On Core 2, this patch gives a consistent 8% to 10% improvement
for every misalignment.
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
| |
On Atom, comp_Source is 280% faster with the SSE2 implementation.
Reviewed-by: Andreas Kling
|
|
|
|
|
|
|
|
|
|
|
| |
Implement the composition function for CompositionMode_Plus with SSE2.
The macro MIX() can be replaced by a single instruction add-saturate,
which increases the speed a lot (13 times faster on the blend
benchmark).
Reviewed-by: Olivier Goffart
Reviewed-by: Andreas Kling
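
For reference, the per-channel operation behind CompositionMode_Plus is a saturating add. The scalar sketch below (hypothetical function name) shows what has to happen for one pixel; with SSE2 the same result for four pixels can presumably be obtained with one unsigned saturating byte add such as `_mm_adds_epu8`, which is an assumption about the patch rather than a quote from it.

```cpp
#include <algorithm>
#include <cstdint>

// Adds two ARGB32 pixels channel by channel, clamping each channel at 255.
inline uint32_t compPlusPixel(uint32_t dst, uint32_t src)
{
    uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t sum = ((dst >> shift) & 0xff) + ((src >> shift) & 0xff);
        result |= std::min<uint32_t>(sum, 255) << shift;
    }
    return result;
}
```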
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some instruction sets were defining partial tables of composition
functions. To work around that, qInitDrawhelperAsm() was resetting
the composition function of QPainter::CompositionMode_Destination and
anything above QPainter::CompositionMode_Xor.
This was a problem because it made it impossible to implement fast
paths for those composition modes.
This patch exports prototypes for the generic functions of each
composition mode. The specialized implementations now define a complete
table.
Reviewed-by: Andreas Kling
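
A reduced sketch of the resulting layout, with hypothetical names and only three modes: the generic functions are visible to every backend, and a specialized table lists an entry for every mode, reusing the generic function wherever no fast path exists.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, reduced mode list; the real one is QPainter::CompositionMode.
enum Mode { ModeSource, ModeDestination, ModePlus, ModeCount };

typedef void (*CompositionFunction)(uint32_t *dest, const uint32_t *src, int length);

// Generic implementations, exported so that any backend can reference them.
void genericSource(uint32_t *dest, const uint32_t *src, int length)
{
    for (int i = 0; i < length; ++i)
        dest[i] = src[i];
}

void genericDestination(uint32_t *, const uint32_t *, int)
{
    // Destination mode leaves the destination untouched.
}

void genericPlus(uint32_t *dest, const uint32_t *src, int length)
{
    for (int i = 0; i < length; ++i) {
        uint32_t out = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            const uint32_t sum = ((dest[i] >> shift) & 0xff) + ((src[i] >> shift) & 0xff);
            out |= (sum > 255 ? 255u : sum) << shift;
        }
        dest[i] = out;
    }
}

// Stand-in for a SIMD fast path of one mode.
void fastSource(uint32_t *dest, const uint32_t *src, int length)
{
    std::memcpy(dest, src, std::size_t(length) * sizeof(uint32_t));
}

// Complete table: every mode has an entry, so nothing needs to be reset
// at initialisation time.
const CompositionFunction simdTable[ModeCount] = {
    fastSource,          // ModeSource: optimized
    genericDestination,  // ModeDestination: generic
    genericPlus          // ModePlus: generic until a fast path is written
};
```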
|
|
|
|
|
|
|
| |
qt_memconvert's Duff's device implementation assumes that count is > 0;
if count is 0 it will still blit eight pixels.
Reviewed-by: Trond
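
The failure mode can be reproduced with any Duff's device loop. In the sketch below (hypothetical function, converting 16-bit values to 32-bit ones), removing the early return makes a call with count == 0 enter at `case 0` and run all eight copies once.

```cpp
#include <cstdint>

// Widening copy written as a Duff's device.
void convertDuff(uint32_t *dest, const uint16_t *src, int count)
{
    if (count <= 0)   // guard: without it, count == 0 copies eight items
        return;

    int n = (count + 7) / 8;
    switch (count & 7) {
    case 0: do { *dest++ = *src++;
    case 7:      *dest++ = *src++;
    case 6:      *dest++ = *src++;
    case 5:      *dest++ = *src++;
    case 4:      *dest++ = *src++;
    case 3:      *dest++ = *src++;
    case 2:      *dest++ = *src++;
    case 1:      *dest++ = *src++;
            } while (--n > 0);
    }
}
```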
|
|
|
|
|
|
|
|
|
|
|
| |
Make the memrotate functions a function pointer table so that we can
replace it with optimized versions, and implement an optimized NEON
version for the 90 and 270 rotations.
Measured performance improvement for a 400x400 16-bit pixmap was
17 % for 270 degree rotation and 11 % for 90 degree rotation.
Reviewed-by: Trond
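
A sketch of the dispatch structure, with hypothetical names: generic rotation routines sit in a table of function pointers, and a platform-specific initialisation step could overwrite individual entries with optimized versions. The functions are named by direction; which of them corresponds to Qt's 90 and 270 degree entries is not asserted here. Strides are in pixels, and the destination is h pixels wide and w pixels tall.

```cpp
#include <cstdint>

typedef void (*MemRotateFunc)(const uint16_t *src, int w, int h, int srcStride,
                              uint16_t *dest, int destStride);

// Source pixel (x, y) lands in destination row x, column h - 1 - y.
void rotateClockwiseGeneric(const uint16_t *src, int w, int h, int srcStride,
                            uint16_t *dest, int destStride)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            dest[x * destStride + (h - 1 - y)] = src[y * srcStride + x];
}

// Source pixel (x, y) lands in destination row w - 1 - x, column y.
void rotateCounterClockwiseGeneric(const uint16_t *src, int w, int h, int srcStride,
                                   uint16_t *dest, int destStride)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            dest[(w - 1 - x) * destStride + y] = src[y * srcStride + x];
}

enum Rotation { RotateClockwise, RotateCounterClockwise, RotationCount };

// Mutable table: an optimized backend may replace entries at start-up.
MemRotateFunc memRotateTable[RotationCount] = {
    rotateClockwiseGeneric,
    rotateCounterClockwiseGeneric
};
```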
|
|
|
|
|
|
|
|
|
|
| |
The function comp_func_solid_SourceOver_neon() is used extensively by
WebKit via the calls to fillRect() of QPainter().
Implementing the function with Neon provides some performance
improvement (around 175% of the previous speed).
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
|
|
| |
This function is used quite a lot by WebKit animations; the SSE2
implementation is twice as fast in those use cases.
Reviewed-by: Andreas Kling
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement a version of comp_func_SourceOver() with SSE2. This gives a
performance boost of 11% on some WebKit animations.
Two new macros were added to simplify the implementation of the
different blending primitives:
BLEND_SOURCE_OVER_ARGB32_SSE2() and
BLEND_SOURCE_OVER_ARGB32_WITH_CONST_ALPHA_SSE2()
Reviewed-by: Samuel Rødal
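
The two macros are not reproduced here. As a scalar reference for what they compute, assuming premultiplied ARGB32 pixels, source-over is `result = src + dst * (255 - srcAlpha) / 255`, and the constant-alpha variant first scales the source by the constant. The function names below are hypothetical.

```cpp
#include <cstdint>

// Multiplies every 8-bit channel of a pixel by factor / 255, with rounding.
inline uint32_t scaleChannels(uint32_t pixel, uint32_t factor)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t v = ((pixel >> shift) & 0xff) * factor;
        out |= ((v + (v >> 8) + 0x80) >> 8) << shift;
    }
    return out;
}

// Source-over for premultiplied pixels. No channel can overflow as long
// as every source channel is <= the source alpha.
inline uint32_t sourceOver(uint32_t dst, uint32_t src)
{
    return src + scaleChannels(dst, 255 - (src >> 24));
}

// Same operation with an extra constant opacity in [0, 255].
inline uint32_t sourceOverConstAlpha(uint32_t dst, uint32_t src, uint32_t constAlpha)
{
    return sourceOver(dst, scaleChannels(src, constAlpha));
}
```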
|
|
|
|
|
|
|
|
|
| |
When stretching a subrect of a pixmap we need to clamp the sampling
to the subrect. This was done for the ARGB32_Premultiplied target
format but not for the generic fallback. This patch adapts the
code so that the two code paths are equivalent.
Reviewed-by: Samuel
|
|
|
|
|
|
|
|
|
|
| |
The commit 2245641baa58125b57faf12496bc472491565498 was not correct
as the x2 (and y2) were not correct on the first line (the value was
1 instead of 0)
Fixes tst_QImage::smoothScale3
Reviewed-by: Samuel Rødal
|
|
|
|
|
| |
Reviewed-by: Eskil
Task: http://bugreports.qt.nokia.com/browse/QTBUG-7596
|
|
|
|
|
|
|
|
|
| |
Since we know that x1 and y1 are between the bounds, and that x2 and
y2 are bigger, we only need to check the upper bound for x2 and y2.
This gives 5% speedup on a trace from the QML samegame demo.
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
|
| |
And apply the same optimisation as in 6a9e6f4780f5fc3aaf34f93c85de575f81c91e48
for duplicated code
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
| |
3% faster on a trace from the qml samegame.
Reviewed-by: Samuel Rødal
|
|
|
|
|
|
|
|
| |
This makes for example linear gradient blending on top of RGB16
156 % faster (from 20.4 fps to 52.3 fps in my benchmark).
Task-number: QTBUG-6684
Reviewed-by: Gunnar Sletta
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before:
:/traces/qmlphoneconcept.trace, iterations: 5, frames: 48, min(ms):
1207, median(ms): 1212, stddev: 0,165153 %, max(fps): 39,768020
After:
traces/qmlphoneconcept.trace, iterations: 3, frames: 48, min(ms): 884,
median(ms): 886, stddev: 0,383097 %, max(fps): 54,298643
Task-number: QTBUG-6684
Reviewed-by: Gunnar Sletta
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On the N900 16 bit text blending is 30 - 50 % faster, and ARGB32PM
on RGB16 image blending now runs in 1/10th of the time it used to.
We now make ARGB32PM the default pixmap format for alpha pixmaps instead
of ARGB8565PM which is unaligned and bad for performance.
The relevant numbers:
Mostly opaque pixels:
ARGB24 on ARGB24 using QPainter..................: 336,813033
ARGB32 on ARGB32 using QPainter.................: 18,419387
RGB16 on ARGB24 using QPainter..................: 167,301014
RGB16 on ARGB32 using QPainter..................: 17,279372
ARGB24 on RGB16 using QPainter..................: 35,100147
ARGB32PM on RGB16 using QPainter................: 15,924256
No opaque pixels:
ARGB24 on ARGB24 using QPainter..................: 412,190765
ARGB32 on ARGB32 using QPainter.................: 16,818389
RGB16 on ARGB24 using QPainter..................: 170,957878
RGB16 on ARGB32 using QPainter..................: 16,742984
ARGB24 on RGB16 using QPainter..................: 93,600482
ARGB32PM on RGB16 using QPainter................: 15,999310
So switching to ARGB32PM should give a boost in all areas.
Task-number: QTBUG-6684
Reviewed-by: Gunnar Sletta
|
| |
|
|\
| |
| |
| |
| |
| | |
Conflicts:
src/s60installs/bwins/QtCoreu.def
src/s60installs/eabi/QtCoreu.def
|
| |
| |
| |
| |
| |
| |
| |
| | |
We were using gamma corrected 11 bit values instead of the 8 bit non-
corrected values, which caused some strange rendering effects.
Task-number: QTBUG-9036
Reviewed-by: Samuel
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The generic conversion is too slow for conversions to
RGB565, which are common on embedded platforms (e.g.: Maemo).
This patch enables the fast path for all conversions to rgb_565;
it is a follow-up of 7d7a85fa16b28fdba257bb466be5a6d2b4bf5d2f
Reviewed-by: Tom Cooksey
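
For context, a dedicated conversion to RGB565 can be as small as the sketch below: three shifts and masks per pixel, with no per-pixel dispatch. The function names are hypothetical and the alpha channel is simply dropped, which is an assumption about the source format rather than something the commit states.

```cpp
#include <cstdint>

// Packs the top 5/6/5 bits of the red, green and blue channels of a
// 32-bit 0xAARRGGBB pixel into a 16-bit RGB565 value.
inline uint16_t toRgb565(uint32_t c)
{
    return uint16_t(((c >> 3) & 0x001f)    // blue  -> bits 0..4
                  | ((c >> 5) & 0x07e0)    // green -> bits 5..10
                  | ((c >> 8) & 0xf800));  // red   -> bits 11..15
}

// Converts one scanline.
void convertScanlineToRgb565(uint16_t *dest, const uint32_t *src, int count)
{
    for (int i = 0; i < count; ++i)
        dest[i] = toRgb565(src[i]);
}
```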
|
| |
| |
| |
| | |
Reviewed-by: Benjamin Poulain
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
4.7-integration
* '4.7' of scm.dev.nokia.troll.no:qt/oslo-staging-1: (63 commits)
doc: Fixed some qdoc errors.
Setting ImhHiddenText for NoEcho line edits is not 100% correct, but still way better than fully visible text.
Allow building documentation without all of Qt
Added a documentation for the new enum value in gesture api.
Remove the OBJECTS_DIR variable assignment from some projets in Qt.
Fix compile
qmake/MinGw: Link statically for Qt Creator to be able to detect it.
Enable two fast path for blend_tiled_rgb565
Avoid QString reallocation for smallcaps fonts in Itemizer::generate()
Make QLabel::text a reloadable property
remove non wifi interfaces from being handled.
Disable auto-uppercasing and predictive text for password line edits.
Avoid QString reallocation in QTextEngine::itemize()
Remove the Qt 4.7 #if guards that were needed for 4.6
Always redraw the complete control when an input event comes in.
Make sure not to crash if createStandardContextMenu() returns 0 (e.g. on Maemo5)
Fix compilation: include QString in order to use QString.
Fix compile
Block the Maemo5 window attribute values from being assigned to something else on other platforms.
be more verbose when warning about incompatible libraries
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Blending ARGB8565 and RGB16 on top of RGB16 is common
on systems with 16-bit color depth. The faster
blending functions can be used instead of blend_tiled_generic.
Reviewed-by: Tom Cooksey
|
|/ / |
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
master-integration
* 'master' of scm.dev.nokia.troll.no:qt/oslo-staging-1: (35 commits)
Doc: Added a config file for creating Simplified Chinese docs directly.
Doc: add a few lines about bearer managment to "What's New" page.
Fix the SIMD implementations of QString::toLatin1()
Update of the QScriptValue autotest suite.
New data set for QScriptValue autotest generator.
Autotest: make tst_qchar run out-of-source too
Autotest: add a test for roundtrips through toLatin1/fromLatin1
Implement toLatin1_helper with Neon
QRegExp::pos() should return -1 for empty/non-matching captures
Revert "qdoc: Finished "Inherited by" list for QML elements."
Revert "qdoc: List new QML elements in \sincelist for What's New page."
Add the Unicode normalisation properties.
Autotest: add a test for QDBusPendingCallWatcher use in threads
Doc: placeholders for new feature highlights.
doc: mark as reimplemented.
Update of the QScriptValue autotest suite.
New autotests cases for QScriptValue autotests generator.
QScriptValue autotest generator templates change.
Fix license template.
QScriptValue::isQMetaObject crash fix.
...
|
| |\ \
| | |/ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Move the caching of the result from drawhelper to qsimd.cpp.
Avoid getting the environment variables when not necessary
Reviewed-by: Samuel Rødal
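
The caching pattern could be sketched as follows. Everything here is hypothetical (feature flags, function names, and the environment variable); the point is only that the expensive detection, including the environment lookup, runs once and later callers read the cached value.

```cpp
#include <cstdlib>

// Hypothetical feature flags.
enum CpuFeature { FeatureSimdA = 1, FeatureSimdB = 2 };

// Counts how often the expensive detection runs (for illustration only).
static int detectionRuns = 0;

// Expensive part: would query the CPU and consult the environment.
static unsigned detectFeaturesUncached()
{
    ++detectionRuns;
    unsigned features = FeatureSimdA | FeatureSimdB;
    // Hypothetical override variable, not a real Qt setting.
    if (std::getenv("EXAMPLE_DISABLE_SIMD_B"))
        features &= ~unsigned(FeatureSimdB);
    return features;
}

// Cheap accessor: the function-local static is initialised on first use
// only, so getenv() is not called again afterwards.
unsigned cpuFeatures()
{
    static const unsigned cached = detectFeaturesUncached();
    return cached;
}
```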
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The SIMD instructions are useful outside the painting code, so
the common functions are moved to QtCore
Reviewed-by: Samuel Rødal
|
| |/
|/|
| |
| | |
Reviewed-by: ogoffart
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When available, use SSE2 to blend images; computation is
done on 4 pixels at a time instead of 1 with MMX.
Performance improvements:
- Blending ARGB32 on RGB32/ARGB32, mostly opaque: 188%
- Blending ARGB32 on RGB32/ARGB32, no opaque pixels: 180%
- Blending ARGB32 on RGB32/ARGB32, with 0.5 opacity: 187%
- Blending RGB32 on RGB32/ARGB32, with 0.5 opacity: 206%
Reviewed-by: Samuel Rødal
|
| |
| |
| |
| |
| |
| | |
Put the common code together with a #define.
Remove the check for the length from comp_func_Clear_impl and
move it to qt_memfill()
|
|/
|
|
|
|
|
|
|
|
|
| |
When compiling with -fpu=softvfp+vfpv2 on Tb9.2 we were getting a
segmentation fault.
This seems to be due to an RVCT bug when it comes to int->float
(and reverse) casts. One extra level of indirection (a function
call) has to be applied.
Task-number: QTBUG-4893
Reviewed-by: TrustMe
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
4.6-integration
* '4.6' of scm.dev.nokia.troll.no:qt/oslo-staging-1:
QIODevice: Fix readAll()
Temporary hackiesh solution to prevent BOM in the xml data.
Fixed qxmlstream autotest when using shadow builds.
Attempt at readding the capital P headers for Phonon
Remove special Phonon processing from syncqt.
Use the lowercase/shortname.h headers for Phonon includes
Fixes a crash when setting focus on a widget with a focus proxy.
Update copyright year to 2010
doc: Clarified activeSubControls and subControls.
Remove warning "statement with no effect"
doc: Clarified that .lnk files are System files on Windows.
|
| |
| |
| |
| | |
Reviewed-by: Trust Me
|
|/
|
|
|
|
|
| |
We should check for the fully opaque and fully transparent special
cases, like we do in the dedicated image blend functions.
Reviewed-by: Gunnar Sletta
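
A sketch of what such special-casing usually looks like in a span blend, assuming premultiplied ARGB32 and hypothetical function names: fully opaque source pixels are copied, fully transparent ones are skipped, and only the remaining pixels pay for the multiplication.

```cpp
#include <cstdint>

// Multiplies every 8-bit channel of a pixel by factor / 255, with rounding.
inline uint32_t scaleChannels(uint32_t pixel, uint32_t factor)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t v = ((pixel >> shift) & 0xff) * factor;
        out |= ((v + (v >> 8) + 0x80) >> 8) << shift;
    }
    return out;
}

// Source-over blend of a span with shortcuts for alpha == 255 and alpha == 0.
void blendSpanSourceOver(uint32_t *dest, const uint32_t *src, int length)
{
    for (int i = 0; i < length; ++i) {
        const uint32_t s = src[i];
        const uint32_t alpha = s >> 24;
        if (alpha == 0xff)
            dest[i] = s;                                       // opaque: plain copy
        else if (alpha != 0)
            dest[i] = s + scaleChannels(dest[i], 255 - alpha); // general case
        // alpha == 0: destination is left untouched
    }
}
```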
|