path: root/src/gui/painting/qdrawhelper_neon.cpp
Commit message | Author | Age | Files | Lines
* Use NEON and preloading for 16 bit small / medium sized image blits. | Samuel Rødal | 2010-09-01 | 1 | -0/+98
  This gives a nice speedup for blitting of small and medium sized images by using preloading and avoiding function call overhead to memcpy for each scanline. For larger image widths memcpy becomes more efficient. Speedups of up to 40 % for 64 pixel wide images were measured. For image widths between 2 and 16 the speedup ranges between 12 % and 28 %.
  Task-number: QT-3401
  Reviewed-by: Benjamin Poulain <benjamin.poulain@nokia.com>
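The trade-off described in this commit can be sketched in portable C++. This is only an illustration and not the committed code: the function name `blit16Sketch`, the constant `kSmallBlitMaxWidth` and its value are hypothetical (the message only says memcpy wins for "larger" widths), strides are given in pixels, and a plain copy loop plus a compiler prefetch hint stands in for the NEON loads and preload instructions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical cut-over width; the commit message does not state the real one.
constexpr int kSmallBlitMaxWidth = 128;

// Copies a w x h block of 16-bit pixels. Strides are in pixels, not bytes.
inline void blit16Sketch(std::uint16_t *dst, std::size_t dstStride,
                         const std::uint16_t *src, std::size_t srcStride,
                         int w, int h)
{
    for (int y = 0; y < h; ++y) {
#if defined(__GNUC__) || defined(__clang__)
        // Ask for the next source scanline while the current one is copied.
        if (y + 1 < h)
            __builtin_prefetch(src + srcStride);
#endif
        if (w <= kSmallBlitMaxWidth) {
            // Small widths: copy inline and skip the per-scanline memcpy call.
            for (int x = 0; x < w; ++x)
                dst[x] = src[x];
        } else {
            // Large widths: let memcpy handle the scanline.
            std::memcpy(dst, src, static_cast<std::size_t>(w) * sizeof(std::uint16_t));
        }
        dst += dstStride;
        src += srcStride;
    }
}
```

Calling `blit16Sketch(dst, dstStride, src, srcStride, w, h)` copies only the first `w` pixels of each scanline, so padding beyond the image width in the destination is left untouched.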
* Implement qt_memfill32 with Neon. | Benjamin Poulain | 2010-08-25 | 1 | -0/+38
  This patch introduces an implementation of qt_memfill32 using the Neon instruction set from ARMv7. The loop is unrolled once to get better performance. This implementation of memfill is 330% faster on the N900.
  Reviewed-by: Samuel Rødal
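A minimal sketch of a 32-bit fill with a loop unrolled once, assuming the standard `arm_neon.h` intrinsics are available when `__ARM_NEON` is defined. The name `memfill32Sketch` is hypothetical and the code is not the committed qt_memfill32; on targets without NEON only the scalar loop runs.

```cpp
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Writes `value` into `count` consecutive 32-bit words starting at `dest`.
inline void memfill32Sketch(std::uint32_t *dest, std::uint32_t value, int count)
{
    int i = 0;
#if defined(__ARM_NEON)
    // Broadcast the value into all four lanes of a 128-bit register.
    const uint32x4_t v = vdupq_n_u32(value);
    // Unrolled once: two 128-bit stores (8 words) per iteration.
    for (; i + 8 <= count; i += 8) {
        vst1q_u32(dest + i, v);
        vst1q_u32(dest + i + 4, v);
    }
#endif
    // Scalar tail; also the complete fallback when NEON is unavailable.
    for (; i < count; ++i)
        dest[i] = value;
}
```

The scalar tail handles counts that are not a multiple of eight, so the function is safe for any non-negative `count`.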
* Implement the composition mode Plus with Neon. | Benjamin Poulain | 2010-08-25 | 1 | -0/+55
  On the benchmark tst_QPainter::compositionModes(), this patch gives the following improvements:
  - 300x300:opaque: 390%
  - 300x300:!opaque: 1085%
  Reviewed-by: Samuel Rødal
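The Plus composition mode adds source and destination channel by channel and clamps each channel at 255, which maps directly onto a saturating byte add. The sketch below assumes full opacity and uses hypothetical names (`plusPixel`, `compPlusSketch`); it is not the committed function. The NEON path uses `vqaddq_u8` from `arm_neon.h` when `__ARM_NEON` is defined, and the scalar path does the same arithmetic per pixel.

```cpp
#include <algorithm>
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Saturating per-channel add of two 32-bit pixels (four 8-bit channels).
inline std::uint32_t plusPixel(std::uint32_t d, std::uint32_t s)
{
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const std::uint32_t sum = ((d >> shift) & 0xffu) + ((s >> shift) & 0xffu);
        result |= std::min<std::uint32_t>(sum, 0xffu) << shift;
    }
    return result;
}

// dest[i] = saturate(dest[i] + src[i]) for `length` pixels, full opacity only.
inline void compPlusSketch(std::uint32_t *dest, const std::uint32_t *src, int length)
{
    int i = 0;
#if defined(__ARM_NEON)
    // Four pixels (16 channels) per iteration with a saturating byte add.
    for (; i + 4 <= length; i += 4) {
        const uint8x16_t d = vreinterpretq_u8_u32(vld1q_u32(dest + i));
        const uint8x16_t s = vreinterpretq_u8_u32(vld1q_u32(src + i));
        vst1q_u32(dest + i, vreinterpretq_u32_u8(vqaddq_u8(d, s)));
    }
#endif
    for (; i < length; ++i)
        dest[i] = plusPixel(dest[i], src[i]);
}
```

Because the add saturates per byte, the channel order inside the pixel does not matter for this mode.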
* Improved performance of 16 bit memrotates using NEON instructions. | Samuel Rødal | 2010-07-01 | 1 | -0/+144
  Turn the memrotate functions into a function pointer table so that entries can be replaced with optimized versions, and implement an optimized NEON version for the 90 and 270 degree rotations. Measured performance improvement for a 400x400 16-bit pixmap was 17 % for 270 degree rotation and 11 % for 90 degree rotation.
  Reviewed-by: Trond
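The function pointer table idea can be sketched as follows. All names here (`MemRotateFunc`, `memRotateTable`, `installOptimizedRotations`, and the rotation helpers) are hypothetical and do not reflect Qt's actual declarations. The rotation direction is chosen arbitrarily for the sketch, strides are in pixels, and the "optimized" entry simply forwards to the generic code as a stand-in for a NEON routine.

```cpp
#include <cstdint>

// Signature shared by every entry in the table (strides in pixels).
using MemRotateFunc = void (*)(const std::uint16_t *src, int w, int h, int srcStride,
                               std::uint16_t *dest, int destStride);

// Generic rotation: source column x becomes destination row (w - 1 - x).
inline void rotate90Generic(const std::uint16_t *src, int w, int h, int srcStride,
                            std::uint16_t *dest, int destStride)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            dest[(w - 1 - x) * destStride + y] = src[y * srcStride + x];
}

// Generic rotation in the opposite direction.
inline void rotate270Generic(const std::uint16_t *src, int w, int h, int srcStride,
                             std::uint16_t *dest, int destStride)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            dest[x * destStride + (h - 1 - y)] = src[y * srcStride + x];
}

// Stand-in for an optimized routine; a real one would use NEON loads and stores.
inline void rotate90Optimized(const std::uint16_t *src, int w, int h, int srcStride,
                              std::uint16_t *dest, int destStride)
{
    rotate90Generic(src, w, h, srcStride, dest, destStride);
}

enum RotationSlot { Rotate90 = 0, Rotate270 = 1, RotationSlotCount = 2 };

// The table starts with generic entries; callers always go through it.
static MemRotateFunc memRotateTable[RotationSlotCount] = { rotate90Generic, rotate270Generic };

// Swaps in optimized entries once the CPU feature is known to be present.
inline void installOptimizedRotations(bool haveNeon)
{
    if (haveNeon)
        memRotateTable[Rotate90] = rotate90Optimized;
}
```

Callers invoke `memRotateTable[Rotate90](...)` without knowing which implementation is installed, so an optimized routine can replace the generic one at startup without touching call sites.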
* Add an implementation of comp_func_solid_SourceOver_neon() with Neon. | Benjamin Poulain | 2010-06-23 | 1 | -0/+43
  The function comp_func_solid_SourceOver_neon() is used extensively by WebKit via calls to QPainter's fillRect(). Implementing the function with Neon provides some performance improvement (around 175% of the previous speed).
  Reviewed-by: Samuel Rødal
* Optimized ARGB32PM on RGB16 blending with opacity using NEON. | Samuel Rødal | 2010-03-26 | 1 | -5/+32
  Use the blend_8_pixels_argb32_on_rgb16_neon function that was introduced in an earlier commit.
  Before: traces/qmlsamegame.trace, iterations: 3, frames: 15, min(ms): 63, median(ms): 64, stddev: 1,275776 %, max(fps): 238,095238
  After: traces/qmlsamegame.trace, iterations: 3, frames: 15, min(ms): 57, median(ms): 58, stddev: 0,817464 %, max(fps): 263,157895
  Task-number: QTBUG-6684
  Reviewed-by: Gunnar Sletta
* Optimized SourceOver and 16 bit dest fetches, dest stores using NEON. | Samuel Rødal | 2010-03-26 | 1 | -11/+114
  This makes, for example, linear gradient blending on top of RGB16 156 % faster (from 20.4 fps to 52.3 fps in my benchmark).
  Task-number: QTBUG-6684
  Reviewed-by: Gunnar Sletta
* Optimized scaled/transformed image blending for ARGB32PM and RGB16 on RGB16. | Samuel Rødal | 2010-03-26 | 1 | -0/+139
  Before: :/traces/qmlphoneconcept.trace, iterations: 5, frames: 48, min(ms): 1207, median(ms): 1212, stddev: 0,165153 %, max(fps): 39,768020
  After: traces/qmlphoneconcept.trace, iterations: 3, frames: 48, min(ms): 884, median(ms): 886, stddev: 0,383097 %, max(fps): 54,298643
  Task-number: QTBUG-6684
  Reviewed-by: Gunnar Sletta
* Included ARM NEON optimizations from pixman in Qt. | Samuel Rødal | 2010-03-26 | 1 | -44/+100
  On the N900 16 bit text blending is 30 - 50 % faster, and ARGB32PM on RGB16 image blending now runs in 1/10th of the time it used to. We now make ARGB32PM the default pixmap format for alpha pixmaps instead of ARGB8565PM which is unaligned and bad for performance.

  The relevant numbers:

  Mostly opaque pixels:
  ARGB24 on ARGB24 using QPainter..................: 336,813033
  ARGB32 on ARGB32 using QPainter.................: 18,419387
  RGB16 on ARGB24 using QPainter..................: 167,301014
  RGB16 on ARGB32 using QPainter..................: 17,279372
  ARGB24 on RGB16 using QPainter..................: 35,100147
  ARGB32PM on RGB16 using QPainter................: 15,924256

  No opaque pixels:
  ARGB24 on ARGB24 using QPainter..................: 412,190765
  ARGB32 on ARGB32 using QPainter.................: 16,818389
  RGB16 on ARGB24 using QPainter..................: 170,957878
  RGB16 on ARGB32 using QPainter..................: 16,742984
  ARGB24 on RGB16 using QPainter..................: 93,600482
  ARGB32PM on RGB16 using QPainter................: 15,999310

  So switching to ARGB32PM should give a boost in all areas.
  Task-number: QTBUG-6684
  Reviewed-by: Gunnar Sletta
* Fixed off-by-one blending errors in the NEON drawhelper code. | Samuel Rødal | 2010-02-19 | 1 | -65/+65
  For example, blending alpha 0xff with alpha 0xff and a 0.5 opacity gave a result of 0xfe instead of the correct 0xff. This caused some autotests to fail on ARM/NEON.
  Reviewed-by: TrustMe
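The commit message does not say what caused the error, so the following is only an illustration of one common way such off-by-one results arise in 8-bit blending: replacing the division by 255 with a plain shift by 8. The helper names are hypothetical, and the rounding formula shown is a widely used approximation, not necessarily the fix applied here.

```cpp
#include <cstdint>

// Truncating approximation: treats 255 as if it were 256.
inline std::uint32_t mulShift8(std::uint32_t a, std::uint32_t b)
{
    return (a * b) >> 8;
}

// Rounded division by 255 using only adds and shifts.
inline std::uint32_t mulDiv255(std::uint32_t a, std::uint32_t b)
{
    const std::uint32_t x = a * b;
    return (x + (x >> 8) + 0x80u) >> 8;
}
```

With both inputs at 0xff, `mulShift8` yields 0xfe while `mulDiv255` yields 0xff, which is the kind of one-step difference the commit message describes.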
* Update copyright year to 2010 | Jason McDonald | 2010-01-06 | 1 | -1/+1
  Reviewed-by: Trust Me
* NEON configure detection and initial blend function implementations. | Samuel Rødal | 2009-12-18 | 1 | -0/+260
  Adds new NEON configure test and -no-neon configure option. NEON implementations can also be turned off by setting the QT_NO_NEON environment variable.

  Performance improvements (in frames per second):
  - Blending ARGB32 on RGB32/ARGB32, mostly opaque: 71 %
  - Blending ARGB32 on RGB32/ARGB32, no opaque pixels: 108 %
  - Blending ARGB32 on RGB32/ARGB32, with 0.5 opacity: 158 %
  - Blending RGB32 on RGB32/ARGB32, with 0.5 opacity: 189 %

  Task-number: QTBUG-6684
  Reviewed-by: Gunnar Sletta
  Reviewed-by: Paul Olav Tvete
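A runtime switch like the QT_NO_NEON variable can be sketched as below. The names `BlendBackend`, `neonDisabledByValue`, `neonDisabledByEnvironment` and `chooseBlendBackend` are hypothetical, and the rule that any non-empty value disables NEON is an assumption of this sketch; the commit message only says the variable has to be set.

```cpp
#include <cstdlib>

enum class BlendBackend { Generic, Neon };

// Assumption: any non-empty value of the variable counts as "disabled".
inline bool neonDisabledByValue(const char *value)
{
    return value != nullptr && value[0] != '\0';
}

// Reads the real environment; split out so the decision logic is testable.
inline bool neonDisabledByEnvironment()
{
    return neonDisabledByValue(std::getenv("QT_NO_NEON"));
}

// Picks the NEON code path only if the CPU has it and the user did not opt out.
inline BlendBackend chooseBlendBackend(bool cpuHasNeon, const char *qtNoNeonValue)
{
    if (cpuHasNeon && !neonDisabledByValue(qtNoNeonValue))
        return BlendBackend::Neon;
    return BlendBackend::Generic;
}
```

At startup a caller would pass `std::getenv("QT_NO_NEON")` as the second argument, which keeps the choice overridable without rebuilding.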