Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Fix the blending of ARGB_PM image when using palignr to load the data | Benjamin Poulain | 2010-08-17 | 1 | -15/+5 |
| | The data loaded first was incorrect because the offset was incorrect. The correct offset should be up to the alignment point. Instead of trying to load a temporary array, we just move one vector further, since we know reading there is always safe. Reviewed-by: Andreas Kling | | | | |
* | Implement the general blending of ARGB32_pm with SSSE3 | Benjamin Poulain | 2010-08-16 | 1 | -0/+263 |
| | SSSE3 provides two tools to improve the blending speed over SSE2: palignr and byte permutation. Alignment is enforced on src and dst with palignr so that every access is aligned. The alpha mask is extracted with a byte permutation, which saves two instructions per cycle. On Atom, this patch gives between 0% (aligned src) and 10% improvement (src unaligned by 4 and 12 bytes). On Core 2, this patch gives a consistent 8% to 10% improvement for every misalignment. Reviewed-by: Samuel Rødal | | | | |
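
The two SSSE3 tools named in the commit messages above can be illustrated with a portable scalar model. This is only a sketch of what the instructions compute, not the code from the patch: the names `Vec16`, `alignr_model`, `shuffle_model`, `alpha_shuffle_mask` and `load_via_alignr` are hypothetical, the pixel byte order (B, G, R, A in memory, as for ARGB32 on a little-endian machine) is an assumption, and the mask layout is one plausible choice rather than the one the patch necessarily uses. The real instructions are exposed as the intrinsics `_mm_alignr_epi8` (palignr) and `_mm_shuffle_epi8` (pshufb).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One 128-bit SSE register, modelled as 16 bytes (hypothetical helper type).
using Vec16 = std::array<std::uint8_t, 16>;

// Scalar model of palignr: concatenate hi:lo into 32 bytes, shift right by
// `shift` bytes, and keep the low 16 bytes. Bytes shifted in past the end
// of `hi` are zero.
Vec16 alignr_model(const Vec16 &hi, const Vec16 &lo, unsigned shift)
{
    Vec16 out{};
    for (unsigned i = 0; i < 16; ++i) {
        const unsigned j = i + shift;
        if (j < 16)
            out[i] = lo[j];
        else if (j < 32)
            out[i] = hi[j - 16];
    }
    return out;
}

// Scalar model of pshufb: each output byte is either zero (mask byte has
// its high bit set) or the source byte selected by the low 4 mask bits.
Vec16 shuffle_model(const Vec16 &src, const Vec16 &mask)
{
    Vec16 out{};
    for (unsigned i = 0; i < 16; ++i) {
        if (!(mask[i] & 0x80))
            out[i] = src[mask[i] & 0x0F];
    }
    return out;
}

// Assumed mask: alpha is byte 3 of each 4-byte pixel. Each alpha is copied
// into two 16-bit lanes (low byte = alpha, high byte = 0), ready to be
// multiplied against unpacked 16-bit color channels.
Vec16 alpha_shuffle_mask()
{
    return Vec16{3, 0x80, 3, 0x80, 7, 0x80, 7, 0x80,
                 11, 0x80, 11, 0x80, 15, 0x80, 15, 0x80};
}

// Unaligned 16-byte read built from two aligned vectors: `alignedBase` is
// the aligned address at or below the source pointer, `offset` the distance
// from it to the source pointer. This reads one full vector past the first
// one, so the caller must know those 32 bytes are readable.
Vec16 load_via_alignr(const std::uint8_t *alignedBase, unsigned offset)
{
    Vec16 lo{}, hi{};
    for (unsigned i = 0; i < 16; ++i) {
        lo[i] = alignedBase[i];
        hi[i] = alignedBase[16 + i];
    }
    return alignr_model(hi, lo, offset);
}
```

With a source pointer 4 bytes past an aligned address, `load_via_alignr(base, 4)` returns bytes 4 to 19 of the aligned buffer, which is the data an unaligned load would have fetched. In the vectorized version the shift count of palignr is an immediate, so a real implementation typically needs one code path per possible offset.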