[FFmpeg-devel] [PATCH 1/6] x86: huffyuvdsp: port mmx add_bytes to yasm
Michael Niedermayer
michaelni at gmx.at
Thu May 29 14:36:37 CEST 2014
On Thu, May 29, 2014 at 09:10:35AM +0000, Christophe Gisquet wrote:
> 68c to 56c.
> ---
> libavcodec/x86/huffyuvdsp.asm | 32 ++++++++++++++++++++++++++++++++
> libavcodec/x86/huffyuvdsp_init.c | 2 +-
> libavcodec/x86/huffyuvdsp_mmx.c | 32 +-------------------------------
> 3 files changed, 34 insertions(+), 32 deletions(-)
>
> diff --git a/libavcodec/x86/huffyuvdsp.asm b/libavcodec/x86/huffyuvdsp.asm
> index f183ebe..7acab87 100644
> --- a/libavcodec/x86/huffyuvdsp.asm
> +++ b/libavcodec/x86/huffyuvdsp.asm
> @@ -163,3 +163,35 @@ cglobal add_hfyu_left_pred, 3,3,7, dst, src, w, left
> ADD_HFYU_LEFT_LOOP 0, 1
> .src_unaligned:
> ADD_HFYU_LEFT_LOOP 0, 0
> +
> +INIT_MMX mmx
> +cglobal add_bytes, 3,4,4, dst, src, w, size
> + mov sizeq, wq
w is an int (32 bit), so on x86-64 this can leave trash in the high 32 bits of sizeq
> + and sizeq, -2*mmsize
same
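
To illustrate the high-bits problem, here is a minimal C sketch. It is not FFmpeg code: the function names and the register value are made up for illustration. It models what a sign-extending move such as "movsxd sizeq, wd" would compute, which is one possible way to address the comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a sign-extending 32->64 bit move: keep only the low 32 bits of
 * the register and sign-extend them. The unsigned->signed conversion is
 * implementation-defined in C but wraps on common compilers. */
static int64_t sign_extend_low32(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Hypothetical register contents: w == 100 in the low half, leftover
 * garbage in the high half. */
static void sign_extend_demo(void)
{
    uint64_t reg = 0xDEADBEEF00000000ULL | 100u;

    assert(reg != 100);                     /* the raw 64-bit value is wrong */
    assert(sign_extend_low32(reg) == 100);  /* the low 32 bits are fine */
}
```

Used as a 64-bit quantity without the extension, the value above would be a huge offset rather than 100, which is why both the mov and the and on sizeq are affected.
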
> + jz .2
> + add dstq, sizeq
> + add srcq, sizeq
> + neg sizeq
> +.1:
> + movu m0, [dstq + sizeq]
> + movu m1, [srcq + sizeq]
> + movu m2, [dstq + sizeq + mmsize]
> + movu m3, [srcq + sizeq + mmsize]
these should be mova, so that if this gets extended to SSE* it
doesn't end up with slow unaligned moves
> + paddb m1, m0
> + paddb m3, m2
> + movu [dstq + sizeq], m1
> + movu [dstq + sizeq + mmsize], m3
these too
[...]
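
For reference, a scalar C sketch of the loop structure quoted above. It is an illustration under assumptions, not the actual implementation: add_bytes_ref, is_aligned and MMSIZE are hypothetical names, and the tail loop is a guess, since that part of the patch is elided here. With INIT_MMX, x86inc maps both mova and movu to movq, so the choice only starts to matter once the function is instantiated for SSE2, where mova becomes movdqa and requires 16-byte aligned addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MMSIZE 8  /* MMX register width in bytes; would be 16 for SSE2 */

/* Nonzero if p is aligned to `align` bytes; align must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Scalar model: dst[i] += src[i] for i in [0, w). */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, int w)
{
    /* Same rounding as "and sizeq, -2*mmsize": the largest multiple of
     * 2*MMSIZE that does not exceed w. */
    ptrdiff_t blocks = (ptrdiff_t)w & -(ptrdiff_t)(2 * MMSIZE);
    ptrdiff_t i;

    /* Main loop: two registers' worth of bytes per iteration. */
    for (i = 0; i < blocks; i += 2 * MMSIZE)
        for (int j = 0; j < 2 * MMSIZE; j++)
            dst[i + j] += src[i + j];  /* like paddb: wraps modulo 256 */

    /* Assumed scalar tail for the remaining w - blocks bytes. */
    for (; i < w; i++)
        dst[i] += src[i];
}
```

A caller that wanted to pick an aligned variant could check is_aligned(dst, 16) and is_aligned(src, 16) first; whether the real callers can guarantee that alignment is not something the quoted patch shows.
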
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
While the State exists there can be no freedom; when there is freedom there
will be no State. -- Vladimir Lenin