[FFmpeg-devel] [PATCH] x86/dsputil: fix VECTOR_CLIP_INT32 macro

Michael Niedermayer michaelni at gmx.at
Fri May 23 23:05:46 CEST 2014


On Fri, May 23, 2014 at 04:05:43AM -0300, James Almer wrote:
> The inline loop was incrementing and using the value of %%i
> incorrectly, so the second unrolled iteration used wrong offsets.
> 
> Disassembly of ff_vector_clip_int32_sse2 before and after
> this patch:
> 
>     movdqa (%rdx),%xmm0      |  movdqa (%rdx),%xmm0
>     movdqa 0x10(%rdx),%xmm1  |  movdqa 0x10(%rdx),%xmm1
>     movdqa 0x20(%rdx),%xmm2  |  movdqa 0x20(%rdx),%xmm2
>     movdqa 0x30(%rdx),%xmm3  |  movdqa 0x30(%rdx),%xmm3
> [...]                        |
>     movdqa %xmm0,(%rcx)      |  movdqa %xmm0,(%rcx)
>     movdqa %xmm1,0x10(%rcx)  |  movdqa %xmm1,0x10(%rcx)
>     movdqa %xmm2,0x20(%rcx)  |  movdqa %xmm2,0x20(%rcx)
>     movdqa %xmm3,0x30(%rcx)  |  movdqa %xmm3,0x30(%rcx)
>     movdqa (%rdx),%xmm0      |  movdqa 0x40(%rdx),%xmm0
>     movdqa 0x20(%rdx),%xmm1  |  movdqa 0x50(%rdx),%xmm1
>     movdqa 0x40(%rdx),%xmm2  |  movdqa 0x60(%rdx),%xmm2
>     movdqa 0x60(%rdx),%xmm3  |  movdqa 0x70(%rdx),%xmm3
> [...]                        |
>     movdqa %xmm0,(%rcx)      |  movdqa %xmm0,0x40(%rcx)
>     movdqa %xmm1,0x20(%rcx)  |  movdqa %xmm1,0x50(%rcx)
>     movdqa %xmm2,0x40(%rcx)  |  movdqa %xmm2,0x60(%rcx)
>     movdqa %xmm3,0x60(%rcx)  |  movdqa %xmm3,0x70(%rcx)
>     add    $0x80,%rdx        |  add    $0x80,%rdx
>     add    $0x80,%rcx        |  add    $0x80,%rcx
> 
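The offsets in the disassembly above can be reproduced with a small C model. This is only a sketch inferred from the listing, not the macro source: the two formulas, the function names offset_before/offset_after, and the 16-byte MMSIZE constant are assumptions chosen to match the printed addresses.

```c
#include <assert.h>
#include <stdio.h>

#define MMSIZE 16u /* assumed SSE2 register width in bytes */

/* Assumed model of the old indexing: the counter starts at 1, steps by 1,
 * and is multiplied with the register index. */
static unsigned offset_before(unsigned iter, unsigned reg)
{
    assert(reg < 4);
    unsigned i = iter + 1;
    return MMSIZE * reg * i;
}

/* Assumed model of the fixed indexing: the counter starts at 0, steps by 4,
 * and is added to the register index. */
static unsigned offset_after(unsigned iter, unsigned reg)
{
    assert(reg < 4);
    unsigned i = iter * 4;
    return MMSIZE * (reg + i);
}

/* Print the load/store offsets for the two unrolled iterations. */
static void print_offsets(void)
{
    for (unsigned iter = 0; iter < 2; iter++)
        for (unsigned reg = 0; reg < 4; reg++)
            printf("iter %u xmm%u: before 0x%02x | after 0x%02x\n",
                   iter, reg,
                   offset_before(iter, reg), offset_after(iter, reg));
}
```

Calling print_offsets() from a main() lists the same eight offsets per column as the disassembly. Under this model the first iteration agrees in both versions, which would explain why only the second group of loads and stores differs.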
> Other versions were unaffected.
> 
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
>  libavcodec/x86/dsputil.asm | 36 ++++++++++++++++++------------------
>  1 file changed, 18 insertions(+), 18 deletions(-)

applied

thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many things microsoft did are stupid, but not doing something just because
microsoft did it is even more stupid. If everything ms did were stupid they
would be bankrupt already.