[FFmpeg-devel] [PATCH] Moves yuv2yuvX_sse3 to yasm, unrolls main loop and other small optimizations for ~20% speedup.
Alan Kelly
alankelly at google.com
Thu Jan 7 11:39:56 EET 2021
Thanks for your patience with this. I have replaced mova with movdqu, since
movu generated a compile error on ssse3. What system did this crash on?
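
For reference, the alignment theory in the quoted report can be checked by hand from the two registers that survive in the truncated dump. The C sketch below is only an illustration, not part of the patch: the helper names are hypothetical, and it assumes the `=>` marker is on the faulting instruction, `vmovdqa %ymm3,(%rcx,%rax,1)`, with rcx = 0x55555583f470 and rax = 0 as printed. A `vmovdqa` with a ymm operand needs a 32-byte-aligned address.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of align.
 * align must be a power of two. */
static int is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: effective address of (%base,%index,1),
 * i.e. base + index * 1, as in the faulting store. */
static uint64_t effective_address(uint64_t base, uint64_t index)
{
    return base + index;
}

/* Register values copied from the quoted gdb dump. */
static const uint64_t dump_rcx = 0x55555583f470ULL;
static const uint64_t dump_rax = 0x0ULL;

/* Offset of the store destination inside its 32-byte block. */
static uint64_t store_misalignment(void)
{
    return effective_address(dump_rcx, dump_rax) % 32;
}
```

With these values the destination is 16 bytes past a 32-byte boundary: it is 16-byte aligned but not 32-byte aligned. That would be fine for an xmm-sized aligned store but not for the ymm-sized one, which is consistent with the alignment explanation given in the report.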
On Wed, Jan 6, 2021 at 9:10 PM Michael Niedermayer <michael at niedermayer.cc> wrote:
> On Tue, Jan 05, 2021 at 01:31:25PM +0100, Alan Kelly wrote:
> > Ping!
>
> crashes (due to alignment, I think)
>
> (gdb) disassemble $rip-32,$rip+32
> Dump of assembler code from 0x5555555730a1 to 0x5555555730e1:
> 0x00005555555730a1 <ff_yuv2yuvX_avx2+161>: int $0x71
> 0x00005555555730a3 <ff_yuv2yuvX_avx2+163>: out %al,$0x3
> 0x00005555555730a5 <ff_yuv2yuvX_avx2+165>: vpsraw $0x3,%ymm1,%ymm1
> 0x00005555555730aa <ff_yuv2yuvX_avx2+170>: vpackuswb %ymm4,%ymm3,%ymm3
> 0x00005555555730ae <ff_yuv2yuvX_avx2+174>: vpackuswb %ymm1,%ymm6,%ymm6
> 0x00005555555730b2 <ff_yuv2yuvX_avx2+178>: mov (%rdi),%rdx
> 0x00005555555730b5 <ff_yuv2yuvX_avx2+181>: vpermq $0xd8,%ymm3,%ymm3
> 0x00005555555730bb <ff_yuv2yuvX_avx2+187>: vpermq $0xd8,%ymm6,%ymm6
> => 0x00005555555730c1 <ff_yuv2yuvX_avx2+193>: vmovdqa %ymm3,(%rcx,%rax,1)
> 0x00005555555730c6 <ff_yuv2yuvX_avx2+198>: vmovdqa %ymm6,0x20(%rcx,%rax,1)
> 0x00005555555730cc <ff_yuv2yuvX_avx2+204>: add $0x40,%rax
> 0x00005555555730d0 <ff_yuv2yuvX_avx2+208>: mov %rdi,%rsi
> 0x00005555555730d3 <ff_yuv2yuvX_avx2+211>: cmp %r8,%rax
> 0x00005555555730d6 <ff_yuv2yuvX_avx2+214>: jb 0x55555557304d <ff_yuv2yuvX_avx2+77>
> 0x00005555555730dc <ff_yuv2yuvX_avx2+220>: vzeroupper
> 0x00005555555730df <ff_yuv2yuvX_avx2+223>: retq
> 0x00005555555730e0 <yuv2rgb_c_48+0>: push %r15
> End of assembler dump.
> (gdb) info all-registers
> rax 0x0 0
> rbx 0x0 0
> rcx 0x55555583f470 93824995292272
>
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Modern terrorism, a quick summary: Need oil, start war with country that
> has oil, kill hundred thousand in war. Let country fall into chaos,
> be surprised about rise of fundamentalists. Drop more bombs, kill more
> people, be surprised about them taking revenge and drop even more bombs
> and strip your own citizens of their rights and freedoms. to be continued