[FFmpeg-devel] [PATCH] Add x86-optimized versions of lshift_tab().
Loren Merritt
lorenm
Sat Feb 12 06:58:57 CET 2011
>+%macro AC3_LSHIFT_INT16 1
>+cglobal ac3_lshift_int16_%1, 3,3,5, src, offset, lshift
>+ cmp lshiftd, 0
>+ je .end
>+ shl offsetq, 1
>+ sub offsetq, mmsize*4
>+%ifdef ARCH_X86_64
>+ ; move lshift value from register to stack, then to m0
>+ sub rsp, 4
>+ mov [rsp], lshiftd
>+ movd m0, [rsp]
>+ add rsp, 4
>+%else
>+ ; move lshift value directly from stack to m0
>+ movd m0, lshiftm
>+%endif
movd can take a GPR as its source (movd m0, lshiftd), so the round trip through the stack is unnecessary.
>+.loop
>+ mova m1, [srcq+offsetq ]
>+ mova m2, [srcq+offsetq+mmsize ]
>+ mova m3, [srcq+offsetq+mmsize*2]
>+ mova m4, [srcq+offsetq+mmsize*3]
>+ psllw m1, m0
>+ psllw m2, m0
>+ psllw m3, m0
>+ psllw m4, m0
>+ mova [srcq+offsetq ], m1
>+ mova [srcq+offsetq+mmsize ], m2
>+ mova [srcq+offsetq+mmsize*2], m3
>+ mova [srcq+offsetq+mmsize*3], m4
>+ sub offsetq, mmsize*4
>+ jge .loop
>+.end
Labels need a trailing colon: .loop: and .end:
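
For reference, a scalar C sketch of what the quoted loop appears to compute: an in-place left shift of every int16 element, skipped entirely when the shift is zero. The name ac3_lshift_int16_ref and the len parameter are hypothetical and not taken from the patch. The sketch assumes the shift count is below 16 and that the asm's offset argument counts int16 elements, which the `shl offsetq, 1` conversion to bytes suggests. Unlike the SIMD loop, it does not require the length to be a multiple of the unrolled block size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not part of the patch.
 * Shifts each of the len int16 elements of src left by lshift bits, in place.
 * Assumes lshift < 16. */
static void ac3_lshift_int16_ref(int16_t *src, size_t len, unsigned lshift)
{
    if (lshift == 0)
        return; /* mirrors the early "je .end" in the asm */

    for (size_t i = 0; i < len; i++) {
        /* Shift as unsigned so bits pushed past bit 15 are simply dropped,
         * which is what psllw does per 16-bit lane. */
        uint16_t v = (uint16_t)src[i];
        src[i] = (int16_t)(uint16_t)(v << lshift);
    }
}
```

Comparing the output of the SIMD versions against a loop like this on random buffers is one simple way to check them.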
--Loren Merritt