[FFmpeg-devel] [PATCH] Add x86-optimized versions of lshift_tab().

Loren Merritt lorenm
Sat Feb 12 06:58:57 CET 2011


>+%macro AC3_LSHIFT_INT16 1
>+cglobal ac3_lshift_int16_%1, 3,3,5, src, offset, lshift
>+    cmp   lshiftd, 0
>+    je .end
>+    shl   offsetq, 1
>+    sub   offsetq, mmsize*4
>+%ifdef ARCH_X86_64
>+    ; move lshift value from register to stack, then to m0
>+    sub       rsp, 4
>+    mov     [rsp], lshiftd
>+    movd       m0, [rsp]
>+    add       rsp, 4
>+%else
>+    ; move lshift value directly from stack to m0
>+    movd       m0, lshiftm
>+%endif

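movd can take a gpr as its source, so the x86_64 branch can just be
"movd m0, lshiftd" with no round trip through the stack.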
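To illustrate the point outside of x86inc, here is a minimal C sketch using SSE2 intrinsics. It is not the patch's code: the function name lshift_int16_sketch, the use of unaligned loads, and the assumptions that len is a multiple of 8 and lshift is in 0..15 are all mine. Compilers typically turn _mm_cvtsi32_si128 into a movd from a general-purpose register, which is the transfer the comment above refers to; a scalar fallback is included so the sketch still builds where SSE2 is unavailable.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not FFmpeg's API): shift every int16 in src left by
 * lshift bits, in place. Assumes len is a multiple of 8 and lshift <= 15. */
static void lshift_int16_sketch(int16_t *src, size_t len, unsigned lshift)
{
#if defined(__SSE2__)
    /* Integer register -> vector register; usually emitted as
     * "movd xmm, r32", with no store/reload through the stack. */
    const __m128i count = _mm_cvtsi32_si128((int)lshift);

    for (size_t i = 0; i < len; i += 8) {
        /* Unaligned load/store keeps the sketch simple; the patch uses
         * aligned moves (mova). */
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        v = _mm_sll_epi16(v, count);    /* psllw with the count in a vector register */
        _mm_storeu_si128((__m128i *)(src + i), v);
    }
#else
    /* Portable fallback with the same result for lshift <= 15. */
    for (size_t i = 0; i < len; i++)
        src[i] = (int16_t)(uint16_t)((uint32_t)(uint16_t)src[i] << lshift);
#endif
}
```

The shift count is moved into the vector register once, before the loop, which mirrors how the patch sets up m0 ahead of .loop.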

>+.loop
>+    mova       m1, [srcq+offsetq         ]
>+    mova       m2, [srcq+offsetq+mmsize  ]
>+    mova       m3, [srcq+offsetq+mmsize*2]
>+    mova       m4, [srcq+offsetq+mmsize*3]
>+    psllw      m1, m0
>+    psllw      m2, m0
>+    psllw      m3, m0
>+    psllw      m4, m0
>+    mova  [srcq+offsetq         ], m1
>+    mova  [srcq+offsetq+mmsize  ], m2
>+    mova  [srcq+offsetq+mmsize*2], m3
>+    mova  [srcq+offsetq+mmsize*3], m4
>+    sub   offsetq, mmsize*4
>+    jge .loop
>+.end

Put a colon after each label (.loop:, .end:).

--Loren Merritt


