[FFmpeg-devel] [PATCH] avfilter: add hflip x86 SIMD
Paul B Mahol
onemda at gmail.com
Sun Dec 3 22:13:12 EET 2017
On 12/3/17, Paul B Mahol <onemda at gmail.com> wrote:
> On 12/3/17, Martin Vignali <martin.vignali at gmail.com> wrote:
>> Maybe the problem comes from the skip part:
>>
>>> +INIT_XMM ssse3
>>> +cglobal hflip_byte, 3, 5, 3, src, dst, w, x, v
>>> + mova m0, [pb_flip_byte]
>>> + mov xq, 0
>>> + mov wd, dword wm
>>> + sub wq, 2 * mmsize
>>> + cmp wq, mmsize
>>> + jl .skip
>>> +
>>> + .loop0:
>>> + neg xq
>>> + movu m1, [srcq + xq - mmsize + 1]
>>> + movu m2, [srcq + xq - 2 * mmsize + 1]
>>> + pshufb m1, m0
>>> + pshufb m2, m0
>>> + neg xq
>>> + movu [dstq + xq ], m1
>>> + movu [dstq + xq + mmsize], m2
>>> + add xq, mmsize * 2
>>> + cmp xq, wq
>>> + jl .loop0
>>> +
>>> +.skip:
>>> + add wq, 2 * mmsize
>>>
>>
>> ==> use xq instead of wq?
>
> Nope.
>
>>
>>
>>> + .loop1:
>>> + neg xq
>>> + mov vb, [srcq + xq]
>>> + neg xq
>>> + mov [dstq + xq], vb
>>> + add xq, 1
>>> + cmp xq, wq
>>> + jl .loop1
>>> +RET
>>> +
>>> +cglobal hflip_short, 3, 5, 3, src, dst, w, x, v
>>> + mova m0, [pb_flip_short]
>>> + mov xq, 0
>>> + mov wd, dword wm
>>> + add wq, wq
>>> + sub wq, 2 * mmsize
>>> + cmp wq, mmsize
>>> + jl .skip
>>> +
>>> + .loop0:
>>> + neg xq
>>> + movu m1, [srcq + xq - mmsize + 2]
>>> + movu m2, [srcq + xq - 2 * mmsize + 2]
>>> + pshufb m1, m0
>>> + pshufb m2, m0
>>> + neg xq
>>> + movu [dstq + xq ], m1
>>> + movu [dstq + xq + mmsize], m2
>>> + add xq, mmsize
>>> + cmp xq, wq
>>> + jl .loop0
>>> +
>>> +.skip:
>>> + add wq, 2 * mmsize
>>>
>>
>>
>> ==> same here?
>
> Nope, this is for the case when the width is not a multiple of mmsize.
>
Can I get a final verdict? I would like to move on to other things.
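
For reference, here is a rough C sketch of the main-loop/tail-loop split
discussed above. It is only an illustration, not the patch code: BLOCK stands
in for mmsize, and flip_block and hflip_byte_sketch are hypothetical names.
It assumes, as the quoted scalar loop suggests, that src points at the last
byte of the row and is read backwards. The tail loop compares against the
full width (what "add wq, 2 * mmsize" restores), which is why wq rather than
xq is adjusted after .skip.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK 16 /* stands in for mmsize of an XMM register (assumption) */

/* Hypothetical helper: reverse one BLOCK-byte chunk, roughly what a
 * pshufb with a byte-reversal mask does to a loaded register. */
static void flip_block(const uint8_t *src_lo, uint8_t *dst)
{
    for (int i = 0; i < BLOCK; i++)
        dst[i] = src_lo[BLOCK - 1 - i];
}

/* src points at the LAST byte of the input row, dst at the first byte
 * of the output row, w is the row width in bytes. */
static void hflip_byte_sketch(const uint8_t *src, uint8_t *dst, ptrdiff_t w)
{
    ptrdiff_t x = 0;

    /* Main loop: two blocks per iteration, entered only while both fit. */
    for (; x + 2 * BLOCK <= w; x += 2 * BLOCK) {
        flip_block(src - x - BLOCK + 1, dst + x);
        flip_block(src - x - 2 * BLOCK + 1, dst + x + BLOCK);
    }

    /* Tail loop: leftover bytes, bounded by the full width w. */
    for (; x < w; x++)
        dst[x] = src[-x];
}
```

With w = 37, the main loop handles the first 32 bytes and the tail loop the
remaining 5; with w < 32, only the tail loop runs.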