[FFmpeg-devel] [PATCH] x86/hevc_sao: make sao_band_filter work on x86_32

Christophe Gisquet christophe.gisquet at gmail.com
Sun Feb 8 19:41:36 CET 2015


Hi,

2015-02-08 18:48 GMT+01:00 James Almer <jamrial at gmail.com>:
>>> +    %assign MMSIZE mmsize
>>
>> Why do that? Not a big deal: it's only for my education, if there's
>> something I'm missing.
>
> For width 48, the COMPUTE macro is last run after an INIT_XMM cpuname, so mmsize becomes
> 16, and in the avx2 version the instructions would access the wrong data on the stack.
> Doing %assign MMSIZE mmsize at the beginning of the function and using it here makes sure
> it's always 32 in avx2.
> sse2 is unaffected by this, of course.
>
> And the reason I'm using INIT_XMM in the middle of the function for the avx2 width 48 case
> is because I couldn't find a nice and clean way to use the xm* reg aliases with the COMPUTE
> macros.

OK. Strange that it still compiled fine on Win32 here, but I haven't
looked at the generated code, and I can't run the avx2 version.
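
For anyone following along, the snapshot behaviour James describes can be
reproduced with the NASM preprocessor alone. This is only a minimal sketch,
not the code from the patch: INIT_YMM_DEMO, INIT_XMM_DEMO, LAZY_SLOT and
SAFE_SLOT are hypothetical names, and the two INIT_*_DEMO macros model
nothing of x86inc's INIT_YMM/INIT_XMM beyond the redefinition of mmsize.

```nasm
; Standalone preprocessor sketch; should assemble with: nasm -f bin demo.asm
; All *_DEMO and *_SLOT names are hypothetical, for illustration only.

%macro INIT_YMM_DEMO 0
    %define mmsize 32              ; 32-byte (ymm) registers
%endmacro

%macro INIT_XMM_DEMO 0
    %define mmsize 16              ; 16-byte (xmm) registers
%endmacro

INIT_YMM_DEMO
%assign MMSIZE mmsize              ; evaluated once, here: MMSIZE is now 32
%define LAZY_SLOT(n) (n)*mmsize    ; expanded at use: follows the current mmsize
%define SAFE_SLOT(n) (n)*MMSIZE    ; uses the snapshot taken above

INIT_XMM_DEMO                      ; mid-function switch, as in the width 48 case

%if mmsize != 16
    %error "mmsize should follow the last INIT_*_DEMO"
%endif
%if MMSIZE != 32
    %error "MMSIZE should keep the value captured by %assign"
%endif

slot_lazy: dd LAZY_SLOT(3)         ; 3*16 = 48, wrong stride for 32-byte spills
slot_safe: dd SAFE_SLOT(3)         ; 3*32 = 96, the intended stack offset
```

The two %if checks pass silently when the file is assembled, which is the
point: an offset computed from mmsize after the switch is halved, while one
computed from MMSIZE is not.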

>>> +cglobal hevc_sao_band_filter_%1_8, 6, 6, 15, 8*mmsize*ARCH_X86_32, dst, src, dststride, srcstride, offset, left
>>>      HEVC_SAO_BAND_FILTER_INIT 8
>>
>> Why do you need room for 8 regs, and not 7?
>
> Remnant from before I realized I could keep m7 untouched. I'll change it.

If that's the only change, then, unless someone complains, just push
that version.

Thanks,
-- 
Christophe

