[FFmpeg-devel] [PATCH 5/6] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

James Almer jamrial at gmail.com
Wed Feb 4 21:14:12 CET 2015


On 04/02/15 9:39 AM, Christophe Gisquet wrote:
> Are the first numbers for each case from before you split out the
> restore part? Otherwise, that's gruesome.

The benchmarks were done with every patch up to this one applied, so yes, after the 
split. The file I used for benchmarking went from ~36fps to ~46fps after this patch.
The C version must be pretty inefficient (that CMP macro inside the loop probably 
creates lots of branches). Or maybe GCC was dumb.
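
To illustrate the kind of per-pixel loop in question, here is a minimal sketch 
in C. It is a simplified, one-row version written for this explanation and not 
the code from the tree: the macro body, the edge_idx table, clip_u8() and 
sao_edge_row() are all assumptions made for illustration. Each pixel is 
compared against two neighbours, and the two three-way results select an 
offset. Whether the compiler turns the comparisons into branches or into 
setcc-style code depends on the compiler and flags.

```c
#include <stdint.h>

/* Assumed three-way compare: yields -1, 0 or 1. */
#define CMP(a, b) (((a) > (b)) - ((a) < (b)))

/* Hypothetical helper: clamp to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical one-row edge filter using the left and right neighbours.
 * src[-1] and src[width] must be readable.
 * offsets[0] is the "no edge" entry, offsets[1..4] the edge categories.
 */
static void sao_edge_row(uint8_t *dst, const uint8_t *src, int width,
                         const int16_t offsets[5])
{
    /* Maps (diff0 + diff1 + 2) in 0..4 to an offset category. */
    static const uint8_t edge_idx[5] = { 1, 2, 0, 3, 4 };

    for (int x = 0; x < width; x++) {
        int diff0 = CMP(src[x], src[x - 1]);
        int diff1 = CMP(src[x], src[x + 1]);
        dst[x] = clip_u8(src[x] + offsets[edge_idx[2 + diff0 + diff1]]);
    }
}
```

A SIMD version can do the same classification for 16 or 32 pixels at once with 
packed compares and a byte shuffle as the table lookup, which is one plausible 
reason for a large gap between C and assembly here.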

> As seen from above, srcstride is constant and is 2*MAX_PB_SIZE +
> FF_INPUT_BUFFER_PADDING_SIZE.
> That may save you one whole gpr. Not really useful here, but I think
> you are more limited for the >8 bits case.
> If you want to exploit this, also add it above void (*sao_edge_filter[5])

Ah, good to know it's constant now. Although until we add x86_32 versions of these 
functions, it doesn't bring any real benefit.
I'll update the prototype and assembly anyway.

For that matter, do .asm files have access to FF_INPUT_BUFFER_PADDING_SIZE? If at 
some point we change its value (for example, once AVX512 code starts being 
committed), then srcstride will be something else.
Probably not a problem, since whenever that constant is updated in avcodec.h it 
can also be updated in hevc_sao.asm, but it would be nice not to have to bother 
doing that.
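
One possible safeguard, sketched here in C under stated assumptions, is a 
compile-time check on the C side that fails the build if the value hard-coded 
in the assembly drifts from the headers. The numeric values below are 
illustrative stand-ins, and SAO_ASM_SRCSTRIDE is a hypothetical name for a 
mirror of whatever constant the .asm file would use. This only detects a 
mismatch; it does not make the .asm file read the header.

```c
/* Illustrative values only: assumed for this sketch, not taken from the tree. */
#define MAX_PB_SIZE                  64
#define FF_INPUT_BUFFER_PADDING_SIZE 32

/* Hypothetical mirror of the stride hard-coded in the assembly. */
#define SAO_ASM_SRCSTRIDE            160

/* C11 compile-time check: the build breaks if the two definitions diverge. */
_Static_assert(SAO_ASM_SRCSTRIDE ==
               2 * MAX_PB_SIZE + FF_INPUT_BUFFER_PADDING_SIZE,
               "hard-coded asm srcstride no longer matches the C headers");

/* Hypothetical accessor so C callers use the same value as the assembly. */
static int sao_srcstride(void)
{
    return SAO_ASM_SRCSTRIDE;
}
```

With such a check, bumping the padding constant would turn a silent stride 
mismatch into a build error pointing at the value that needs updating.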

