[FFmpeg-devel] [PATCH 5/6] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}
James Almer
jamrial at gmail.com
Wed Feb 4 21:14:12 CET 2015
On 04/02/15 9:39 AM, Christophe Gisquet wrote:
> Are the first number for each case from before you split out the
> restore part? Otherwise, that's gruesome.
The benchmarks were done with every patch up to this one applied, so yes, after the
split. The file I used to bench went from ~36fps to ~46fps after this patch.
The C version must be pretty inefficient (that CMP macro inside the loop probably
creates lots of branches). Or maybe GCC was dumb.
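
To illustrate the branching concern, here is a minimal sketch and not the actual FFmpeg code. It assumes the comparison is written as a nested ternary, and puts it next to a branchless equivalent. The function names (sign_cmp_ternary, sign_cmp_branchless, sao_edge_class) are hypothetical. The edge class formula follows the usual HEVC SAO definition of 2 plus the two neighbour sign comparisons.

```c
#include <stdint.h>

/* Assumed branchy form: a nested ternary, which a compiler may
 * lower to conditional jumps inside the per-pixel loop. */
static int sign_cmp_ternary(int a, int b)
{
    return a > b ? 1 : (a == b ? 0 : -1);
}

/* Branchless form: each comparison evaluates to 0 or 1, so the
 * difference is -1, 0 or 1 without any conditional jump in the source. */
static int sign_cmp_branchless(int a, int b)
{
    return (a > b) - (a < b);
}

/* Hypothetical helper: edge class (0..4) of one pixel given its two
 * neighbours along the chosen SAO edge direction. */
static int sao_edge_class(uint8_t cur, uint8_t n0, uint8_t n1)
{
    return 2 + sign_cmp_branchless(cur, n0) + sign_cmp_branchless(cur, n1);
}
```

Whether the ternary form really ends up as branches depends on the compiler and flags, so this only shows the idea. The SIMD version gets the same result from packed compares.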
> As seen from above, srcstride is constant and is 2*MAX_PB_SIZE +
> FF_INPUT_BUFFER_PADDING_SIZE.
> That may save you one whole gpr. Not really useful here, but I think
> you are more limited for the >8 bits case.
> If you want to exploit this, also add it above void (*sao_edge_filter[5])
Ah, good to know it's constant now. Although until we add x86_32 versions of these
functions it doesn't bring any real benefit.
I'll update the prototype and assembly anyway.
For that matter, do .asm files have access to FF_INPUT_BUFFER_PADDING_SIZE? If at
some point we change its value (for example, once AVX512 code starts being
committed), then srcstride will be something else.
Probably not a problem since whenever that constant is updated in avcodec.h it
can also be updated in hevc_sao.asm, but it would be nice not having to bother
doing that.
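
As a rough sketch of the constant stride, the arithmetic could look like the following in C. The values 64 and 32 are assumptions for illustration only, since the real definitions live in the FFmpeg headers and may change. SAO_SRC_STRIDE and sao_src_row are hypothetical names.

```c
/* Assumed values, for illustration only. */
#define MAX_PB_SIZE 64
#define FF_INPUT_BUFFER_PADDING_SIZE 32

/* Constant source stride as described in the review:
 * 2*MAX_PB_SIZE + FF_INPUT_BUFFER_PADDING_SIZE. */
enum { SAO_SRC_STRIDE = 2 * MAX_PB_SIZE + FF_INPUT_BUFFER_PADDING_SIZE };

/* Hypothetical helper: pointer to row y of a buffer laid out with the
 * constant stride, so no stride argument has to be passed around. */
static const unsigned char *sao_src_row(const unsigned char *src, int y)
{
    return src + y * SAO_SRC_STRIDE;
}
```

With these assumed values the stride comes out to 160 bytes. If the padding constant changed, a copy of that number hardcoded in the assembly would have to be updated by hand, which is the maintenance concern raised above.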