[FFmpeg-devel] [PATCH] Add filter_limit to VP3 loop filter parameters

Jason Garrett-Glaser darkshikari
Sat Oct 4 11:05:24 CEST 2008


On Sat, Oct 4, 2008 at 1:49 AM, Diego Biurrun <diego at biurrun.de> wrote:
> On Sat, Oct 04, 2008 at 12:17:03AM -0400, David Conrad wrote:
>>
>> The C implementation of the VP3 loop filter uses a table of pre-
>> calculated values for the filter function. This doesn't work so well in
>> SIMD; it's faster to calculate the function with the limit value.
>>
>> Implementing the C version without the table is slightly faster on my
>> Penryn, but about 20% slower on P4, so it's probably not worth it to
>> replace the C version to get rid of the table completely.
>
> Depending on how much speedup you get elsewhere I think you can ignore
> P4 benchmark numbers.  The P4 is a deadborn child.

Indeed.  If you have to take the Pentium 4 into account, you have to
write custom versions of all of your assembly that use pshufw instead
of movq for moves between registers, because on the Pentium 4 moves
between SIMD registers take an absurd 6 cycles.

And since I don't think anyone is willing to do that, we can pretty
much ignore it in most other code, too.
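
For readers following the table-versus-calculation point in David's
mail, here is a minimal sketch in C of the two approaches.  It assumes
the bounding function has the triangular shape given in the Theora
specification (identity below the limit, ramping back down to zero at
twice the limit).  The names build_bounding_table, filter_direct and
TABLE_RADIUS are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes: the table covers inputs in [-127, 127]. */
#define TABLE_RADIUS 127
#define TABLE_SIZE   (2 * TABLE_RADIUS + 1)

/* Table approach: precompute the bounded value for every input once
 * per limit, then look it up with table[x + TABLE_RADIUS]. */
static void build_bounding_table(int table[TABLE_SIZE], int limit)
{
    int *center = table + TABLE_RADIUS;

    assert(limit >= 0);
    memset(table, 0, TABLE_SIZE * sizeof(table[0]));
    for (int x = 0; x <= TABLE_RADIUS; x++) {
        int v;
        if (x < limit)
            v = x;              /* small differences pass through */
        else if (x < 2 * limit)
            v = 2 * limit - x;  /* ramp back down towards zero */
        else
            v = 0;              /* large differences are left alone */
        center[x]  =  v;
        center[-x] = -v;
    }
}

/* Direct approach: compute the same value from the limit alone.
 * magnitude = max(0, limit - | |x| - limit |), with the sign of x.
 * It uses only abs, subtract and max, which map onto SIMD operations
 * more easily than a per-element table lookup. */
static int filter_direct(int x, int limit)
{
    int magnitude = limit - abs(abs(x) - limit);

    if (magnitude < 0)
        magnitude = 0;
    return x < 0 ? -magnitude : magnitude;
}
```

Both functions should agree for every x in [-127, 127] and any
non-negative limit; which one is faster in scalar C depends on the
CPU, as the Penryn and P4 numbers quoted above suggest.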

Dark Shikari



