[FFmpeg-devel] VP8 decoder optimization status

Måns Rullgård mans
Wed Jun 30 22:46:35 CEST 2010


Jason Garrett-Glaser <darkshikari at gmail.com> writes:

> On Wed, Jun 30, 2010 at 1:19 PM, Stefan Gehrer <stefan.gehrer at gmx.de> wrote:
>> On 06/30/2010 10:15 PM, Stefan Gehrer wrote:
>>>
>>> On 06/30/2010 08:54 PM, Jason Garrett-Glaser wrote:
>>>>
>>>> On Wed, Jun 30, 2010 at 8:55 AM, Stefan Gehrer<stefan.gehrer at gmx.de>
>>>> wrote:
>>>>>
>>>>> On 06/29/2010 04:09 AM, Jason Garrett-Glaser wrote:
>>>>>>
>>>>>> Here's a rough guide to what's done and what needs to be done before
>>>>>> ffmpeg's VP8 decoder is as fast as a politician running away from an
>>>>>> ethics committee.
>>>>>
>>>>> [...]
>>>>>
>>>>>> C:
>>>>>>
>>>>>> Fully convert vp5/6/7/8 arithmetic coder to bytestream: eliminate the
>>>>>> looped renormalization.
>>>>>
>>>>> Like attached?
>>>>
>>>> We should try to reuse the h264 table if possible, IMO.
>>>
>>> If we are talking about the same table (ff_h264_norm_shift*),
>>> it cannot be used as is.
>>> I think these are the options:
>>>
>>> 1. shift = 7 - av_log2_16bit(c->high);
>>>
>>> 2. shift = 7 - ff_log2_tab[c->high];
>>>
>>> 3. shift = ff_h264_norm_shift_old[c->high] + !!c->high;
>>
>> 3. shift = ff_h264_norm_shift_old[c->high] + 1;
>>
>> as c->high should never become zero.
>
> This sounds like the best option: 1) is 3 ops on x86 minimum, as is
> 2).  3) is at most two ops.

But a clz is often faster than a memory load.  Also, don't forget that
the base address of the table has to be loaded from somewhere on
everything but x86 (and on x86 too with PIC).
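
To make the comparison concrete, here is a minimal sketch of the three
ways of getting the renormalization shift for an 8-bit "high" in
1..255: the loop being replaced, a clz, and a table lookup.  All names
are made up for illustration; log2_tab is a local stand-in built at
runtime, not FFmpeg's ff_log2_tab or ff_h264_norm_shift_old, and
__builtin_clz assumes GCC or Clang.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Looped renormalization: count the left shifts needed until bit 7
 * is set.  high must be nonzero, otherwise this never terminates. */
static int shift_loop(unsigned high)
{
    int shift = 0;
    while (high < 128) {
        high <<= 1;
        shift++;
    }
    return shift;
}

/* clz variant: leading zeros of the full word, minus the bits above
 * the low byte.  __builtin_clz(0) is undefined, so high must be
 * nonzero here as well. */
static int shift_clz(unsigned high)
{
    return __builtin_clz(high) - (int)(sizeof(unsigned) * CHAR_BIT - 8);
}

/* Table variant: one memory load, plus whatever it costs to get the
 * table's base address.  The table is a hypothetical local one. */
static uint8_t log2_tab[256];

static void init_log2_tab(void)
{
    log2_tab[0] = 0;
    for (int i = 1; i < 256; i++) {
        int l = 0;
        while ((i >> (l + 1)) != 0)
            l++;
        log2_tab[i] = (uint8_t)l;   /* floor(log2(i)) */
    }
}

static int shift_table(unsigned high)
{
    return 7 - log2_tab[high];
}

/* All three variants must agree for every valid value of high. */
static void check_shift_variants(void)
{
    for (unsigned high = 1; high < 256; high++) {
        assert(shift_loop(high) == shift_clz(high));
        assert(shift_loop(high) == shift_table(high));
    }
}
```

Call init_log2_tab() once and then check_shift_variants() to confirm
that the variants agree.  Which one is actually fastest depends on the
target, so this says nothing about speed; it only shows that they
compute the same shift.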

-- 
Måns Rullgård
mans at mansr.com


