[FFmpeg-devel] [PATCH] Indeo5 decoder

Fri Apr 17 17:12:01 CEST 2009

Michael Niedermayer wrote:
> On Fri, Apr 17, 2009 at 01:44:30PM +0200, Maxim wrote:
>   
>> Michael Niedermayer wrote:
>>     
>>> On Tue, Apr 07, 2009 at 05:08:34PM +0200, Maxim wrote:
>>>   
>>>       
>>>> Michael Niedermayer wrote:
>>>>     
>>>>         
>>>>> On Tue, Apr 07, 2009 at 10:52:34AM +0200, Maxim wrote:
>>>>>   
>>>>>       
>>>>>           
>>>>>> Michael Niedermayer wrote:
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> On Mon, Apr 06, 2009 at 08:41:57PM +0200, Maxim wrote:
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>> [...]
>>>>>   
>>>>>       
>>>>>           
>>>>>>>> +
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + *  Build static indeo5 dequantization tables.
>>>>>>>> + */
>>>>>>>> +static av_cold void build_dequant_tables(void)
>>>>>>>> +{
>>>>>>>> +    int         mat, i, lev;
>>>>>>>> +    uint32_t    q1, q2, sf1, sf2;
>>>>>>>> +
>>>>>>>> +    for (mat = 0; mat < 5; mat++) {
>>>>>>>> +        /* build 8x8 intra/inter tables for all 24 quant levels */
>>>>>>>> +        for (lev = 0; lev < 24; lev++) {
>>>>>>>> +            sf1 = ivi5_scale_quant_8x8_intra[mat][lev];
>>>>>>>> +            sf2 = ivi5_scale_quant_8x8_inter[mat][lev];
>>>>>>>> +
>>>>>>>> +            for (i = 0; i < 64; i++) {
>>>>>>>> +                q1 = (ivi5_base_quant_8x8_intra[mat][i] * sf1) >> 8;
>>>>>>>> +                q2 = (ivi5_base_quant_8x8_inter[mat][i] * sf2) >> 8;
>>>>>>>> +                deq8x8_intra[mat][lev][i] = av_clip(q1, 1, 255);
>>>>>>>> +                deq8x8_inter[mat][lev][i] = av_clip(q2, 1, 255);
>>>>>>>>     
>>>>>>>>         
>>>>>>>>             
>>>>>>>>                 
>>>>>>> 1..255, but they aren't uint8_t
>>>>>>> av_clip() seems useless, and maybe the whole table precalc as well
>>>>>>>   
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> They were made uint16_t to stay compatible with Indeo4,
>>>>>> which uses 9-bit dequant tables...
>>>>>> The table precalculation should help avoid huge static tables...
>>>>>>     
>>>>>>         
>>>>>>             
>>>>> Let me clarify my question: what is gained by merging a multiply and shift
>>>>> into the table?
>>>>> Is it faster? If so, by how much?
>>>>>       
>>>>>           
>> I did some research on that! Here are the answers to your questions:
>>
>> Question: Is it faster? If so, by how much?
>>
>> Yes, it's faster. I measured the calculation "time" using the
>> START/STOP_TIMER macros. I ran two tests on two different videos: one
>> containing mostly light colors (DPS190indeo.avi) and another containing
>> mostly dark colors (haegemonia.avi). I chose these because light colors
>> require higher scalefactors and therefore a multiply by a higher
>> number.
>> The first test measured the dezicycles consumed by the inverse
>> quantization using TABLE lookup/MUL. It was run on my x86 laptop with an
>> Intel Core Duo processor at 2 GHz. Here are the raw numbers:
>>     
>
> Could you show me the code you used?
> I am interested to see how you did the MUL.
>   

In "decode_block":

START_TIMER;

q = (base_tab[pos] * scale_tab[quant]) >> 8;
q = (q) ? q : 1;

if (q != 1 && val) {
    if (val > 0)
        val = (val * q) + (q >> 1) - (q & 1);
    else
        val = (val * q) - (q >> 1) + (q & 1);
}
trvec[pos] = val;
/* track columns containing non-zero coeffs */
col_flags[pos & col_mask] |= !!val;

STOP_TIMER("inverse_quant");

The table pointers base_tab and scale_tab are set up beforehand in
"decode_band"...
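
For anyone who wants to try the two variants outside the decoder, here is
a minimal standalone sketch. It is not the decoder code: the table
contents below are made-up placeholders (the real values live in the
ivi5_* tables of the patch), and the helper names calc_quant, dequant,
build_deq_tab and deq_tab are hypothetical, introduced only for this
illustration. The arithmetic is the same as in the timed snippet above.

```c
#include <stdint.h>

/* Placeholder tables, NOT the real Indeo5 values (assumption for the demo). */
static const uint16_t base_tab[4]  = { 16, 32, 64, 128 };
static const uint16_t scale_tab[3] = { 8, 256, 512 };

/* MUL variant: derive the quantizer on the fly, never letting it reach 0. */
static int calc_quant(int pos, int quant)
{
    int q = (base_tab[pos] * scale_tab[quant]) >> 8;
    return q ? q : 1;
}

/* TABLE variant: the same quantizers, computed once up front. */
static uint16_t deq_tab[3][4];

static void build_deq_tab(void)
{
    int quant, pos;

    for (quant = 0; quant < 3; quant++)
        for (pos = 0; pos < 4; pos++)
            deq_tab[quant][pos] = (uint16_t)calc_quant(pos, quant);
}

/* Inverse quantization of one coefficient with the rounding term
 * used in the timed snippet; q == 1 and val == 0 pass through unchanged. */
static int dequant(int val, int q)
{
    if (q != 1 && val) {
        if (val > 0)
            val = (val * q) + (q >> 1) - (q & 1);
        else
            val = (val * q) - (q >> 1) + (q & 1);
    }
    return val;
}
```

After calling build_deq_tab(), dequant(val, deq_tab[quant][pos]) and
dequant(val, calc_quant(pos, quant)) give the same result in this sketch,
so the only difference left to measure is the lookup versus the
multiply and shift. Note that the sketch does not apply the upper clip to
255 that build_dequant_tables() in the patch does.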

Regards
Maxim