[FFmpeg-devel] [PATCH] Mimic decoder

Michael Niedermayer michaelni
Sun Mar 16 03:46:54 CET 2008


On Sat, Mar 15, 2008 at 06:47:03PM -0300, Ramiro Polla wrote:
> Hello,
>
> [...]
>>>>> /*
>>>>>  * vlc_decode_block
>>>>>  *
>>>>>  * De-serialize (reconstruct) a variable length coded 8x8 block.
>>>>>  */
>>>>> static int vlc_decode_block(MimicContext *ctx, DCTELEM *block, int 
>>>>> num_coeffs)
>>>>> {
>>>>>     unsigned int pos;
>>>>>
>>>>>     memset(block, 0, 64 * sizeof(DCTELEM));
>>>> clear_blocks() in dsp
>>> clear_blocks() clears 6*64*sizeof(DCTELEM). On mimic it's only 
>>> 64*sizeof(DCTELEM). Is there another function that will clear only what I 
>>> need?
>> Hmm, not that i remember, anyway you can move the memset to after the idct
>> that way it will be in the cache when its cleared.
>
> Unless I misunderstood you, the blocks are on the stack, so there's no 
> reason to clear them after the IDCT. If it's ok I'll leave that memset 
> there.

Stacks and alignment and gcc ...


[...]
>
>>>> and
>>>> value= 1 - (1<<num_bits);
>>>> if(num_bits>1)
>>>>     value+= get_bits(num_bits-1);
>>>> if(get_bits1())
>>>>     value -= value;
>>>> block[ctx->scantable.permutated[pos]] = value;
>>>> Or if you think the table matters speedwise then make it static dont
>>>> duplicate it in every decoder instance.
>>>           ref   calc  static
>>> mimic.o : 51868 51304 52192
>>> avg time: 1.692 1.712 1.728
>>>
>>> That's very odd... Shouldn't the static table be at least as fast as the
>>> current code? 
>> Yes
>> If you want to debug it the asm output from gcc would be a start but i
>> doubt its worth the time spend.
>
> With a bigger sample, I got
>           ref   calc  static
> avg time: 6.688 6.932 6.744

And the standard deviations? Mean is only one side of the story ...


>
>>> Attached are the 2 patches I used to test. (lut_calc.diff and 
>>> lut_static.diff). Anyways, most FFmpeg stuff introduced a considerable 
>>> speedup over the original code, so I'm tempted to apply one of these two.
>>> Could I use the hardcoded tables ifdef there?
>> I dont see much sense in it. I mean the calc code doesnt need the table at
>> all and needs around 100 byte while the table needs 1024.
>
> If it's ok I'd prefer to keep the current code with vlcdec_lookup in 
> MimicContext.

You can keep it if you merge the *qscale into the table as its not
const then anymore ...


>
>> [...]
>>>         if(vlc == -1)
>>>             return 0;
>>>         if(!vlc) /* end-of-block code */
>>>             return 1;
>> maybe
>> if(vlc<=0)
>>     return !vlc;
>> would be faster (no iam not suggesting this if its not faster)
>
> It's also slower (1.7%), so I'll just leave it as is.

hmm the end-of-block check should be before the invalid check at least
because it will be more often true.
also as you already tried various, what about

if(vlc<=0)
    return vlc;

with the others adjusted accordingly

or

pos += vlc & (256+15);
if(pos >= 64)
    return 0;

(that is leaving the end of block check but merging the vlc==-1 check with
the pos >=64 check)
This one should be faster in theory of course gcc is unpredictable


[...]
>         /* FFmpeg's IDCT behaves somewhat different from the original code, so
>          * a factor of 4 was added to the input */
> 
>         coeff = ctx->vlcdec_lookup[num_bits][value];
>         if(pos<3)
>             coeff <<= 4;
>         else /* TODO Use >> 10 instead of / 1001 */
>             coeff = (coeff * qscale) / 1001;

Can you elaborate a little about these 1024 vs. 1001 and 
"different from the original" ?


[...]
> /**
>  * Internal helper-function used to initialize
>  * the lookup-table used by the VLC-decoder.
>  */
> static void initialize_vlcdec_lookup(int8_t lookup_tbl[8][128])
> {
>     int i, j;
> 
>     for(i = 1 ; i < 8 ; i++) {
>         int first = (1<<i)-1;
>         int last = 1<<(i-1);
>         int cur = first;

cur - first is redundant remove one please


[...]
>     ctx->picture.data[0] = NULL;
>     if(avctx->get_buffer(avctx, &ctx->picture)) {
>         av_log(avctx, AV_LOG_ERROR, "get_buffer() failed\n");
>         return -1;
>     }

reference=1 is missing before get_buffer()


[...]
> AVCodec mimic_decoder = {
>     "mimic",
>     CODEC_TYPE_VIDEO,
>     CODEC_ID_MIMIC,
>     sizeof(MimicContext),
>     mimic_decode_init,
>     NULL,
>     mimic_decode_end,
>     mimic_decode_frame
> };

CODEC_CAP_DR1 is missing


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080316/ad133b58/attachment.pgp>



More information about the ffmpeg-devel mailing list