[FFmpeg-devel] [PATCH] Chinese AVS encoder

Michael Niedermayer michaelni
Thu Jul 26 13:12:43 CEST 2007


Hi

On Thu, Jul 26, 2007 at 07:34:26AM +0200, Stefan Gehrer wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Michael Niedermayer wrote:
> > Hi
> > 
> > On Wed, Jul 25, 2007 at 07:50:17AM +0200, Stefan Gehrer wrote:
> > [...]
> >>> [...]
> >>>> +/**
> >>>> + * eliminate residual blocks that only have insignificant coefficients,
> >>>> + * inspired from x264 and JVT-B118
> >>>> + */
> >>>> +static inline int decimate_block(uint8_t *run, DCTELEM *level, int count) {
> >>>> +    static const uint8_t run_score[30] = {
> >>>> +        0,3,3,3,3,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0 };
> >>>> +    int i;
> >>>> +    int score = 0;
> >>>> +
> >>>> +    if(count>4)
> >>>> +        return 9;
> >>>> +    for(i=0;i<count;i++) {
> >>>> +        int abslevel = FFABS(level[i]);
> >>>> +        if(abslevel > 1)
> >>>> +            return 9;
> >>>> +        score += run_score[FFMIN(run[i],29)];
> >>>> +    }
> >>>> +    return score;
> >>>> +}
> >>> how much psnr/bitrate is gained by this? if none then please drop it
> >> I tested on the popular foreman (300 CIF frames). When encoding at
> >> around 800kbps, roughly 0.1dB are gained by this. When encoding at
> >> around 200kbps, this increases to a gain of around 0.15dB.
> >> So I would like to keep it.
> > 
> > ok, but i think there is something wrong elsewhere if there is such a
> > large gain from this ... 
> > maybe a simple change to the bias would help? or maybe there is something
> > else wrong with the quantization code ...
> > 
> 
> why? It seems clear to me that coding a macroblock as skipped is so much
> cheaper than coding type/cbp/mv/coefficients that it might be worth
> throwing some real information away. By real I mean coefficients that
> come out of a fully correct quantization code, though I am not claiming
> my code to be that.

you should just make RD optimal decissions (IIRC you do calculate the RD all
over the place anyway), if you do no heuristic like above will help


> 
> > 
> > [...]
> >>>> +    for(i=0;i<15;i++)
> >>>> +        if((ff_frame_rate_tab[i].den == frame_rate.den) &&
> >>>> +           (ff_frame_rate_tab[i].num == frame_rate.num))
> >>>> +            frame_rate_code = i;
> >>>> +    if(frame_rate_code < 0) {
> >>>> +        av_log(h->s.avctx, AV_LOG_ERROR, "unsupported framerate %d/%d\n",
> >>>> +               frame_rate.num, frame_rate.den);
> >>>> +        return -1;
> >>>> +    }
> >>>
> >>> [...]
> >>>> +        put_bits(&s->pb,16,0);
> >>>> +        put_bits(&s->pb,16,CAVS_START_CODE);
> >>> add a put_bits_long() to bitstrea.c, document the issue with 32bits and
> >>> fix vorbis_enc.c :)
> >> I looked a bit into this and I think a better way would be to fix
> >> put_bits() to support up to 32 bits instead of up to 31 bits.
> >> I am not so sure, but after a bit of checking in vorbis_enc.c it
> >> seems there is never any writing with more than 32 bits even
> >> though the local put_bits() function takes a 64bit argument.
> > 
> > argh, ive not noticed that vorbis is using a private put_bits
> > iam starting to hate vorbis_enc.c ...
> > 
> > 
> >> Attached a new encoder patch 
> > 
> >> and a proposed fix for put_bits().
> > 
> > the fix to put_bits is rejected due to a likely huge slowdown
> > 
> 
> It is in the else branch and only executed every 32 bits of bitstream
> writing, so I don't think the slowdown will be "huge". But if it's
> not acceptable I will think about adding a put_bits_long again.

the slowdown would affect all codecs, and i think it would be a significant
slowdown
its totally unacceptable just so you can replace
put_bits(16); put_bits(16);
by
put_bits(32)
in 1 or 2 codecs
and without any benchmarks ...


[...]
> > 
> > [...]
> >> +    h->dist[0] = (h->picture.poc - h->DPB[0].poc  + 512) % 512;
> >> +    h->dist[1] = (h->picture.poc - h->DPB[1].poc  + 512) % 512;
> > 
> > &511 ?
> > 
> 
> It is not speed critical, and then I find the modulo more readable.

its +% vs &, ( 2 vs. 1 operations) so the modulo likely does make the
binary by a byte or 2 larger ...


[...]
> > 
> > [...]
> >> +        h->levels[0] = av_mallocz(6*64*sizeof(DCTELEM));
> >> +        h->runs[0]   = av_mallocz(6*64);
> >> +        for(i=1;i<6;i++) {
> >> +            h->levels[i] = h->levels[i-1] + 64;
> >> +            h->runs[i]   = h->runs[i-1]   + 64;
> >> +        }
> > 
> > why is this allocated with malloc instead of being part of the context?
> 
> It keeps the context smaller when only the decoder is running.

hmm, ok, still its not ideal, as this requires an extra pointer
dereference to access

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Observe your enemies, for they first find out your faults. -- Antisthenes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070726/b3291657/attachment.pgp>



More information about the ffmpeg-devel mailing list