[FFmpeg-devel] [PATCH] Chinese AVS encoder
Stefan Gehrer
stefan.gehrer
Thu Jul 26 22:34:15 CEST 2007
Hi
Michael Niedermayer wrote:
>>>>> [...]
>>>>>> +/**
>>>>>> + * eliminate residual blocks that only have insignificant coefficients,
>>>>>> + * inspired from x264 and JVT-B118
>>>>>> + */
>>>>>> +static inline int decimate_block(uint8_t *run, DCTELEM *level, int count) {
>>>>>> + static const uint8_t run_score[30] = {
>>>>>> + 0,3,3,3,3,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0 };
>>>>>> + int i;
>>>>>> + int score = 0;
>>>>>> +
>>>>>> + if(count>4)
>>>>>> + return 9;
>>>>>> + for(i=0;i<count;i++) {
>>>>>> + int abslevel = FFABS(level[i]);
>>>>>> + if(abslevel > 1)
>>>>>> + return 9;
>>>>>> + score += run_score[FFMIN(run[i],29)];
>>>>>> + }
>>>>>> + return score;
>>>>>> +}
>>>>> how much PSNR/bitrate is gained by this? if none then please drop it
>>>> I tested on the popular foreman sequence (300 CIF frames). When encoding
>>>> at around 800 kbps, roughly 0.1 dB is gained by this; when encoding at
>>>> around 200 kbps, the gain increases to around 0.15 dB.
>>>> So I would like to keep it.
>>> ok, but i think there is something wrong elsewhere if there is such a
>>> large gain from this ...
>>> maybe a simple change to the bias would help? or maybe there is something
>>> else wrong with the quantization code ...
>>>
>> why? It seems clear to me that coding a macroblock as skipped is so much
>> cheaper than coding type/cbp/mv/coefficients that it might be worth
>> throwing some real information away. By real I mean coefficients that
>> come out of a fully correct quantization code, though I am not claiming
>> my code to be that.
>
> you should just make RD-optimal decisions (IIRC you do calculate the RD all
> over the place anyway); if you do, no heuristic like the above will help
Okay, I think I see what you mean. Currently I use RD decisions to
select intra prediction modes and for motion estimation, but for
the macroblock mode decision there are some heuristics, roughly like
this:
    if (only insignificant coefficients)
        encode mb as skipped
    else
        RD = P motion estimation
        if (RD < threshold)
            encode mb as p
        else
            encode mb as intra
By RD-optimal decisions, do you mean that I should encode every
macroblock in all three modes (skipped/p/intra) and select the
best of the three (see the sketch below), or rather that I should
try one mode, take it if its RD cost is below a certain threshold,
and otherwise move on to the next mode?
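For reference, here is a minimal sketch of what I understand by the
first variant, i.e. fully exhaustive mode selection. AVSEncContext,
mb_rd_skip(), mb_rd_p() and mb_rd_intra() are hypothetical names, the
latter three standing for whatever computes the RD cost of encoding
the current macroblock in the respective mode:

enum { MB_SKIP, MB_P, MB_INTRA };

/* sketch only: try all three macroblock modes and keep the cheapest */
static int select_mb_mode(AVSEncContext *h, int *best_rd) {
    int rd_skip  = mb_rd_skip(h);
    int rd_p     = mb_rd_p(h);      /* includes motion estimation */
    int rd_intra = mb_rd_intra(h);
    int mode = MB_SKIP;
    int rd   = rd_skip;

    if (rd_p < rd) {
        mode = MB_P;
        rd = rd_p;
    }
    if (rd_intra < rd) {
        mode = MB_INTRA;
        rd = rd_intra;
    }
    *best_rd = rd;
    return mode;
}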
>
>>> [...]
>>>>>> + for(i=0;i<15;i++)
>>>>>> + if((ff_frame_rate_tab[i].den == frame_rate.den) &&
>>>>>> + (ff_frame_rate_tab[i].num == frame_rate.num))
>>>>>> + frame_rate_code = i;
>>>>>> + if(frame_rate_code < 0) {
>>>>>> + av_log(h->s.avctx, AV_LOG_ERROR, "unsupported framerate %d/%d\n",
>>>>>> + frame_rate.num, frame_rate.den);
>>>>>> + return -1;
>>>>>> + }
>>>>> [...]
>>>>>> + put_bits(&s->pb,16,0);
>>>>>> + put_bits(&s->pb,16,CAVS_START_CODE);
>>>>> add a put_bits_long() to bitstream.c, document the issue with 32 bits and
>>>>> fix vorbis_enc.c :)
>>>> I looked a bit into this and I think a better way would be to fix
>>>> put_bits() to support up to 32 bits instead of up to 31 bits.
>>>> I am not entirely sure, but after a bit of checking in vorbis_enc.c it
>>>> seems nothing is ever written with more than 32 bits at once, even
>>>> though the local put_bits() function takes a 64-bit argument.
>>> argh, i've not noticed that vorbis is using a private put_bits
>>> i am starting to hate vorbis_enc.c ...
>>>
>>>
>>>> Attached is a new encoder patch
>>>> and a proposed fix for put_bits().
>>> the fix to put_bits is rejected due to a likely huge slowdown
>>>
>> It is in the else branch and only executed every 32 bits of bitstream
>> writing, so I don't think the slowdown will be "huge". But if it's
>> not acceptable I will think about adding a put_bits_long again.
>
> the slowdown would affect all codecs, and i think it would be a significant
> slowdown
> it's totally unacceptable just so you can replace
> put_bits(16); put_bits(16);
> by
> put_bits(32)
> in 1 or 2 codecs
> and without any benchmarks ...
>
So the problem is:
h263.c has put_bits(16); put_bits(16);
cavsenc.c has put_bits(16); put_bits(16);
vorbis_enc.c has its own put_bits implementation
Solutions:
keep as is (you complained)
fix as I proposed (too slow)
create a proper put_bits_long() function (seems overkill)
what about a silly and untested

void put_bits_long(PutBitContext *s, int n, uint64_t value) {
    int rem = n >> 1;  /* each half is at most 31 bits for n <= 62 */
    put_bits(s, n - rem, value >> rem);
    put_bits(s, rem, value & ((UINT64_C(1) << rem) - 1));  /* 64-bit constant so the mask cannot overflow */
}

to encode 0..62 bits ?
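As an (equally untested) usage example, the two 16-bit writes from the
patch above could then be collapsed into one call; the combined value is
just the start code zero-extended to 32 bits:

    /* hypothetical replacement for the put_bits(16); put_bits(16); pair */
    put_bits_long(&s->pb, 32, ((uint64_t)0 << 16) | CAVS_START_CODE);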
>
> [...]
>>> [...]
>>>> + h->dist[0] = (h->picture.poc - h->DPB[0].poc + 512) % 512;
>>>> + h->dist[1] = (h->picture.poc - h->DPB[1].poc + 512) % 512;
>>> &511 ?
>>>
>> It is not speed critical, and I find the modulo more readable.
>
> it's + and % vs. & (2 vs. 1 operations), so the modulo likely makes the
> binary a byte or 2 larger ...
>
>
> [...]
>>> [...]
>>>> + h->levels[0] = av_mallocz(6*64*sizeof(DCTELEM));
>>>> + h->runs[0] = av_mallocz(6*64);
>>>> + for(i=1;i<6;i++) {
>>>> + h->levels[i] = h->levels[i-1] + 64;
>>>> + h->runs[i] = h->runs[i-1] + 64;
>>>> + }
>>> why is this allocated with malloc instead of being part of the context?
>> It keeps the context smaller when only the decoder is running.
>
> hmm, ok, still it's not ideal, as this requires an extra pointer
> dereference to access
Ok, I will change this in the next round.
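Roughly what I have in mind, as a sketch only (CAVSEncContext is a
placeholder name for an encoder-only context embedding the shared
AVSContext):

/* scratch buffers kept directly in the encoder context, so there is
 * no av_mallocz() and no extra pointer dereference */
typedef struct CAVSEncContext {
    AVSContext common;       /* shared decoder/encoder state  */
    DCTELEM levels[6][64];   /* per-block coefficient levels  */
    uint8_t runs[6][64];     /* per-block zero runs           */
} CAVSEncContext;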
Stefan