[FFmpeg-devel] FW: [PATCH] 10-bit DNxHD decoding and encoding

Baptiste Coudurier baptiste.coudurier at gmail.com
Wed Mar 30 17:44:15 CEST 2011


Hi Joseph,

Can you please send an updated patch as well ?

On 03/30/2011 03:11 AM, Joseph Artsimovich wrote:
> Baptiste, could you give feedback on my previous message in this thread (reproduced below)?
> Others are welcome to contribute to this discussion as well.
> 
> 
> Joseph Artsimovich
> Software Developer
> MirriAd Ltd
> 
> 
> 
> -----Original Message-----
> From: ffmpeg-devel-bounces at ffmpeg.org [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of Joseph Artsimovich
> Sent: Thursday, March 24, 2011 9:51 AM
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH] 10-bit DNxHD decoding and encoding
> 
>>> -static const uint16_t dnxhd_1238_run_codes[62] = {
>>> +static const uint16_t dnxhd_1235_1238_1241_run_codes[62] = {
>>>      0, 4, 10, 11, 24, 25, 26, 27, 56, 57, 58, 59, 120, 242, 486, 
>>> 487, 488, 489, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 
>>> 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 
>>> 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 
>>> 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023,
>>
>> Please rename in a separate patch.
> Done.  The patch is attached.
> 
>>> @@ -40,6 +43,7 @@ typedef struct {
>>>      VLC ac_vlc, dc_vlc, run_vlc;
>>>      int last_dc[3];
>>>      DSPContext dsp;
>>> +    int dsp_bits; // 8, 10 or 0 if not initialized.
>>
>> bit_depth is a better name IMHO.
> Well, this field was meant to indicate how the DSPContext was initialized (if at all).
> For other purposes we have cid_table->bit_depth

Well, this is only one function. I don't think reconfi

>>> +static av_always_inline void dnxhd_10bit_get_pixels_8x8(DCTELEM
>>> +*restrict block, const uint16_t *pixels, int line_size) {
>>> +    for (int i = 0; i < 8; ++i) {
>>> +        for (int j = 0; j < 8; ++j) {
>>> +            block[j] = (DCTELEM)floor((float)pixels[j] * (1023.0f /
>>> +65535.0f) + 0.5f);
>>
>> Btw, why are you using floats here ?
> Hoping a compiler might vectorize this code.

We prefer using ints here, to avoid rounding difference on different archs.

>>> -        dnxhd_get_blocks(ctx, mb_x, mb_y);
>>> +        if (ctx->cid_table->bit_depth == 10) {
>>> +            dnxhd_10bit_get_blocks(ctx, mb_x, mb_y);
>>> +        } else {
>>> +            dnxhd_get_blocks(ctx, mb_x, mb_y);
>>> +        }
>>
>> I think a function pointer in context would be better.
> OK.
> 
> 
>>>          for (i = 0; i < 8; i++) {
>>>              DCTELEM *src_block = ctx->blocks[i]; @@ -439,6 +549,8 
>>> @@ static int dnxhd_calc_bits_thread(AVCodecContext *avctx, void 
>>> *arg, int
>> jobnr, i
>>>              diff = block[0] - ctx->m.last_dc[n];
>>>              if (diff < 0) nbits = av_log2_16bit(-2*diff);
>>>              else          nbits = av_log2_16bit( 2*diff);
>>> +
>>> +            assert(nbits < ctx->cid_table->bit_depth + 4);
>>
>> assert would kill the program.
> That would indicate a serious bug.  The assert merely checks we are not reading past the end of a table.

It's a problem, but can this actually happen ?

> BTW, I've come by the following comment in mpegvideo_enc.c:
> //non c quantize code returns incorrect block_last_index FIXME If still true, that could trigger this assert.
> 
> 
>>> +// To get DCT coefficients multiplied by 4.
>>
>> Shouldn't we avoid returning scaled output, could be usefull for 11,12 
>> bits, etc..
> Well, I guess it's done to get more precise quantization.  I would be fine with having an unscaled version though.  After all, the range of DCT coefficients is already 3 bits larger than the range of input samples.
> 
> 
>>> [...]
>>>
>>> diff --git a/libavcodec/mpegvideo.h b/libavcodec/mpegvideo.h index 
>>> 3e447ab..eb2352f 100644
>>> --- a/libavcodec/mpegvideo.h
>>> +++ b/libavcodec/mpegvideo.h
>>> @@ -49,7 +49,19 @@ enum OutputFormat {  #define MPEG_BUF_SIZE (16
>> *
>>> 1024)
>>>
>>>  #define QMAT_SHIFT_MMX 16
>>> -#define QMAT_SHIFT 22
>>> +
>>> +/*
>>> + * QMAT_SHIFT should be as high as possible, but not so high to 
>>> +cause
>>> + * overflow in the following expression:
>>> + * scaled_dct[i] * ((p / s) * ((1 << QMAT_SHIFT) / (qscale *
>>> +weight_table[i]))))
>>> + * where s is a scale factor applied to DCT coefficients while
>>> + * p and weight_table are codec-specific.
>>> + * For example, the worst case for 10-bit DNxHD would be:
>>> + * p == 32, s == 4, qscale == 1, weight_table[i] == 32, 
>>> +scaled_dct[i] == (1 << 13) * 4
>>> + * which gives (1 << 15) * (((1 << QMAT_SHIFT) >> 5) << 3)
>>> + * which gives 18 as the largest safe QMAT_SHIFT.
>>> + */
>>> +#define QMAT_SHIFT 18
>>>
>>>  #define MAX_FCODE 7
>>>  #define MAX_MV 2048
>>> diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c 
>>> index 9255fa8..3f46033 100644
>>> --- a/libavcodec/mpegvideo_enc.c
>>> +++ b/libavcodec/mpegvideo_enc.c
>>> @@ -3725,9 +3725,14 @@ int dct_quantize_c(MpegEncContext *s,
>>>              else
>>>                  q = s->c_dc_scale;
>>>              q = q << 3;
>>> -        } else
>>> +        } else {
>>>              /* For AIC we skip quant/dequant of INTRADC */
>>> -            q = 1 << 3;
>>> +            if (s->dsp.fdct == ff_faandct_10bit_safe) {
>>> +                q = 1 << 2;
>>> +            } else {
>>> +                q = 1 << 3; // The post-scale applied to a DCT output.
>>> +            }
>>> +        }
>>>
>>>          /* note: block[0] is assumed to be positive */
>>>          block[0] = (block[0] + (q >> 1)) / q;
>>
>> dct_quantize_mmx/sse2/etc .. is used if present, see QMAX_SHIFT_MMX.
> I specifically request FF_DCT_10BIT, so I won't get any of those.

Ultimately you can only replace the dct in dct_quantize.
Besides, this modification makes a change that is specific to dnxhd in
generic mpeg code, which I'm not a big fan of.

>> I think it's time to separate the dnxhd quantization fonction from the 
>> mpeg one and change QMAT_SHIFT only there.
>> Remove MpegEncContext from DNxHDEncContext, add dnxhd_template_mmx.c 
>> in
>> x86 and clean it up (intra only, out format, aic only etc..)
> I believe we only use dct_quantize() from MpegEncContext, and there exists quite a few assembly versions of that.  I don't think it's a good idea to duplicate that code.

The code has overflow problems already due to the shift.

-- 
Baptiste COUDURIER
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
FFmpeg maintainer                                  http://www.ffmpeg.org


More information about the ffmpeg-devel mailing list