[FFmpeg-devel] [PATCH] MLP/TrueHD Decoder - 2nd try

Tue Jul 1 02:55:39 CEST 2008

Hello Michael,

Michael Niedermayer wrote:
> On Sun, Jun 29, 2008 at 12:41:17PM +0100, Ramiro Polla wrote:
>> Hello,
>>
> [...]
>> mlpdec.c is Ian's MLP/TrueHD decoder with a few modifications as can be 
>> seen in the SoC tree.
>>
> 
>> Valgrind doesn't complain about invalid writes on a bunch of trashed files 
>> and some manually edited files that pass the checks as well.
> 
> I assume that you disabled all checksum checks for these tests?

Oops. A bunch of more tests run with checksums disabled. Still no 
problem seen.

> [...]
>> static const char* sample_message =
>>     "Please file a bug report following the instructions at "
>>     "http://ffmpeg.mplayerhq.hu/bugreports.html and include "
>>     "a sample of this file.";
> 
> Dont we already have something like that somewhere? Or was that another
> unfinished patch, either way above can be shared wth other codecs ...

IIRC it's in Robert's AAC patch. I haven't checked the archives to make 
sure though. Anyways I think it's not in SVN yet.

>> typedef struct MLPDecodeContext {
>>     AVCodecContext *avctx;
>>
>>     //! Do we have valid stream data read from a major sync block?
>>     uint8_t     params_valid;
>>
>>     //! Number of substreams contained within this stream
>>     uint8_t     num_substreams;
>>
>>     //! Index of the last substream to decode - further substreams are skipped
>>     uint8_t     max_decoded_substream;
>>
>>     //! Number of PCM samples contained in each frame
>>     int         access_unit_size;
>>     //! Next power of two above the number of samples in each frame
>>     int         access_unit_size_pow2;
>>
>>     //! For each substream, whether a restart header has been read
>>     uint8_t     restart_seen[MAX_SUBSTREAMS];
> 
> I think it would be more readable if the comments where aligned to the
> right of the variables

And have really long lines? I usually don't like these comments anyways, 
so I can just change it to what people on the list find best.

>>     //@{
>>     /** Restart header data */
>>     //! The sync word used at the start of the last restart header
>>     uint16_t    restart_sync_word[MAX_SUBSTREAMS];
>>
>>     //! The index of the first channel coded in this substream
>>     uint8_t     min_channel[MAX_SUBSTREAMS];
>>     //! The index of the last channel coded in this substream
>>     uint8_t     max_channel[MAX_SUBSTREAMS];
>>     //! The number of channels input into the rematrix stage
>>     uint8_t     max_matrix_channel[MAX_SUBSTREAMS];
>>
>>     //! The left shift applied to random noise in 0x31ea substreams
>>     uint8_t     noise_shift[MAX_SUBSTREAMS];
>>     //! The current seed value for the pseudorandom noise generator(s)
>>     uint32_t    noisegen_seed[MAX_SUBSTREAMS];
>>
>>     //! Does this substream contain extra info to check the size of VLC blocks?
>>     uint8_t     data_check_present[MAX_SUBSTREAMS];
>>
>>     //! Bitmask of which parameter sets are conveyed in a decoding parameter block
>>     uint8_t     param_presence_flags[MAX_SUBSTREAMS];
>> #define PARAM_BLOCKSIZE     (1 << 7)
>> #define PARAM_MATRIX        (1 << 6)
>> #define PARAM_OUTSHIFT      (1 << 5)
>> #define PARAM_QUANTSTEP     (1 << 4)
>> #define PARAM_FIR           (1 << 3)
>> #define PARAM_IIR           (1 << 2)
>> #define PARAM_HUFFOFFSET    (1 << 1)
>>     //@}
>>
>>     //@{
>>     /** Matrix data */
>>
>>     //! Number of matrices to be applied
>>     uint8_t     num_primitive_matrices[MAX_SUBSTREAMS];
>>
>>     //! Output channel of matrix
>>     uint8_t     matrix_ch[MAX_SUBSTREAMS][MAX_MATRICES];
>>
>>     //! Whether the LSBs of the matrix output are encoded in the bitstream
>>     uint8_t     lsb_bypass[MAX_SUBSTREAMS][MAX_MATRICES];
>>     //! Matrix coefficients, stored as 2.14 fixed point
>>     int32_t     matrix_coeff[MAX_SUBSTREAMS][MAX_MATRICES][MAX_CHANNELS+2];
>>     //! Left shift to apply to noise values in 0x31eb substreams
>>     uint8_t     matrix_noise_shift[MAX_SUBSTREAMS][MAX_MATRICES];
>>     //@}
>>
>>     //! Left shift to apply to huffman-decoded residuals
>>     uint8_t     quant_step_size[MAX_SUBSTREAMS][MAX_CHANNELS];
>>
>>     //! Number of PCM samples in current audio block
>>     uint16_t    blocksize[MAX_SUBSTREAMS];
>>     //! Number of PCM samples decoded so far in this frame
>>     uint16_t    blockpos[MAX_SUBSTREAMS];
>>
>>     //! Left shift to apply to decoded PCM values to get final 24-bit output
>>     int8_t      output_shift[MAX_SUBSTREAMS][MAX_CHANNELS];
>>
>>     //@{
>>     /** Filter data. */
>> #define FIR 0
>> #define IIR 1
>>     //! Number of taps in filter
>>     uint8_t     filter_order[MAX_CHANNELS][2];
>>     //! Right shift to apply to output of filter
>>     uint8_t     filter_coeff_q[MAX_CHANNELS][2];
>>
>>     uint8_t     filter_index[MAX_CHANNELS][2];
>>
>>     int32_t     filter_coeff[MAX_CHANNELS][2][MAX_FILTER_ORDER];
>>     int32_t     filter_state[MAX_CHANNELS][2][MAX_FILTER_ORDER];
>>     //@}
>>
>>     //@{
>>     /** Sample data coding infomation */
>>     //! Offset to apply to residual values
>>     int16_t     huff_offset[MAX_CHANNELS];
>>     //! Sign/rounding corrected version of huff_offset
>>     int32_t     sign_huff_offset[MAX_CHANNELS];
>>     //! Which VLC codebook to use to read residuals
>>     uint8_t     codebook[MAX_CHANNELS];
>>     //! Size of residual suffix not encoded using VLC
>>     uint8_t     huff_lsbs[MAX_CHANNELS];
>>     //@}
>>
>>     //! Running XOR of all output samples
>>     int32_t     lossless_check_data[MAX_SUBSTREAMS];
>>
>>     int8_t      noise_buffer[MAX_BLOCKSIZE_POW2];
>>     int8_t      bypassed_lsbs[MAX_BLOCKSIZE][MAX_CHANNELS];
>>     int32_t     sample_buffer[MAX_BLOCKSIZE][MAX_CHANNELS+2];
>> } MLPDecodeContext;
> 
> There are so many blah[substr] all over the place, maybe putting these
> in a SubStream struct would simplify the code?

What kind of simplifications did you have in mind? The result of a 
half-job looked uglier to me. Attached substr_struct.diff patch is what 
I started doing.

> [...]
>> /** Initialize the decoder. */
>>
>> static int mlp_decode_init(AVCodecContext *avctx)
> 
> you dont really need a mlp_ before static functions in mlp*.c

I don't think there's any harm in keeping it. Lots of other files do the 
same.

> and the doxy comment is close to useless

Removed.

> [...]
>>     m->min_channel       [substr] = get_bits(gbp, 4);
>>     m->max_channel       [substr] = get_bits(gbp, 4);
>>     m->max_matrix_channel[substr] = get_bits(gbp, 4);
>>
>>     if (m->min_channel[substr] > m->max_channel[substr]) {
>>         av_log(m->avctx, AV_LOG_ERROR,
>>                "Substream min channel cannot be greater than max channel.\n");
>>         m->min_channel[substr] = m->max_channel[substr]
>>             = m->max_matrix_channel[substr] = 0;
>>         return -1;
>>     }
> 
> i wonder if it would be cleaner to read into local vars and assign to the
> structs after validation but then maybe not i dunno

I removed the setting to zero since it's no longer needed and it doesn't 
look unclean now IMO.

> [...]
>> static int read_filter_params(MLPDecodeContext *m, GetBitContext *gbp,
>>                               unsigned int channel, unsigned int filter)
>> {
>>     const char fchar = filter ? 'I' : 'F';
>>     int i, order;
>>
>>     // filter is 0 for FIR, 1 for IIR
>>     assert(filter < 2);
>>
>>     order = get_bits(gbp, 4);
>>     if (order > MAX_FILTER_ORDER) {
>>         av_log(m->avctx, AV_LOG_ERROR,
>>                "%cIR filter order %d is greater than maximum %d\n",
>>                fchar, order, MAX_FILTER_ORDER);
>>         return -1;
>>     }
>>     m->filter_order[channel][filter] = order;
>>
>>     if (order > 0) {
>>         int coeff_bits, coeff_shift;
>>
>>         m->filter_coeff_q[channel][filter] = get_bits(gbp, 4);
>>
> 
>>         coeff_bits = get_bits(gbp, 5);
>>         coeff_shift = get_bits(gbp, 3);
> 
> vertical align

I knew I had missed one...

> [...]
>>                 m->matrix_ch[substr][mat] = get_bits(gbp, 4);
>>                 frac_bits = get_bits(gbp, 4);
>>                 m->lsb_bypass[substr][mat] = get_bits1(gbp);
>>
>>                 if (m->matrix_ch[substr][mat] > m->max_channel[substr]) {
>>                     av_log(m->avctx, AV_LOG_ERROR,
>>                            "Invalid channel %d specified as output from matrix\n",
>>                            m->matrix_ch[substr][mat]);
>>                     m->matrix_ch[substr][mat] = 0;
>>                     return -1;
>>                 }
> 
> is the = 0 still needed? MAX_CHANNELS is 16 and sample_buffer is large enough
> apparently for that.

It's not needed anymore. That and another one removed.

> [...]
>>     if (m->param_presence_flags[substr] & PARAM_OUTSHIFT)
>>         if (get_bits1(gbp)) {
>>             for (ch = 0; ch <= m->max_matrix_channel[substr]; ch++) {
>>                 m->output_shift[substr][ch] = get_bits(gbp, 4);
>>                 dprintf(m->avctx, "output shift[%d] = %d\n",
>>                         ch, m->output_shift[substr][ch]);
>>                 /* TODO: validate */
>>             }
>>         }
>>
>>     if (m->param_presence_flags[substr] & PARAM_QUANTSTEP)
>>         if (get_bits1(gbp))
>>             for (ch = 0; ch <= m->max_channel[substr]; ch++) {
>>                 m->quant_step_size[substr][ch] = get_bits(gbp, 4);
>>                 /* TODO: validate */
>>
>>                 calculate_sign_huff(m, substr, ch);
>>             }
> 
> inconsistant {}

Made consistent.

> [...]
>> /** Generate a PCM sample using the prediction filters and a residual value
>>  *  read from the data stream, and update the filter state.
>>  */
>>
>> static int filter_sample(MLPDecodeContext *m, unsigned int substr,
>>                          unsigned int channel, int32_t residual)
>> {
>>     unsigned int i, j, index;
>>     int64_t accum = 0;
>>     int32_t result;
>>
>>     /* TODO: Move this code to DSPContext? */
>>
>> #define INDEX(channel, order, pos) \
>>     ((m->filter_index[channel][order] + (pos)) & (MAX_FILTER_ORDER - 1))
>>
>>     for (j = 0; j < 2; j++)
>>         for (i = 0; i < m->filter_order[channel][j]; i++)
>>             accum += (int64_t)m->filter_state[channel][j][INDEX(channel,j,i)] *
>>                      m->filter_coeff[channel][j][i];
>>
>>     accum = accum >> m->filter_coeff_q[channel][FIR];
>>     result = (accum + residual)
>>                 & ~((1 << m->quant_step_size[substr][channel]) - 1);
>>
>>     index = INDEX(channel, FIR, -1);
>>     m->filter_state[channel][FIR][index] = result;
>>     m->filter_index[channel][FIR] = index;
>>
>>     index = INDEX(channel, IIR, -1);
>>     m->filter_state[channel][IIR][index] = result - accum;
>>     m->filter_index[channel][IIR] = index;
>>
>> #undef INDEX
>>
>>     return result;
>> }
> 
> Why the complicated indexing?
> how big would filter_state be if it where a normal flat array? 160 entries?

It would need
sizeof(int32_t) * MAX_BLOCKSIZE * 2 * MAX_CHANNELS = 20480 = 20k
bytes more in the context, and attached filter_index_flat_array.diff 
patch. I assume that's not ok =)

>> /** Read a block of PCM residual (or actual if no filtering active) data.
>>  */
>>
>> static int read_block_data(MLPDecodeContext *m, GetBitContext *gbp,
>>                            unsigned int substr)
>> {
>>     unsigned int i, mat, ch, expected_stream_pos = 0;
>>
> 
>>     if (m->data_check_present[substr])
>>         expected_stream_pos = get_bits_count(gbp) + get_bits(gbp, 16);
>>         /* UNTESTED - find an example stream */
> 
> tell the user to upload one if above is true

Done.

>>     if (m->blockpos[substr] + m->blocksize[substr] > m->access_unit_size) {
>>         av_log(m->avctx, AV_LOG_ERROR, "Too many audio samples in frame\n");
>>         return -1;
>>     }
>>
>>     memset(&m->bypassed_lsbs[m->blockpos[substr]][0], 0,
>>            m->blocksize[substr] * MAX_CHANNELS);
> 
> instead of MAX_CHANNELS a sizeof would be nicer

Changed.

>>     for (i = 0; i < m->blocksize[substr]; i++) {
>>         for (mat = 0; mat < m->num_primitive_matrices[substr]; mat++)
>>             if (m->lsb_bypass[substr][mat])
>>                 m->bypassed_lsbs[i + m->blockpos[substr]][mat] = get_bits1(gbp);
>>
>>         for (ch = m->min_channel[substr]; ch <= m->max_channel[substr]; ch++) {
>>             int32_t sample;
>>
> 
>>             if (read_huff(m, gbp, substr, ch, &sample) < 0)
>>                 return -1;
> 
> please use the return value instead of passing pointers to it

sample may be positive or negative, so a simple < check won't help. But 
sample must not be bigger than 24 bits.

Is attached read_huff_return_bad_code.diff patch ok?

> [...]
>>         for (i = 0; i < m->blockpos[substr]; i++) {
>>             int64_t accum = 0;
>>             for (src_ch = 0; src_ch <= maxchan; src_ch++) {
>>                 accum += (int64_t)m->sample_buffer[i][src_ch]
>>                                   * m->matrix_coeff[substr][mat][src_ch];
>>             }
>>             if (m->matrix_noise_shift[substr][mat]) {
>>                 uint32_t index = m->num_primitive_matrices[substr] - mat;
>>                 index = (i * (index * 2 + 1) + index) & (m->access_unit_size_pow2 - 1);
>>                 accum += m->noise_buffer[index] << (m->matrix_noise_shift[substr][mat] + 7);
>>             }
>>             m->sample_buffer[i][dest_ch] = ((accum >> 14) & ~((1 << m->quant_step_size[substr][dest_ch]) - 1))
>>                                              + m->bypassed_lsbs[i][mat];
>>         }
> 
> following may be faster ...
> 
> index = m->num_primitive_matrices[substr] - mat;
> index2= 2*index+1;
> for (i = 0; i < m->blockpos[substr]; i++) {
> ...
>     if (m->matrix_noise_shift[substr][mat]) {
>         index &= m->access_unit_size_pow2 - 1;
>         accum += m->noise_buffer[index] << (m->matrix_noise_shift[substr][mat] + 7);
>         index += index2;
>     }

I don't have any samples that touch this code. I'm still waiting for Ian 
to send me more samples (and I'm searching for some more also).

Do you prefer if I go ahead and change it anyways?

> [...]
>> static int read_access_unit(AVCodecContext *avctx, void* data, int *data_size,
>>                             const uint8_t *buf, int buf_size)
>> {
>>     MLPDecodeContext *m = avctx->priv_data;
>>     GetBitContext gb;
>>     unsigned int length, substr;
>>     unsigned int substream_start;
>>     unsigned int header_size = 0;
>>     uint8_t substream_parity_present[MAX_SUBSTREAMS];
>>     uint16_t substream_data_len[MAX_SUBSTREAMS];
>>     uint8_t parity_bits = 0;
>>
>>     if (buf_size < 4)
>>         return 0;
>>
>>     length = (AV_RB16(buf) & 0xfff) * 2;
>>
>>     if (length > buf_size)
>>         return -1;
>>
>>     PARITY_2BYTES;
>>     PARITY_2BYTES;
>>
>>     init_get_bits(&gb, buf, (length - 4) * 8);
>>
>>     if (show_bits_long(&gb, 31) == (0xf8726fba >> 1)) {
>>         dprintf(m->avctx, "Found major sync\n");
>>         if (read_major_sync(m, &gb) < 0)
>>             goto error;
>>         header_size += 28;
>>         buf += 28;
>>     }
>>
>>     if (!m->params_valid) {
>>         av_log(m->avctx, AV_LOG_WARNING,
>>                "Stream parameters not seen; skipping frame\n");
>>         *data_size = 0;
>>         return length;
>>     }
>>
>>     substream_start = 0;
>>
>>     for (substr = 0; substr < m->num_substreams; substr++) {
>>         int extraword_present, checkdata_present, end;
>>
>>         extraword_present = get_bits1(&gb);
>>         skip_bits1(&gb);
>>         checkdata_present = get_bits1(&gb);
>>         skip_bits1(&gb);
>>
>>         end = get_bits(&gb, 12) * 2;
>>
>>         PARITY_2BYTES;
> 
> isnt that doing the same as calculate_parity() ? if so why is it using these
> MACROS spread over the whole code ...

Cleaned-up a bit using calculate_parity().

>>         if (extraword_present) {
>>             skip_bits(&gb, 16);
>>             PARITY_2BYTES;
>>         }
>>
>>         if (end + header_size > length) {
>>             av_log(m->avctx, AV_LOG_ERROR,
>>                    "Substream %d data indicated length goes off end of packet.\n",
>>                    substr);
>>             end = length - header_size;
>>         }
>>
>>         if (end < substream_start) {
>>             av_log(avctx, AV_LOG_ERROR,
>>                    "Substream %d data indicated end offset "
>>                    "is before calculated start offset.\n",
>>                    substr);
>>             goto error;
>>         }
>>
>>         if (substr > m->max_decoded_substream)
>>             continue;
>>
>>         substream_parity_present[substr] = checkdata_present;
>>         substream_data_len[substr] = end - substream_start;
>>         substream_start = end;
>>     }
>>
>>     if ((((parity_bits >> 4) ^ parity_bits) & 0xF) != 0xF) {
>>         av_log(avctx, AV_LOG_ERROR, "Parity check failed.\n");
>>         goto error;
>>     }
>>
>>     for (substr = 0; substr <= m->max_decoded_substream; substr++) {
>>         init_get_bits(&gb, buf, substream_data_len[substr] * 8);
>>
>>         m->blockpos[substr] = 0;
>>         do {
>>             if (get_bits1(&gb)) {
>>                 if (get_bits1(&gb)) {
>>                     /* A restart header should be present */
>>                     if (read_restart_header(m, &gb, buf, substr) < 0)
>>                         goto error;
>>                     m->restart_seen[substr] = 1;
>>                 }
>>
>>                 if (!m->restart_seen[substr]) {
>>                     av_log(m->avctx, AV_LOG_ERROR,
>>                            "No restart header present in substream %d.\n",
>>                            substr);
>>                     goto error;
>>                 }
>>
>>                 if (read_decoding_params(m, &gb, substr) < 0)
>>                     goto error;
> 
> Does an error in one substream affect the others? if not these hard errors
> is not ideal.

Now it tries the next substream if one fails.

Ramiro Polla
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mlpdec.c
Type: text/x-csrc
Size: 39622 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080701/673fb98e/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filter_index_flat_array.diff
Type: text/x-diff
Size: 3302 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080701/673fb98e/attachment.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: read_huff_return_bad_code.diff
Type: text/x-diff
Size: 1715 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080701/673fb98e/attachment-0001.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: substr_struct.diff
Type: text/x-diff
Size: 22076 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080701/673fb98e/attachment-0002.diff>