[FFmpeg-devel] [PATCH] E-AC-3 spectral extension

Sat Dec 6 16:35:08 CET 2008

Michael Niedermayer wrote:
> On Fri, Dec 05, 2008 at 08:28:51PM -0500, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>> On Thu, Dec 04, 2008 at 05:21:15PM -0500, Justin Ruggles wrote:
>>>> Michael Niedermayer wrote:
>>>>> On Wed, Dec 03, 2008 at 10:41:59PM -0500, Justin Ruggles wrote:
>>>>>> Michael Niedermayer wrote:
>>>>>>> On Sat, Nov 15, 2008 at 07:10:53PM -0500, Justin Ruggles wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Here is an updated version of E-AC-3 spectral extension support.  I have
>>>>>>>> addressed the issues mentioned in Michael's review in ffmpeg-cvslog.
>>>>>>>>
>>>>>>>> - should not be expoitable now
>>>>>>>> - uses smaller int types for context arrays
>>>>>>>> - got rid of unused variables
>>>>>>>> - the code to determine copy_sizes and wrapflag is 70% faster
>>>>>>>> - merged the 2 loops for notch filters
>>>>>>>> - moved multiplies outside the loop for signal & noise blending
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Justin
>>>>> [...]
>>>>>>>> @@ -818,15 +820,92 @@
>>>>>>>>  
>>>>>>>>      /* spectral extension strategy */
>>>>>>>>      if (s->eac3 && (!blk || get_bits1(gbc))) {
>>>>>>>> -        if (get_bits1(gbc)) {
>>>>>>>> -            av_log_missing_feature(s->avctx, "Spectral extension", 1);
>>>>>>>> -            return -1;
>>>>>>>> +        s->spx_in_use = get_bits1(gbc);
>>>>>>>> +        if (s->spx_in_use) {
>>>>>>>> +            int begf, endf;
>>>>>>>> +            int spx_end_subband;
>>>>>>>> +
>>>>>>>> +            /* determine which channels use spx */
>>>>>>>> +            if (s->channel_mode == AC3_CHMODE_MONO) {
>>>>>>>> +                s->channel_in_spx[1] = 1;
>>>>>>>> +            } else {
>>>>>>>> +                for (ch = 1; ch <= fbw_channels; ch++)
>>>>>>>> +                    s->channel_in_spx[ch] = get_bits1(gbc);
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>> +            s->spx_copy_start_freq = get_bits(gbc, 2) * 12 + 25;
>>>>>>>> +            begf = get_bits(gbc, 3);
>>>>>>>> +            endf = get_bits(gbc, 3);
>>>>>>>> +            s->spx_start_subband = begf < 6 ? begf+2 : 2*begf-3;
>>>>>>>> +            spx_end_subband      = endf < 4 ? endf+5 : 2*endf+3;
>>>>>>>> +            if (s->spx_start_subband >= spx_end_subband) {
>>>>>>>> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension range (%d >= %d)\n",
>>>>>>>> +                       s->spx_start_subband, spx_end_subband);
>>>>>>>> +                return -1;
>>>>>>>> +            }
>>>>>>> with code like this i always ask myself if its enough or not
>>>>>>> I mean this code is conditinal on a get_bits1(gbc) apparently otherwise
>>>>>>> reusing the last values.
>>>>>>> and the return -1 does leave some changed vars in the context, so it
>>>>>>> pretty likely can leave invalid vars in the context.
>>>>>>> I dont know if they are used or not after a return -1
>>>>>>> but the code is not very solid like that either way as it depends on some
>>>>>>> variables in the context not being used ...
>>>>>>> so IMHO there either should be no invalid values left or it should be
>>>>>>> documented in comments why its safe currently and appropriate notes
>>>>>>> should be placed in the code to prevent future mistaken use of the
>>>>>>> variables, but i guess its easier just not to store invalid values
>>>>>>> in the context to begin with
>>>>>> I have simplified things by committing a change that prevents decoding
>>>>>> subsequent blocks in a frame after an invalid block is found.  It was
>>>>>> pointless to try anyways.  This way it doesn't matter if invalid values
>>>>>> are in the context since the decoder will always re-read them before
>>>>>> using them in the next frame.
>>>>> what about:
>>>>>     /* block start information */
>>>>>     if (s->num_blocks > 1 && get_bits1(gbc)) {
>>>>>         /* reference: Section E2.3.2.27
>>>>>            nblkstrtbits = (numblks - 1) * (4 + ceiling(log2(words_per_frame)))
>>>>>            The spec does not say what this data is or what it's used for.
>>>>>            It is likely the offset of each block within the frame. */
>>>>>         int block_start_bits = (s->num_blocks-1) * (4 + av_log2(s->frame_size-2));
>>>>>         skip_bits(gbc, block_start_bits);
>>>>>     }
>>>>> ?
>>>>>
>>>>> with this it should be very well possible to decode blocks after a damaged
>>>>> block, or am i missing something?
>>>>> some bitstream dependancy between blocks that makes that impossible?
>>>>> or are these block offsets never stored?
>>>>> also above allows to do error checking, that is to check that the bitstream
>>>>> pointer matches the stored offset after each block.
>>>>> our mp3 decoder also decodes all "blocks" that are undamaged and out mpeg2
>>>>> decoder also decodes all slices that are undamaged not discarding thw whole
>>>>> frame or GOP after an error
>>>>>
>>>>> besides this should be skip_bits_long() instead of skip_bits()
>>>> Out of curiosity, I will check to see if any of the E-AC-3 samples I
>>>> have actually provide the block start information.
>>>>
>>>> Ainc AC-3 can optionally reuse quite a bit of information between blocks
>>>> (E-AC-3 even more so), this still would not help in many cases.  In
>>>> fact, nearly every single field can be reused from the previous block.
>>>> After a quick search through the spec, the only field I could find which
>>>> will always be in every single block and not reused from previous blocks
>>>> in both AC-3 and E-AC-3 is the frequency coeff mantissas.
>>> The question is how likely the reused parts are to be hit by an error and
>>> what exact effect that has.
>>> If for example 10% of each block is header bits and damage to them is fatal
>>> for all future blocks then a bit/byte error still has a 90% chance of not
>>> affecting future blocks.
>>> Of course one could argue that there is no point in making the decoder
>>> robust against bit/byte errors as packet/sector loss is a more common thing
>>> than having a bit or byte changed. Thats mainly because error correction
>>> at the transport layer will either correct bit errors or ruin the whole
>>> packet to the point of being useless.
>>> Still, bad memory, overclocked cpus and various other things could produce
>>> such errors in principlce but its not a terribly critical issue.
>>> So if you prefer to just discard future blocks in case of an error iam
>>> fine with that.
>> I do prefer to discard future blocks in case of an error.  The first
>> reason is that I probed all my E-AC-3 samples and none of them utilize
>> the block start info. 
> 
> thats a good argument and as already said, i agree with the discarding if you
> prefer it.
> 
> 
>> The second reason is that it would be a lot of
>> work for something that will almost never happen if error recognition is
>> used since the CRC check nearly always catches damaged frames and
>> discards them.
> 
> It discards them, but only because of the lack of more fine grained error
> detection, which in the event of a crc mismatch surely would lead to
> better results. Because the timespan that has to be filled in by something
> would be smaller.

Yes, true.

> 
>>  The third reason is that we currently don't even check
>> the other 95% for errors.  About 20% of the block (exponents) could be
>> checked, 
> 
>> but the errors are not critical,
> 
> what do you mean by not critical? all errors are generally critical
> Its a little like when you try to walk from point A to B, if during this you
> ever find the sun standing directly north [on the northen side of earth]
> you as well know you are in
> troubble, not that the sun as such being in an impossible place would be a
> problem, but more because that indicates that you lost sense of direction
> and likely are not where you think you are.
> Similarly finding an exponent that is outside some valid range may not be a
> problem as such but its a indication that what you think is a exponent may
> very well not be one at all and the following bits similarly likely wont
> be what you think they are
> Indeed most likely the actual error has happened long before the one you
> detected.

By not critical I mean that the only effect is an audible artifact that
is probably somewhat louder than if error concealment were used.  I'm
not saying there is no benefit, just that the I thought the cost was not
worth the very small benefit.  It turns out I was wrong. :)

I benchmarked the current way of clipping exponents against bounds
checking and it turns out that the bounds checking was actually a slight
bit faster, so I'll go ahead and apply the change.

> besides, had files block start positions stored they would be a rather
> reliable way on their own to detect fatal errors by just checking that
> after decoding each block the expected position is exactly reached.

I didn't think of that.  Do you think it would be worth putting an
"unsupported feature" log message for block start info so that someone
is more likely to report a sample?

Thanks,
Justin