[FFmpeg-devel] [PATCH] E-AC-3 spectral extension
Michael Niedermayer
michaelni
Mon Jun 1 12:20:54 CEST 2009
On Sat, May 30, 2009 at 11:43:12PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> > On Sun, May 17, 2009 at 02:23:34PM -0400, Justin Ruggles wrote:
> >> Hi,
> >>
> >> I was recently made aware that some French TV station(s) will soon (if
> >> not already) start using E-AC-3 streams in their broadcasts which
> >> utilize spectral extension. I was also given some samples (thanks j-b
> >> and Anthony), which I uploaded to mphq:
> >> http://samples.mplayerhq.hu/A-codecs/AC3/eac3/csi_miami_*
> >>
> >> So I decided to revisit my SPX patch. The previous version was done
> >> with all integer arithmetic, but it turns out that it's really not
> >> accurate enough for spectral extension processing. The resulting decoded
> >> output had a max bandwidth of about 2kHz less when using 24-bit fixed
> >> point vs. floating point, and was only slightly higher than without any
> >> SPX processing at all. Making just the square roots floating point
> >> raised the bandwidth about 1kHz, and making the rest (noise/signal
> >> scaling, spx coords, and notch filters) floating point added about
> >> another 1kHz.
> >>
> >> I was able to compare the output to Nero's E-AC-3 decoder (thanks
> >> madshi), and the results are very close considering that AC-3 uses
> >> random noise for zero-bit mantissas:
> >
> >> stddev: 131.16 PSNR: 53.96
> >
> > i wouldn't call 131.16 close
>
> Well, considering I don't know how the Nero decoder differs, it's not
> bad. I don't know how the Nero decoder ends up with higher bandwidth
> than it should, it very likely uses a different random noise generator,
> and it could do dithering in the float-to-int16 conversion.
dither in float2int might account for ~1.0 stddev maybe, but we are 2 orders
of magnitude above that.
about the PRNG, well, just decode an AC-3 file with 2 different PRNGs and
compare by how much they differ.
also you can take nero's output and ours and create a wav file with the
sample-wise differences.
looking at that / listening to it might provide a hint about what it is that
differs.
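
A minimal sketch of that sample-wise comparison, assuming both decodes are
already in memory as time-aligned 16-bit PCM buffers of equal length. The
function name pcm_diff_mse is hypothetical and WAV reading/writing is left
out. It returns the mean squared error; the stddev figure discussed above is
assumed to be the square root of that value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write the sample-wise difference a[i] - b[i] into diff (clipped to the
 * int16 range so it can be stored as 16-bit PCM) and return the mean
 * squared error over n samples. diff may be NULL if only the error
 * figure is wanted.
 */
static double pcm_diff_mse(const int16_t *a, const int16_t *b,
                           int16_t *diff, size_t n)
{
    double sse = 0.0; /* sum of squared errors, taken before clipping */
    size_t i;

    for (i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];

        sse += (double)d * (double)d;
        if (diff) {
            if (d > INT16_MAX)
                d = INT16_MAX;
            else if (d < INT16_MIN)
                d = INT16_MIN;
            diff[i] = (int16_t)d;
        }
    }
    return n ? sse / (double)n : 0.0;
}
```

Running this once on two decodes that differ only in the PRNG, and once on
our output vs. nero's, would show how much of the difference the noise
generator alone can explain.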
>
> >> PEAQ ODG: -0.44
> >
> > what is PEAQ ODG ?
>
> PEAQ is an ITU standard for perceptual evaluation of audio quality. ODG
> is the objective difference grade. It tries to objectively estimate
> what results might be from a subjective listening test by using a
> psychoacoustic model. The method has its flaws, but it's a heck of a
> lot simpler than setting up a multi-user double-blind listening test for
> each change.
>
> 0 = Imperceptible
> -1 = Perceptible, but not annoying
> -2 = Slightly annoying
> -3 = Annoying
> -4 = Very annoying
ok, understood
>
> > btw, have you tested your code with our trasher / some fuzzer to make sure
> > it doesn't segfault?
>
> Yes, and it does not segfault even with -er 0. The values read from the
> stream which affect reading/writing from memory are bounds checked. The
> (E)AC-3 decoder already does fairly well with damaged streams, and that
> is no different after this change.
>
> >
> >> One thing I'm unsure about is whether I should add optional runtime
> >> generation of the attenuation table rather than always hardcoding it.
> >
> > i think due to the relatively small size there is little point
>
> ok.
>
> >
> > [...]
> >
> >> diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
> >> index c176cb3..e6d7a9d 100644
> >> --- a/libavcodec/ac3dec.c
> >> +++ b/libavcodec/ac3dec.c
> >> @@ -825,14 +825,94 @@ static int decode_audio_block(AC3DecodeContext *s, int blk)
> >>
> >> /* spectral extension strategy */
> >> if (s->eac3 && (!blk || get_bits1(gbc))) {
> >> - if (get_bits1(gbc)) {
> >> - ff_log_missing_feature(s->avctx, "Spectral extension", 1);
> >> - return -1;
> >> + s->spx_in_use = get_bits1(gbc);
> >> + if (s->spx_in_use) {
> >> + int begf, endf;
> >> + int spx_end_subband;
> >> +
> >> + /* determine which channels use spx */
> >> + if (s->channel_mode == AC3_CHMODE_MONO) {
> >> + s->channel_in_spx[1] = 1;
> >> + } else {
> >> + for (ch = 1; ch <= fbw_channels; ch++)
> >> + s->channel_in_spx[ch] = get_bits1(gbc);
> >> + }
> >> +
> >> + s->spx_copy_start_freq = get_bits(gbc, 2) * 12 + 25;
> >> + begf = get_bits(gbc, 3);
> >> + endf = get_bits(gbc, 3);
> >> + s->spx_start_subband = begf < 6 ? begf+2 : 2*begf-3;
> >> + spx_end_subband = endf < 4 ? endf+5 : 2*endf+3;
> >> + if (s->spx_start_subband >= spx_end_subband) {
> >> + av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension range (%d >= %d)\n",
> >> + s->spx_start_subband, spx_end_subband);
> >> + return -1;
> >> + }
> >> + s->spx_start_freq = s->spx_start_subband * 12 + 25;
> >> + s->spx_end_freq = spx_end_subband * 12 + 25;
> >> + if (s->spx_copy_start_freq >= s->spx_start_freq) {
> >> + av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension copy start bin (%d >= %d)\n",
> >> + s->spx_copy_start_freq, s->spx_start_freq);
> >> + return -1;
> >> + }
> >
> > you know, i always have a bad feeling when various variables are updated
> > first and checked afterwards but left in an invalid state anyway
> >
> > are you sure this is all free of buffer overflows? (i've not checked so
> > it may very well be ok ...)
>
> No further blocks are read in the frame after a block decode fails.
> Each frame is independent, so the next frame is not affected by an
> invalid state. Also, it was previously discussed and agreed upon that
> trying to read subsequent blocks after a failed block is pointless since
> there are no known streams which use the block start info.
ok
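
For illustration only, the range setup from the quoted hunk could also be
written so that nothing is stored until every check has passed. This is a
sketch, not the code in the patch: the SpxRange struct and spx_parse_range()
are hypothetical names, and the three arguments stand in for the 2-bit and
two 3-bit fields that the patch reads with get_bits(). The subband and bin
formulas are copied from the hunk above.

```c
/* Hypothetical container for the values the patch keeps in the context. */
typedef struct SpxRange {
    int copy_start_freq; /* first bin copied from */
    int start_subband;   /* first SPX subband */
    int start_freq;      /* first SPX bin */
    int end_freq;        /* one past the last SPX bin */
} SpxRange;

/*
 * Map the raw bitstream codes to subbands and bins, validate them, and
 * only then write to *out. Returns 0 on success, -1 on an invalid range;
 * on failure *out is left untouched.
 */
static int spx_parse_range(SpxRange *out, int strtf, int begf, int endf)
{
    int copy_start_freq = strtf * 12 + 25;
    int start_subband   = begf < 6 ? begf + 2 : 2 * begf - 3;
    int end_subband     = endf < 4 ? endf + 5 : 2 * endf + 3;
    int start_freq      = start_subband * 12 + 25;
    int end_freq        = end_subband   * 12 + 25;

    if (start_subband >= end_subband)
        return -1; /* empty or reversed extension range */
    if (copy_start_freq >= start_freq)
        return -1; /* copy region must begin below the extension region */

    out->copy_start_freq = copy_start_freq;
    out->start_subband   = start_subband;
    out->start_freq      = start_freq;
    out->end_freq        = end_freq;
    return 0;
}
```

With the codes limited to their bit widths, the largest end bin this can
produce is 17 * 12 + 25 = 229.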
[..]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Those who are best at talking, realize last or never when they are wrong.