[FFmpeg-soc] Instability in AMR-NB
Colin McQuillan
m.niloc at googlemail.com
Tue Aug 4 14:32:00 CEST 2009
This mail is mainly for my mentor but any suggestions are welcome.
I've been running AMR test sequences. Raw sound files are encoded by
the reference encoder and decoded by the SoC AMR decoder. I noticed
that a certain test produces a loud crescendo buzz that sounds like
amp feedback. (The test is T19.INP - Male speech, ambient noise,
active speech level: -36.0 dBov, modified IRS frequency response, with
many zero frames.)
ACELP is based on calculating an excitation vector using feedback. The
decoder does something like this every 40 samples:
    for (i = 0; i < 40; i++) {
        excitation[i] *= pitch_factor;
        excitation[i] += fixed_vector[i];
    }
where pitch_factor and fixed_vector[i] are input to the decoder.
(There's also a shift to the excitation vector but that's irrelevant
for now.)
The encoder has its own idea of what the excitation vector is and
calculates pitch_factor accordingly. When the AMR encoder is encoding
zeros and it also believes the excitation is zero, for some reason it
will output a pitch_factor > 1.0. If the decoder also has zero
excitation at this point, everything is fine.
If the decoder actually has a non-zero excitation, this produces an
exponentially increasing output. The T19 test produces frames with
pitch_factor=1.049988 most of the time when the input is zero. This
produces the buzz in the SoC AMR decoder.
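To put a rough number on the growth, here is a minimal sketch that follows the simplified loop above with fixed_vector[i] equal to zero, so that each 40-sample subframe only multiplies the leftover excitation by pitch_factor. It ignores the shift mentioned earlier, and the name residual_after is hypothetical, not part of the SoC decoder. It also assumes the 8 kHz AMR-NB sampling rate, which gives 200 subframes per second.

```c
#include <assert.h>

/* Hypothetical helper: amplitude of a leftover excitation sample after
 * n_subframes, assuming fixed_vector is all zeros and the shift is
 * ignored, so every subframe only scales by pitch_factor. */
double residual_after(double initial, double pitch_factor, int n_subframes)
{
    assert(n_subframes >= 0);

    double x = initial;
    for (int i = 0; i < n_subframes; i++)
        x *= pitch_factor;   /* same update as excitation[i] *= pitch_factor */
    return x;
}
```

Under these assumptions, residual_after(1.0, 1.049988, 200) comes out well above 10,000, while a start value of exactly 0.0 stays at 0.0. That matches the observation that the encoder's output is harmless only when the decoder's excitation is also zero.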
This seems like a significant flaw in AMR: small changes in the input
can produce a big effect. I constructed the attached files, based on
the test sequence T19, to demonstrate that small changes can affect
decoding. You can try them out with the reference decoder or ffmpeg's
libopencore_amrnb. The files unstable-zero.amr and unstable-one.amr
are identical except for the first frame (21 byte frame, AMR-NB
7.95kbit/s).
1,2c1,2
< 0000000 2123 4d41 0a52 d12c cc1a 1f00 fff9 82e1
< 0000010 0066 c01f ff03 20e8 0082 2c00 1ad1 00cc
---
> 0000000 2123 4d41 0a52 5d2c 5986 a40e 044b cf84
> 0000010 a532 1ed0 3219 7ad1 6953 2c9e 1ad1 00cc
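For anyone reading the dump: it is printed as 16-bit little-endian words, so "2123 4d41 0a52" is the bytes 23 21 41 4d 52 0a, the "#!AMR\n" magic, and the first frame starts with the byte 0x2c. The sketch below decodes that header byte the way I understand the AMR storage format; the function names are hypothetical and the size table is my reading of the format, not something taken from the SoC decoder.

```c
#include <assert.h>

/* Frame type (mode index) is stored in bits 6..3 of the frame header byte. */
int amrnb_frame_type(unsigned char header)
{
    return (header >> 3) & 0x0f;
}

/* Total frame size in bytes, including the header byte.
 * Returns -1 for frame types this sketch does not know about. */
int amrnb_frame_size(int ft)
{
    /* Modes 0..7: 4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2 kbit/s;
     * 8 is a SID (comfort noise) frame. */
    static const int size[9] = { 13, 14, 16, 18, 20, 21, 27, 32, 6 };

    assert(ft >= 0 && ft <= 15);
    if (ft <= 8)
        return size[ft];
    if (ft == 15)            /* NO_DATA: header byte only */
        return 1;
    return -1;
}
```

With header 0x2c this gives frame type 5 (7.95 kbit/s) and a 21 byte frame, so the first frame covers file offsets 0x06 to 0x1a. The next 0x2c header sits at offset 0x1b, which is where the two dumps become identical again.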
I could set the excitation vector to zero whenever it is close to
zero, but this could be brittle. Should I just leave this as a bug?
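A minimal sketch of the workaround I have in mind, to make the question concrete. The function name and the threshold parameter are hypothetical, and picking the threshold is exactly the brittle part: too low and the buzz survives, too high and quiet speech gets cut.

```c
#include <assert.h>

/* Hypothetical workaround: if every sample of the excitation is below
 * the threshold in magnitude, flush the whole vector to zero.
 * Returns 1 if the vector was flushed, 0 if it was left alone. */
int flush_small_excitation(float *excitation, int n, float threshold)
{
    assert(n >= 0);

    for (int i = 0; i < n; i++) {
        float mag = excitation[i] < 0 ? -excitation[i] : excitation[i];
        if (mag >= threshold)
            return 0;        /* real signal present, do not touch it */
    }
    for (int i = 0; i < n; i++)
        excitation[i] = 0.0f;
    return 1;
}
```

This would be called once per subframe before the pitch_factor update. It only stops growth that starts from a tiny residual; it does nothing once the excitation has already grown past the threshold.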
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unstable-zero.amr
Type: audio/amr
Size: 2106 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-soc/attachments/20090804/67fc9d71/attachment.amr>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unstable-one.amr
Type: audio/amr
Size: 2106 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-soc/attachments/20090804/67fc9d71/attachment-0001.amr>