[FFmpeg-soc] [PATCH] AMR-WB Decoder
Vitor Sessak
vitor1001 at gmail.com
Thu Sep 9 20:35:32 CEST 2010
On 09/09/2010 07:42 PM, Marcelo Galvão Póvoa wrote:
> 2010/9/9 Marcelo Galvão Póvoa<marspeoplester at gmail.com>:
>> On 9 September 2010 05:11, Vitor Sessak<vitor1001 at gmail.com> wrote:
>>> On 09/09/2010 02:50 AM, Marcelo Galvão Póvoa wrote:
>>>>
>>>> On 8 September 2010 06:54, Vitor Sessak<vitor1001 at gmail.com> wrote:
>>>>>
>>>>> On 09/07/2010 03:21 AM, Marcelo Galvão Póvoa wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> On 6 September 2010 10:13, Ronald S. Bultje<rsbultje at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Mon, Sep 6, 2010 at 5:54 AM, Vitor Sessak<vitor1001 at gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On 09/06/2010 02:46 AM, Marcelo Galvão Póvoa wrote:
>>>>>>>>>
>>>>>>>>> Ok, fortunately I've found the bug!
>>>>>>>>>
>>>>>>>>> It was just a MIN_ISF_SPACING parameter which I extracted from the
>>>>>>>>> reference code but was unsure about its Q level. After some time, I
>>>>>>>>> thought I had it figured out but I was wrong. Now I know the answer
>>>>>>>>> the hard way...
>>>>>>>>>
>>>>>>>>> The clipping and the sharp peaks are gone, the waveforms are really
>>>>>>>>> close now!
>>>>>>>>
>>>>>>>> That's great news!
>>>>>>>>
>>>>>>>>> Also, the stddev against the reference decoder decreased a
>>>>>>>>> lot (it was ~884 before):
>>>>>>>>> all_men.awb stddev: 51.72 PSNR: 62.05 MAXDIFF: 1089 bytes: 473600/ 473600
>>>>>>>>
>>>>>>>> stddev of 51 looks pretty good to me for this case.
>>>>>>>
>>>>>>> Maxdiff of 1089 looks like a lot to me, with a low stddev that
>>>>>>> suggests that one particular part is off. Can you trace which part is
>>>>>>> off and why (phase shift vs. actual bug)?
>>>>>>>
>>>>>>
>>>>>> Sorry, but can you suggest a way of doing it?
>>>>>
>>>>> Your method of inverting one waveform and summing it with the other in
>>>>> Audacity would show where it is happening (some point will have an
>>>>> amplitude of 1089). To know whether it is a phase shift or a bug, you
>>>>> will have to compare both waveforms visually.
>>>>>
>>>>
>>>> I don't know exactly how to detect a phase shift this way but the
>>>> difference waveform I obtained [1] has some peaks at the sibilant
>>>> parts I think. Probably just where the high band is louder.
>>>>
>>>>>> Also, what do MAXDIFF
>>>>>> and the "2" at the end of the command line mean?
>>>>>
>>>>> MAXDIFF is the largest absolute difference between corresponding samples
>>>>> of the two files. The "2" at the end of the command line says to read
>>>>> two-byte (16-bit) integers. If you were comparing video pixels, you would
>>>>> use "1". You can also read the source of tiny_psnr.c; it is pretty simple.
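
To make those three numbers concrete, here is a minimal sketch in Python (not the actual tiny_psnr.c). It assumes raw little-endian 16-bit samples with any WAV header already stripped, that "stddev" is the root-mean-square difference, and that PSNR is 20*log10(65535/stddev). That last relation is only inferred from the figures quoted in this thread (51.72 gives about 62.05 dB). The names compare_s16le and Diff are made up for illustration. It also reports the index of the largest difference, which is one way to locate the part that is off.

```python
import math
import struct
from typing import NamedTuple


class Diff(NamedTuple):
    rms: float       # root-mean-square sample difference ("stddev" above)
    psnr_db: float   # assumed: 20*log10(65535 / rms), inf if identical
    maxdiff: int     # largest absolute sample difference
    max_index: int   # sample index where maxdiff first occurs


def compare_s16le(a: bytes, b: bytes) -> Diff:
    """Compare two raw signed 16-bit little-endian buffers over their common length."""
    n = min(len(a), len(b)) // 2
    if n == 0:
        raise ValueError("need at least one 16-bit sample in each buffer")
    xs = struct.unpack(f"<{n}h", a[: 2 * n])
    ys = struct.unpack(f"<{n}h", b[: 2 * n])

    sse = 0
    maxdiff = 0
    max_index = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        d = abs(x - y)
        sse += d * d
        if d > maxdiff:
            maxdiff, max_index = d, i

    rms = math.sqrt(sse / n)
    psnr_db = math.inf if sse == 0 else 20 * math.log10(65535 / rms)
    return Diff(rms, psnr_db, maxdiff, max_index)
```

Dividing max_index by the sample rate gives the time offset to look at in a waveform editor.
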
>>>>>
>>>>>> This sample has long silent parts and I'm comparing my floating-point
>>>>>> implementation to the reference 16-bit fixed-point one. How close do
>>>>>> you think they should be?
>>>>>
>>>>> A very small stddev (< 1.00) would assure there is no bug in your decoder,
>>>>> but the fact that it is large does not mean there is one.
>>>>>
>>>>> I suggest you do the following test:
>>>>>
>>>>> a) Get a biggish file (> 30 minutes)
>>>>> b) Convert it to a WAV with the sample rate and number of channels the
>>>>> AMR encoder takes as input
>>>>> c) Encode the file obtained in (b) with the reference encoder
>>>>> d) Decode the file obtained in (c) with the reference decoder
>>>>> e) Decode the file obtained in (c) with ffamrwb
>>>>> f) Compare the stddev of the files obtained in (b) and (d) with that of
>>>>> (b) and (e). If the file decoded with ffamrwb is as close to the original
>>>>> as the one decoded with the reference decoder, it's good.
>>>>>
>>>>
>>>> Results:
>>>> $ ./tests/tiny_psnr ~/ref_pod.wav ~/orig_pod.wav 2
>>>> stddev: 2599.69 PSNR: 28.03 MAXDIFF:39592 bytes: 76480640/ 76480660
>>>> $ ./tests/tiny_psnr ~/my_pod.wav ~/orig_pod.wav 2
>>>> stddev: 2600.02 PSNR: 28.03 MAXDIFF:39653 bytes: 76480640/ 76480660
>>>> $ ./tests/tiny_psnr ~/my_pod.wav ~/ref_pod.wav 2
>>>
>>>
>>> Hmm, the files have different sizes. Are you sure you are not comparing
>>> files shifted by a few bytes? One parameter of tiny_psnr is a shift between
>>> the two files. Does
>>>
>>> tiny_psnr ~/ref_pod.wav ~/orig_pod.wav 2 10
>>>
>>> give a much worse result? If the shift you are using now (zero) is correct,
>>> changing it to anything else should make the stddev much bigger. Can you also
>>> put the WAV files somewhere I can download? If you have a quota problem,
>>> you can use our FTP server as explained in [1].
>>>
>>
>> Now that seems weird:
>>
>> $ ./tests/tiny_psnr ~/my_pod2.wav ~/orig_pod.wav 2 10
>> stddev: 2575.45 PSNR: 28.11 MAXDIFF:46172 bytes: 76480630/ 76480660
>>
>> $ ./tests/tiny_psnr ~/my_pod2.wav ~/orig_pod.wav 2 -20
>> stddev: 2524.06 PSNR: 28.29 MAXDIFF:45328 bytes: 76480640/ 76480640
>>
>> $ ./tests/tiny_psnr ~/my_pod2.wav ~/orig_pod.wav 2 200
>> stddev: 2090.99 PSNR: 29.92 MAXDIFF:46167 bytes: 76480440/ 76480660
>>
>> I will upload the files later. Probably it will take some time.
>>
>
> Ok, I've put them in /MPlayer/incoming/AMR-WB

Thanks for the files. I get

vitor at vitor:~$ tiny_psnr orig_pod.wav ref_pod.wav 2 -190
stddev: 648.78 PSNR: 40.09 MAXDIFF:10660 bytes: 76480660/ 76480450
vitor at vitor:~$ tiny_psnr orig_pod.wav soc_pod.wav 2 -190
stddev: 643.05 PSNR: 40.16 MAXDIFF:10705 bytes: 76480660/ 76480450

which does settle the quality issue for me :D
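
For anyone who wants to hunt for the right shift instead of guessing, a brute-force search over candidate shifts is enough. The sketch below is only an illustration under stated assumptions: it works on lists of decoded samples rather than bytes (a 190-byte shift is 95 16-bit samples), the helper names rms_at_shift and best_shift are hypothetical, and the sign convention is my own and may not match the one tiny_psnr uses.

```python
import math
from typing import Sequence


def rms_at_shift(x: Sequence[int], y: Sequence[int], shift: int) -> float:
    """RMS difference between x[i + shift] and y[i] over the overlapping region."""
    if shift >= 0:
        xs, ys = x[shift:], y
    else:
        xs, ys = x, y[-shift:]
    n = min(len(xs), len(ys))
    if n == 0:
        raise ValueError("no overlap at this shift")
    # zip stops at the shorter sequence, so exactly n pairs are summed
    sse = sum((a - b) ** 2 for a, b in zip(xs, ys))
    return math.sqrt(sse / n)


def best_shift(x: Sequence[int], y: Sequence[int], max_shift: int) -> int:
    """Return the shift in [-max_shift, max_shift] with the smallest RMS difference."""
    return min(range(-max_shift, max_shift + 1),
               key=lambda s: rms_at_shift(x, y, s))
```

On files of this size a pure-Python loop would be slow, so in practice one would run the search on a short excerpt first and then confirm the result on the whole file.
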
-Vitor