[FFmpeg-devel] [PATCH] AAC decoder

Robert Swain robert.swain
Fri May 23 14:59:41 CEST 2008


2008/5/23 Robert Swain <robert.swain at gmail.com>:
> 2008/5/23 Robert Swain <robert.swain at gmail.com>:
>> 2008/5/23 Robert Swain <robert.swain at gmail.com>:
>>> 2008/4/2 Michael Niedermayer <michaelni at gmx.at>:
>>>> On Tue, Apr 01, 2008 at 04:56:48PM +0200, Andreas Öman wrote:
>>>>> Andreas Öman wrote:
>>>
>>> [...]
>>>
>>>>> +static inline float ivquant(AACContext * ac, int a) {
>>>>> +    static const float sign[2] = { -1., 1. };
>>>>> +    int tmp = (a>>31);
>>>>> +    int abs_a = (a^tmp)-tmp;
>>>>> +    if (abs_a < sizeof(ac->ivquant_tab)/sizeof(ac->ivquant_tab[0]))
>>>>> +        return sign[tmp+1] * ac->ivquant_tab[abs_a];
>>>>
>>>> What is the point of the sign splitout? It seems that it would be simpler
>>>> to have that in the table as well
>>>
>>> Kostya is in favour of removing the ivquant_tab table because it
>>> caches only a small number of possible values and its general impact
>>> on decoding speedup is not obvious.
>>>
>>> Attached is a patch that removes the ivquant_tab table and simplifies
>>> and moves the ivquant() functionality into the calling loop and
>>> removes the ivquant() function altogether as it isn't really needed to
>>> wrap pow().
>>
>> Oops! sign * pow(abs(a), 4./3) != pow(a, 4./3) . Fixed patch attached
>> with bit magic returned.
>>
>> I'll do some benchmarks too, just for good measure.
>
> Actually, I won't. abs(a) is normally <16 but can be up to 4351 (if I
> understood the escape sequence decoding).
>
> http://article.gmane.org/gmane.comp.video.ffmpeg.soc/2002/match=ivquant
>
> Andreas just did another benchmark and this has a large impact on decoding time.
>
> I'll look at merging the sign into the table as originally suggested.
> I suspect the table size can be reduced to 16 (vs 256) with little to
> no impact on speed. Ignore the patch for the moment. Sorry for the
> noise.

Well, I've done it but I'm not really convinced by the results. See
attached patch.

I tested on an FAAC-encoded South Park episode:

new size 64

13310 dezicycles in ivquant, 1 runs, 0 skips
7975 dezicycles in ivquant, 2 runs, 0 skips
4867 dezicycles in ivquant, 4 runs, 0 skips
3286 dezicycles in ivquant, 8 runs, 0 skips
2674 dezicycles in ivquant, 16 runs, 0 skips
3172 dezicycles in ivquant, 32 runs, 0 skips
2956 dezicycles in ivquant, 64 runs, 0 skips
2860 dezicycles in ivquant, 128 runs, 0 skips
2856 dezicycles in ivquant, 256 runs, 0 skips
2890 dezicycles in ivquant, 511 runs, 1 skips
2871 dezicycles in ivquant, 1023 runs, 1 skips
2946 dezicycles in ivquant, 2046 runs, 2 skips
3094 dezicycles in ivquant, 4094 runs, 2 skips
2988 dezicycles in ivquant, 8188 runs, 4 skips
3377 dezicycles in ivquant, 16379 runs, 5 skips
3652 dezicycles in ivquant, 32758 runs, 10 skips
3818 dezicycles in ivquant, 65522 runs, 14 skips
3982 dezicycles in ivquant, 131052 runs, 20 skips
4203 dezicycles in ivquant, 262107 runs, 37 skips
4215 dezicycles in ivquant, 524209 runs, 79 skips
4191 dezicycles in ivquant, 1048410 runs, 166 skips
4190 dezicycles in ivquant, 2096828 runs, 324 skips

new size 128

7700 dezicycles in ivquant, 1 runs, 0 skips
5115 dezicycles in ivquant, 2 runs, 0 skips
3437 dezicycles in ivquant, 4 runs, 0 skips
2571 dezicycles in ivquant, 8 runs, 0 skips
2323 dezicycles in ivquant, 16 runs, 0 skips
2997 dezicycles in ivquant, 32 runs, 0 skips
2866 dezicycles in ivquant, 64 runs, 0 skips
2818 dezicycles in ivquant, 128 runs, 0 skips
2832 dezicycles in ivquant, 256 runs, 0 skips
2875 dezicycles in ivquant, 511 runs, 1 skips
2866 dezicycles in ivquant, 1023 runs, 1 skips
2859 dezicycles in ivquant, 2047 runs, 1 skips
2856 dezicycles in ivquant, 4095 runs, 1 skips
2869 dezicycles in ivquant, 8189 runs, 3 skips
2942 dezicycles in ivquant, 16379 runs, 5 skips
3436 dezicycles in ivquant, 32755 runs, 13 skips
3704 dezicycles in ivquant, 65520 runs, 16 skips
3925 dezicycles in ivquant, 131047 runs, 25 skips
4127 dezicycles in ivquant, 262090 runs, 54 skips
4181 dezicycles in ivquant, 524199 runs, 89 skips
4168 dezicycles in ivquant, 1048415 runs, 161 skips
4179 dezicycles in ivquant, 2096843 runs, 309 skips

new size 256

7480 dezicycles in ivquant, 1 runs, 0 skips
5005 dezicycles in ivquant, 2 runs, 0 skips
3327 dezicycles in ivquant, 4 runs, 0 skips
2530 dezicycles in ivquant, 8 runs, 0 skips
2303 dezicycles in ivquant, 16 runs, 0 skips
2983 dezicycles in ivquant, 32 runs, 0 skips
2858 dezicycles in ivquant, 64 runs, 0 skips
2803 dezicycles in ivquant, 128 runs, 0 skips
2826 dezicycles in ivquant, 256 runs, 0 skips
2871 dezicycles in ivquant, 512 runs, 0 skips
2862 dezicycles in ivquant, 1024 runs, 0 skips
2860 dezicycles in ivquant, 2048 runs, 0 skips
2856 dezicycles in ivquant, 4096 runs, 0 skips
2869 dezicycles in ivquant, 8192 runs, 0 skips
2944 dezicycles in ivquant, 16384 runs, 0 skips
3510 dezicycles in ivquant, 32765 runs, 3 skips
3786 dezicycles in ivquant, 65525 runs, 11 skips
3967 dezicycles in ivquant, 131053 runs, 19 skips
4149 dezicycles in ivquant, 262109 runs, 35 skips
4270 dezicycles in ivquant, 524224 runs, 64 skips
4224 dezicycles in ivquant, 1048488 runs, 88 skips
4213 dezicycles in ivquant, 2097039 runs, 113 skips

old size 256

5500 dezicycles in ivquant, 1 runs, 0 skips
3850 dezicycles in ivquant, 2 runs, 0 skips
2805 dezicycles in ivquant, 4 runs, 0 skips
2282 dezicycles in ivquant, 8 runs, 0 skips
2179 dezicycles in ivquant, 16 runs, 0 skips
2839 dezicycles in ivquant, 32 runs, 0 skips
2731 dezicycles in ivquant, 64 runs, 0 skips
2688 dezicycles in ivquant, 128 runs, 0 skips
2712 dezicycles in ivquant, 256 runs, 0 skips
2753 dezicycles in ivquant, 512 runs, 0 skips
2744 dezicycles in ivquant, 1024 runs, 0 skips
2738 dezicycles in ivquant, 2048 runs, 0 skips
2734 dezicycles in ivquant, 4096 runs, 0 skips
2747 dezicycles in ivquant, 8191 runs, 1 skips
2814 dezicycles in ivquant, 16382 runs, 2 skips
3266 dezicycles in ivquant, 32763 runs, 5 skips
3512 dezicycles in ivquant, 65526 runs, 10 skips
3716 dezicycles in ivquant, 131054 runs, 18 skips
3912 dezicycles in ivquant, 262107 runs, 37 skips
3951 dezicycles in ivquant, 524196 runs, 92 skips
3940 dezicycles in ivquant, 1048421 runs, 155 skips
3950 dezicycles in ivquant, 2096877 runs, 275 skips

The new method with size 128 worked better than  on
http://samples.mplayerhq.hu/A-codecs/AAC/ct_faac.mp4 . If you want me
to test more, I can. If you have any suggestions for improvements,
they're very welcome.

Rob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 20080523-1326-merge_sign_into_ivquant.diff
Type: text/x-diff
Size: 1636 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080523/bac507ed/attachment.diff>


