[FFmpeg-devel] [PATCH] atrac decoder
Benjamin Larsson
banan
Fri Aug 14 23:14:30 CEST 2009
Michael Niedermayer wrote:
> On Wed, Aug 12, 2009 at 08:45:55PM +0200, Benjamin Larsson wrote:
>> Michael Niedermayer wrote:
>>
>>> if the groups are a power of 2 you can do
>>> 0123456789ABCDEF
>>> ^3
>>> 32107654...
>>>
>>> thats just a ^C in the index and can be omited when filling whole
>>> blocks of size C with zeros
>>>
>> When implemented this code
>>
>> pos = su->bsm[band_num] ? bfu_start_short[bfu_num] :
>> bfu_start_long[bfu_num];
>>
>> for (i=0 ; i<num_specs ; i++) {
>> /* read in a quantized spec and convert it to signed and then inverse
>> quantization */
>> spec[pos + i] = get_sbits(gb, word_len) *
>> sf_tab[su->idsfs[bfu_num]] * max_quant;
>>
>>
>> becomes this code
>>
>> if (su->bsm[band_num]) {
>> /* get the position of the 1st spec according to the block size mode */
>> pos = bfu_start_short[bfu_num];
>> for (i=0 ; i<num_specs ; i++) {
>> int j = band_num ? ((pos&~31)+((pos+i)^31)) : pos +i;
>> spec[j] = get_sbits(gb, word_len) *
>> sf_tab[su->idsfs[bfu_num]] * max_quant;
>> }
>> } else {
>> /* get the position of the 1st spec according to the block size mode */
>> pos = bfu_start_long[bfu_num];
>> for (i=0 ; i<num_specs ; i++) {
>> int j = band_num ? 511 - (pos +i) : pos + i;
>> /* read in a quantized spec and convert it to signed and then
>> inverse quantization */
>> spec[j] = get_sbits(gb, word_len) *
>> sf_tab[su->idsfs[bfu_num]] * max_quant;
>> }
>> }
>
>
> cant the loops be merged with a ^variable? instead of constants
Two conditionals are needed to figure out how to reorder:
su->bsm[band_num] and band_num, resulting in 4 (3 distinct) ways to
calculate the index. Currently I don't see a way to merge them.
> btw, instead of checking for the first band, it could be reordered
> later, i assume that would get rid of most reordering and still keep
> the code simple and fast
>
>
I have now benchmarked it, and calculating the reversed index with the
current code is slightly slower: 281631 vs 279605 (swap buffer)
dezicycles. I also tested merging the loops with only one conditional;
that resulted in 280491 dezicycles, so it is no faster than the swap
method. When I changed that conditional to a simple xor ((pos+i)^511)
I finally got a lower dezicycle count than the swap buffer method
(279129 vs 279605). But the xor only serves as a bound on how much
faster a working xor method could be: firstly, the xor method doesn't
work as is and I don't know whether it can be made to work, and
secondly, it isn't much faster (~0.2%). So I conclude that doing the
reorder before the imdct would be as fast as doing it during unpack on
my computer. The index calculation is too expensive to beat the cost
of swapping the buffer.
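
For reference, a minimal sketch of the two approaches compared above,
reduced to a toy case: reversing each aligned power-of-two block either
by xoring the store index or by swapping in place after a straight
unpack. The helper names store_reversed_xor and reverse_blocks_swap are
hypothetical and are not from the patch; the sketch assumes the block
size is a power of two and that blocks are aligned, which is exactly
the condition the real bfu positions may not satisfy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: copy src[] into dst[] so
 * that every aligned block of block_size entries comes out reversed.
 * Assumes block_size is a power of two and n is a multiple of it. */
static void store_reversed_xor(float *dst, const float *src,
                               size_t n, size_t block_size)
{
    size_t mask = block_size - 1;

    assert(block_size && (block_size & mask) == 0);
    assert(n % block_size == 0);

    /* i ^ mask flips the low bits, i.e. mirrors i inside its block */
    for (size_t i = 0; i < n; i++)
        dst[i ^ mask] = src[i];
}

/* Hypothetical helper, not part of the patch: the "swap buffer"
 * variant, reversing each block in place after a plain unpack. */
static void reverse_blocks_swap(float *buf, size_t n, size_t block_size)
{
    assert(block_size && n % block_size == 0);

    for (size_t start = 0; start < n; start += block_size) {
        for (size_t j = 0; j < block_size / 2; j++) {
            float tmp = buf[start + j];
            buf[start + j] = buf[start + block_size - 1 - j];
            buf[start + block_size - 1 - j] = tmp;
        }
    }
}
```

With a block size of 4, both helpers turn 0123 4567 into 3210 7654,
matching the ^3 example quoted at the top of this mail.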
>>
>> I have some bug in there but the code would look something like that. Do
>> you prefer this solution ?
>
> speed is what matters ...
> if its slower i surely dont prefer it ...
>
Ok, I'll use the faster code then.
MvH
Benjamin Larsson