[FFmpeg-devel] [PATCH 5/6] truehd: break out part of output_data into platform-specific callback.

James Almer jamrial at gmail.com
Wed Mar 19 21:53:59 CET 2014


On 19/03/14 4:05 PM, Ben Avison wrote:
> On Wed, 19 Mar 2014 18:27:54 -0000, James Almer <jamrial at gmail.com> wrote:
>>> +    int32_t (*mlp_pack_output)(int32_t lossless_check_data,
>>> +                               int32_t (*sample_buffer)[MAX_CHANNELS],
>>> +                               void *data,
>>> +                               uint16_t blockpos,
>>> +                               uint8_t max_matrix_channel,
>>> +                               int is32,
>>> +                               uint8_t *ch_assign,
>>> +                               int8_t *output_shift);
>>>  } MLPDSPContext;
>>
>> As i said elsewhere, please put pointers first if possible, like you did
>> for mlp_rematrix_channel.
> 
> OK, some reordering is possible, but putting all the pointers first would
> add some cycles. Some of the ordering was a side-effect of how it was
> convenient to write the assembly, although obviously what's convenient
> will vary from platform to platform. In particular:
> 
> * lossless_check_data is a kind of accumulator: it's passed in, processed
>   a bit, and passed out as the function return value. Having it as the
>   first argument means no register shuffling is needed at runtime.
> 
> * max_matrix_channel, is32, ch_assign, output_shift: I don't particularly
>   care about the relative order of these amongst themselves. However, 3
>   or 4 of them are actually unused by all the individual implementations
>   of the function. Having them at the 5th and subsequent arguments means
>   (with the ARM ABI) that they're on the stack, so for each of the
>   unused arguments, we save a word load. If lossless_check_data and
>   blockpos came after all the pointers, they'd both be on the stack
>   instead, and they would need loading on every call.
> 
> Would an order like
> 
> int32_t lossless_check_data,
> int32_t (*sample_buffer)[MAX_CHANNELS],
> void *data,
> uint16_t blockpos,
> uint8_t *ch_assign,
> int8_t *output_shift,
> uint8_t max_matrix_channel,
> int is32
> 
> be more acceptable? Or perhaps
> 
> int32_t lossless_check_data,
> uint16_t blockpos,
> int32_t (*sample_buffer)[MAX_CHANNELS],
> void *data,
> uint8_t *ch_assign,
> int8_t *output_shift,
> uint8_t max_matrix_channel,
> int is32
> 
> where all the pointers are grouped together?
> 
> Ben

Both are ok. As long as the last parameter is not a pointer, potential 
future x86_32 implementations (at least made using yasm and x86inc) should 
be fine.

If this means a big hit in performance for ARM then i guess you can keep 
things as you originally wrote them.
Not sure how likely it is for someone to write asm for x86_32 anyway, and 
i guess this could be changed in the future if needed.

PS: Sending this again to the list this time as i mistakenly sent it to 
your address only the first time.


More information about the ffmpeg-devel mailing list