[FFmpeg-devel] [PATCH 2/2] mips: Optimization of AC3 FP encoder and EAC3 FP decoder
Vitor Sessak
vitor1001 at gmail.com
Wed Oct 17 19:07:54 CEST 2012
Hi!
On 10/16/2012 04:26 PM, Nedeljko Babic wrote:
> Signed-off-by: Nedeljko Babic <nbabic at mips.com>
> ---
> --- a/libavcodec/ac3enc.h
> +++ b/libavcodec/ac3enc.h
> @@ -256,6 +256,8 @@ typedef struct AC3EncodeContext {
> /* fixed vs. float templated function pointers */
> int (*allocate_sample_buffers)(struct AC3EncodeContext *s);
>
> + void (*apply_mdct)(struct AC3EncodeContext *s);
Strange. C code for apply_mdct() is basically just calling a
DSPContext.apply_window() inside a loop. So why do you need to make an
ASM version out of it instead of just writing a MIPS version of
apply_window()? Is the function call too expensive?
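To illustrate the suggestion, here is a minimal sketch of keeping the loop in generic C and swapping only the windowing kernel through a function pointer. Everything in it is a simplified assumption and not the actual FFmpeg code: the struct SketchDSPContext, the init function, and the unrolled C routine standing in for a MIPS assembly version are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DSP context (not the real FFmpeg struct). */
typedef struct SketchDSPContext {
    void (*apply_window)(float *dst, const float *src,
                         const float *win, size_t len);
} SketchDSPContext;

/* Generic C kernel: multiply each sample by the window coefficient. */
static void apply_window_c(float *dst, const float *src,
                           const float *win, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] * win[i];
}

/* Stand-in for an arch-specific kernel. A real port would use MIPS
 * assembly here; this unrolled C version only marks the place. */
static void apply_window_unrolled(float *dst, const float *src,
                                  const float *win, size_t len)
{
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        dst[i + 0] = src[i + 0] * win[i + 0];
        dst[i + 1] = src[i + 1] * win[i + 1];
        dst[i + 2] = src[i + 2] * win[i + 2];
        dst[i + 3] = src[i + 3] * win[i + 3];
    }
    for (; i < len; i++)            /* leftover samples */
        dst[i] = src[i] * win[i];
}

/* Pick the kernel once at init time (have_arch_opt is a hypothetical flag). */
static void sketch_dsp_init(SketchDSPContext *c, int have_arch_opt)
{
    c->apply_window = apply_window_c;
    if (have_arch_opt)
        c->apply_window = apply_window_unrolled;
}

/* The per-channel loop stays generic; only the kernel is replaceable. */
static void sketch_apply_mdct(const SketchDSPContext *c, float *dst,
                              const float *src, const float *win,
                              int channels, size_t len)
{
    for (int ch = 0; ch < channels; ch++)
        c->apply_window(dst + ch * len, src + ch * len, win, len);
}
```

With this layout the encoder context needs no extra apply_mdct pointer; the cost is one indirect call per channel, which is what the question above is asking about.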
> +static void ff_ac3_float_apply_channel_coupling_mips(AC3EncodeContext *s)
> +{
> + LOCAL_ALIGNED_16(CoefType, cpl_coords, [AC3_MAX_BLOCKS], [AC3_MAX_CHANNELS][16]);
> + LOCAL_ALIGNED_16(int32_t, fixed_cpl_coords, [AC3_MAX_BLOCKS], [AC3_MAX_CHANNELS][16]);
> + int blk, ch, bnd, i, j;
> + CoefSumType energy[AC3_MAX_BLOCKS][AC3_MAX_CHANNELS][16] = {{{0}}};
> + int cpl_start, num_cpl_coefs;
> + int32_t *dst;
> + const float *src;
> + unsigned int len;
> + uint8_t *exp;
> + float scale = 1 << 24;
> + float src0, src1, src2, src3, src4, src5, src6, src7;
> + int temp0, temp1, temp2, temp3, temp4, temp5, temp6, temp7;
> + int e,v;
Wow, this is a pretty complex function and optimizing it this way makes
for a lot of code duplication. Can't you just extract the time-consuming
parts to some DSP function?
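As a rough sketch of what extracting a hot part could look like: the quoted declarations (scale = 1 << 24, a float source, an int32_t destination) suggest a float to 24-bit fixed-point conversion is one of the inner loops, though that is a guess from the variable names. All names below (SketchAC3DSP, sketch_convert_coords) are hypothetical, and the rounding is a simple round-half-away-from-zero chosen to keep the example free of libm, not necessarily what the encoder does.

```c
#include <stdint.h>

/* Hypothetical DSP table holding only the small, hot kernel. */
typedef struct SketchAC3DSP {
    void (*float_to_fixed24)(int32_t *dst, const float *src,
                             unsigned int len);
} SketchAC3DSP;

/* Generic C kernel: scale by 2^24 and round to the nearest integer
 * (half away from zero; an assumption made for this sketch). */
static void float_to_fixed24_c(int32_t *dst, const float *src,
                               unsigned int len)
{
    const float scale = 1 << 24;
    for (unsigned int i = 0; i < len; i++) {
        float v = src[i] * scale;
        dst[i] = (int32_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
    }
}

/* An arch-specific init would override the pointer here. */
static void sketch_ac3dsp_init(SketchAC3DSP *c)
{
    c->float_to_fixed24 = float_to_fixed24_c;
}

/* Stand-in for the large generic function: its control flow is written
 * once in C and only the inner conversion goes through the DSP table. */
static void sketch_convert_coords(const SketchAC3DSP *c, int32_t *fixed,
                                  const float *coords, unsigned int len)
{
    c->float_to_fixed24(fixed, coords, len);
}
```

The point of the split is that the block/channel/band bookkeeping is not duplicated per architecture; only the kernel would need a MIPS version.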
-Vitor