[FFmpeg-devel] [PATCH 2/2] aacenc: add SIMD optimizations for abs_pow34 and quantization

Sat Oct 8 19:18:36 EEST 2016

On Sat, Oct 8, 2016 at 5:20 PM, Rostislav Pehlivanov
<atomnuker at gmail.com> wrote:
> +cglobal aac_quantize_bands, 8, 8, 7, out, in, scaled, size, Q34, is_signed, maxval, rounding
[...]
> +    movd    m4, is_signedd

movd with an XMM register is an SSE2 instruction, so it can't be used in
an SSE function. It can be worked around by moving the value through the
stack, though.
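
To illustrate the stack workaround, here is a rough C sketch using
intrinsics rather than the actual assembly. The function name
load_int_via_stack and the SKETCH_HAVE_SSE macro are made up for this
example and are not part of the patch. The idea is only that the integer
is written to a 4-byte stack slot and then read into the vector register
with an SSE1 scalar load (movss) instead of movd:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   /* SSE1 intrinsics only, nothing from SSE2 */
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0 /* non-x86 build: only the memory round trip remains */
#endif

/* Hypothetical helper: moves an integer into the low lane of a vector
 * register by way of a stack slot, then returns that lane's bit pattern
 * so the result can be inspected. */
static uint32_t load_int_via_stack(int32_t value)
{
    float slot;        /* 4-byte slot on the stack */
    uint32_t low_lane;

    assert(sizeof(slot) == sizeof(value));
    memcpy(&slot, &value, sizeof(slot));   /* plain integer store to memory */
#if SKETCH_HAVE_SSE
    {
        /* _mm_load_ss maps to movss xmm, [mem], which is available in SSE1.
         * It copies the 32 bits unchanged and zeroes the upper lanes. */
        __m128 v = _mm_load_ss(&slot);
        _mm_store_ss(&slot, v);            /* write the low lane back out */
    }
#endif
    memcpy(&low_lane, &slot, sizeof(low_lane));
    return low_lane;
}
```

The bits are never interpreted as a float, so a value such as -1 (a NaN
bit pattern) should survive the round trip unchanged. The compiler is of
course free to pick different instructions than the ones named in the
comments.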

[...]

> +    /* Can't pass floats to external assembly directly */                                             \
> +    ff_aac_quantize_bands_ ## SET(out, in, scaled, size, (const float [RSIZE_BYTES]){Q34},            \
> +                                  is_signed, (const float [RSIZE_BYTES]){(float)maxval},              \
> +                                  (const float [RSIZE_BYTES]){rounding});                             \
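
For reference, the trick in the quoted macro relies on C99 compound
literals: each float is wrapped in an unnamed array, which decays to a
const float pointer that the assembly can load from memory. A minimal
sketch of the same pattern in plain C follows. The names quantize_sketch
and quantize_wrapper are hypothetical, the arithmetic is only a
placeholder for the real quantizer, and the array length of 1 is an
assumption for illustration (the patch uses RSIZE_BYTES):

```c
#include <assert.h>

/* Hypothetical stand-in for the external assembly routine: the scalar
 * floats arrive as pointers and are read from memory. */
static void quantize_sketch(int *out, const float *in, int size,
                            const float *Q34, const float *rounding)
{
    assert(out && in && Q34 && rounding);
    for (int i = 0; i < size; i++)
        out[i] = (int)(in[i] * *Q34 + *rounding);  /* placeholder math */
}

/* Wrapper in the style of the quoted macro: each compound literal creates
 * an unnamed one-element array that lives until the end of the enclosing
 * block, so the pointers stay valid for the duration of the call. */
static void quantize_wrapper(int *out, const float *in, int size,
                             float Q34, float rounding)
{
    quantize_sketch(out, in, size,
                    (const float [1]){Q34},
                    (const float [1]){rounding});
}
```

The cost of this approach is an extra store and load per float argument
on every call, which is what handling the floats directly in assembly
would avoid.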

If you reorder the function arguments so that the floating-point ones
are at the end (should preferably be done as a separate patch though)
it'd be fairly easy to handle it directly in assembly instead with
something like this (untested):

;void ff_aac_quantize_bands_sse(int *out, const float *in, const float *scaled,
;                               int size, int is_signed, int maxval,
;                               float Q34, float rounding)
;*******************************************************************
INIT_XMM sse
cglobal aac_quantize_bands, 5, 5, 7, out, in, scaled, size, is_signed, maxval, Q34, rounding
%if WIN64 || ARCH_X86_32
    movss    m0, Q34m
    movss    m1, roundingm
%endif
    cvtsi2ss m2, dword maxvalm