[FFmpeg-devel] [PATCH] ARM: NEON optimised vector_fmul

Laurent Desnogues laurent.desnogues
Tue Aug 26 12:46:58 CEST 2008


On Mon, Aug 25, 2008 at 5:06 AM, Mans Rullgard <mans at mansr.com> wrote:
> ---
>  libavcodec/armv4l/dsputil_neon.c   |    2 ++
>  libavcodec/armv4l/dsputil_neon_s.S |   17 +++++++++++++++++
>  2 files changed, 19 insertions(+), 0 deletions(-)
>
[...]
> diff --git a/libavcodec/armv4l/dsputil_neon_s.S b/libavcodec/armv4l/dsputil_neon_s.S
> index e4b809e..d1bdba1 100644
> --- a/libavcodec/armv4l/dsputil_neon_s.S
> +++ b/libavcodec/armv4l/dsputil_neon_s.S
> @@ -324,6 +324,23 @@ extern ff_float_to_int16_interleave_neon
>         pop           {r4,r5,pc}
>         .endfunc
>
> +extern ff_vector_fmul_neon
> +        mov           r3, r0
> +        vld1.64       {d0-d3}, [r0,:128]!
> +        vld1.64       {d4-d7}, [r1,:128]!
> +        dmb

Shouldn't the dmb be replaced with a macro that depends on the
Cortex-A8 revision?
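
To make the suggestion concrete, here is a rough sketch of such a
switch, written as a C macro purely for illustration (in the .S file
it would be an assembler macro instead). The names NEED_NEON_LOAD_DMB,
NEON_LOAD_BARRIER and NEON_LOAD_BARRIER_IS_DMB are hypothetical and not
existing FFmpeg options, and which Cortex-A8 revisions would set the
switch is left open here.

```c
/* Hypothetical build-time switch (not an existing FFmpeg option):
 * define as 1 when the target Cortex-A8 revision needs the barrier
 * after the first pair of NEON loads, 0 otherwise. */
#ifndef NEED_NEON_LOAD_DMB
#define NEED_NEON_LOAD_DMB 1
#endif

#if NEED_NEON_LOAD_DMB && defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
/* 32-bit ARMv7 or later with the switch on: emit a real data memory barrier. */
#define NEON_LOAD_BARRIER()      __asm__ volatile ("dmb sy" ::: "memory")
#define NEON_LOAD_BARRIER_IS_DMB 1
#else
/* Switch off, or not an ARMv7 target: the macro expands to nothing. */
#define NEON_LOAD_BARRIER()      ((void)0)
#define NEON_LOAD_BARRIER_IS_DMB 0
#endif

/* Small helper so callers can check which variant was compiled in. */
static int neon_load_barrier_enabled(void)
{
    NEON_LOAD_BARRIER();
    return NEON_LOAD_BARRIER_IS_DMB;
}
```

With something along these lines, the function body would invoke the
macro where the bare dmb currently sits, and builds for revisions that
do not need it would pay nothing.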


Laurent

> +1:      subs          r2, r2, #8
> +        vmul.f32      q8, q0, q2
> +        vmul.f32      q9, q1, q3
> +        beq           2f
> +        vld1.64       {d0-d3},   [r0,:128]!
> +        vld1.64       {d4-d7},   [r1,:128]!
> +        vst1.64       {d16-d19}, [r3,:128]!
> +        b             1b
> +2:      vst1.64       {d16-d19}, [r3,:128]!
> +        bx            lr
> +        .endfunc
> +
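
For readers following the loop structure, here is a scalar C model of
what the quoted assembly appears to compute. It is a sketch under
assumptions: the argument order (dst in r0, src in r1, len in r2) is
inferred from the register usage in the patch, the name
vector_fmul_ref is made up for this example, and len is assumed to be
a positive multiple of 8 because the loop never handles a remainder.
The :128 qualifiers additionally imply 16-byte aligned pointers, which
the C model does not need.

```c
#include <assert.h>
#include <string.h>

/* Scalar model of the quoted NEON loop: dst[i] *= src[i], 8 floats per pass.
 * Hypothetical helper for illustration, not part of the patch. */
static void vector_fmul_ref(float *dst, const float *src, int len)
{
    float a[8], b[8], prod[8];
    float *out = dst;                 /* mov r3, r0: separate store pointer */
    int i;

    /* Assumption: the assembly has no tail handling, so require this. */
    assert(len > 0 && len % 8 == 0);

    /* Prologue: load the first block before entering the loop. */
    memcpy(a, dst, sizeof a); dst += 8;   /* vld1.64 {d0-d3}, [r0,:128]! */
    memcpy(b, src, sizeof b); src += 8;   /* vld1.64 {d4-d7}, [r1,:128]! */

    for (;;) {
        len -= 8;                         /* subs r2, r2, #8 */
        for (i = 0; i < 8; i++)           /* the two vmul.f32 instructions */
            prod[i] = a[i] * b[i];
        if (len == 0)                     /* beq 2f */
            break;
        /* Load the next block before storing the current result, so the
         * loads overlap with the multiplies of the previous block. */
        memcpy(a, dst, sizeof a); dst += 8;
        memcpy(b, src, sizeof b); src += 8;
        memcpy(out, prod, sizeof prod); out += 8;
    }
    memcpy(out, prod, sizeof prod);       /* label 2: final store */
}
```

The point of the layout is that each pass issues the next pair of loads
before storing the current products, and the last block is stored after
the loop exits.
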
>  extern ff_vector_fmul_window_neon
>         vld1.32       {d16[],d17[]}, [sp,:32]
>         push          {r4,r5,lr}
> --
> 1.6.0
>