[FFmpeg-devel] [PATCH] ARM: NEON optimised vector_fmul
Laurent Desnogues
laurent.desnogues
Tue Aug 26 12:46:58 CEST 2008
On Mon, Aug 25, 2008 at 5:06 AM, Mans Rullgard <mans at mansr.com> wrote:
> ---
> libavcodec/armv4l/dsputil_neon.c | 2 ++
> libavcodec/armv4l/dsputil_neon_s.S | 17 +++++++++++++++++
> 2 files changed, 19 insertions(+), 0 deletions(-)
>
[...]
> diff --git a/libavcodec/armv4l/dsputil_neon_s.S b/libavcodec/armv4l/dsputil_neon_s.S
> index e4b809e..d1bdba1 100644
> --- a/libavcodec/armv4l/dsputil_neon_s.S
> +++ b/libavcodec/armv4l/dsputil_neon_s.S
> @@ -324,6 +324,23 @@ extern ff_float_to_int16_interleave_neon
> pop {r4,r5,pc}
> .endfunc
>
> +extern ff_vector_fmul_neon
> + mov r3, r0
> + vld1.64 {d0-d3}, [r0,:128]!
> + vld1.64 {d4-d7}, [r1,:128]!
> + dmb
Shouldn't the dmb be replaced with a macro that depends on the
Cortex-A8 revision?
Laurent
> +1: subs r2, r2, #8
> + vmul.f32 q8, q0, q2
> + vmul.f32 q9, q1, q3
> + beq 2f
> + vld1.64 {d0-d3}, [r0,:128]!
> + vld1.64 {d4-d7}, [r1,:128]!
> + vst1.64 {d16-d19}, [r3,:128]!
> + b 1b
> +2: vst1.64 {d16-d19}, [r3,:128]!
> + bx lr
> + .endfunc
> +
> extern ff_vector_fmul_window_neon
> vld1.32 {d16[],d17[]}, [sp,:32]
> push {r4,r5,lr}
> --
> 1.6.0
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at mplayerhq.hu
> https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel
>
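
For readers following the quoted loop, here is a minimal C sketch of what
it appears to compute. It assumes the usual ARM calling convention
(r0 = dst, r1 = src, r2 = len) and that the result is written back over
dst, as the copy of r0 into r3 suggests. The names vector_fmul_ref and
vector_fmul_by8 are hypothetical and are not part of the patch. The by-8
variant only mirrors the block size of the NEON loop (two q-register
multiplies, 8 floats per pass); it does not model the alignment hints,
the load-ahead scheduling, or the dmb.

```c
#include <assert.h>

/* Scalar reference: multiply dst by src element-wise, in place. */
static void vector_fmul_ref(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}

/*
 * Same arithmetic in blocks of 8 floats, matching the amount of data
 * the quoted loop handles per iteration (d0-d3 times d4-d7).
 * Assumption: len is a positive multiple of 8. The quoted assembly
 * subtracts 8 per pass and exits only when the count reaches zero,
 * so it seems to rely on the same precondition.
 */
static void vector_fmul_by8(float *dst, const float *src, int len)
{
    assert(len > 0 && len % 8 == 0);
    for (int i = 0; i < len; i += 8)
        for (int j = 0; j < 8; j++)
            dst[i + j] *= src[i + j];
}
```

A scalar reference like this is mainly useful for checking the NEON
output on a few buffers whose length is a multiple of 8.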