[FFmpeg-devel] [PATCH] SIMD-optimized float_to_int32_fmul_scalar()
Loren Merritt
lorenm
Fri Jan 7 19:49:55 CET 2011
On Fri, 7 Jan 2011, Justin Ruggles wrote:
> This patch implements float_to_int32_fmul_scalar() for 3dnow, sse, and
> sse2 and uses it in the AC3 encoder.
>@@ -2303,6 +2303,65 @@ static void int32_to_float_fmul_scalar_sse2(float *dst, const int *src, float mu
> );
> }
>
>+static void float_to_int32_fmul_scalar_3dnow(int32_t *dst, const float *src, float mul, int len)
>+{
>+ /* note: pf2id conversion uses truncation, not round-to-nearest */
>+ x86_reg i = (len-4)*4;
>+ __asm__ volatile(
>+ "movq %3, %%mm1 \n\t"
movd
>@@ -2910,6 +2971,8 @@ void dsputil_init_mmx(DSPContext* c, AVCodecContext *avctx)
> c->vector_fmul_add = vector_fmul_add_3dnow; // faster than sse
> if(mm_flags & AV_CPU_FLAG_SSE2){
> c->int32_to_float_fmul_scalar = int32_to_float_fmul_scalar_sse2;
>+ if (!(mm_flags & AV_CPU_FLAG_SSE2SLOW))
>+ c->float_to_int32_fmul_scalar = float_to_int32_fmul_scalar_sse2;
AV_CPU_FLAG_SSE2SLOW is an alternative to AV_CPU_FLAG_SSE2. They won't
both be set at once. It means "Pentium-M's SSE2 is so slow that by default
we pretend it doesn't exist, and only make an exception if specifically
tested".
If you intended it to detect Athlon64, then you picked the wrong flag, and
there isn't a right one yet.
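
To illustrate why the extra check has no effect, here is a minimal sketch, assuming the detection policy described above (a CPU with slow SSE2 reports SSE2SLOW instead of SSE2, never both). The DEMO_* names, the flag values and the helper functions are hypothetical and exist only for this example; they are not the dsputil code.

```c
#include <assert.h>

/* Illustrative flag bits only; hypothetical names, not FFmpeg's definitions. */
#define DEMO_FLAG_SSE2     0x00000010
#define DEMO_FLAG_SSE2SLOW 0x40000000

/* Assumed policy from the explanation above: a CPU whose SSE2 is slow
 * reports SSE2SLOW *instead of* SSE2, so the two bits are never both set. */
static int demo_cpu_flags(int has_sse2, int sse2_is_slow)
{
    if (!has_sse2)
        return 0;
    return sse2_is_slow ? DEMO_FLAG_SSE2SLOW : DEMO_FLAG_SSE2;
}

/* Shape of the check in the patch: SSE2SLOW is tested inside the SSE2 branch. */
static int demo_patch_selects_sse2(int flags)
{
    if (flags & DEMO_FLAG_SSE2) {
        if (!(flags & DEMO_FLAG_SSE2SLOW))
            return 1;
    }
    return 0;
}

/* The same decision without the inner test. */
static int demo_plain_selects_sse2(int flags)
{
    return (flags & DEMO_FLAG_SSE2) != 0;
}

/* Under the assumed policy, both checks agree for every reachable flag set. */
static void demo_check_equivalence(void)
{
    int has_sse2, slow;
    for (has_sse2 = 0; has_sse2 <= 1; has_sse2++) {
        for (slow = 0; slow <= 1; slow++) {
            int flags = demo_cpu_flags(has_sse2, slow);
            assert(demo_patch_selects_sse2(flags) == demo_plain_selects_sse2(flags));
        }
    }
}
```

Under that assumption the inner test is always true whenever the outer branch is taken, so it never excludes any CPU.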
--Loren Merritt