[FFmpeg-devel] [PATCH] SIMD-optimized exponent_min() for ac3enc
Frank Barchard
fbarchard
Sat Jan 15 07:50:42 CET 2011
On Fri, Jan 14, 2011 at 10:32 PM, Loren Merritt <lorenm at u.washington.edu> wrote:
> On Fri, 14 Jan 2011, Justin Ruggles wrote:
>
>> + /* round up to even multiple of 16 */
>> + if (nb_coefs & 15)
>> + nb_coefs = (nb_coefs & ~15) + 16;
>>
>
> unconditional
> nb_coefs = FFALIGN(nb_coefs, 16);
>
Loren is right. But FYI, if you do it yourself, it's
nb_coefs = (nb_coefs + 15) & ~15;
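
As a quick sanity check, here is a small C sketch comparing the conditional round-up from the patch with the add-and-mask form. The helper names are made up for illustration and are not FFmpeg API. Both forms leave exact multiples of 16 untouched, which is why the test is not needed.

```c
#include <assert.h>

/* Hypothetical helper: the conditional form from the patch. */
static int round_up16_branch(int n)
{
    if (n & 15)
        n = (n & ~15) + 16;
    return n;
}

/* Hypothetical helper: the unconditional add-and-mask form. */
static int round_up16_mask(int n)
{
    return (n + 15) & ~15;
}

/* Returns 1 if both forms agree for every n in [0, limit]. */
static int round_up16_forms_agree(int limit)
{
    assert(limit >= 0);
    for (int n = 0; n <= limit; n++)
        if (round_up16_branch(n) != round_up16_mask(n))
            return 0;
    return 1;
}
```

For example, round_up16_mask(253) gives 256 and round_up16_mask(256) stays at 256.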
>
>> +%macro AC3_EXPONENT_MIN 1
>> +cglobal ac3_exponent_min_%1, 3,4,3, exp, reuse_blks, offset, offset1
>> + cmp reuse_blksq, 0
>> + je .end
>> + sal reuse_blksq, 8
>> + sub offsetq, mmsize
>> +.nextexp:
>> + mov offset1q, offsetq
>> + add offset1q, reuse_blksq
>>
>
> lea
>
>> + mova m0, [expq+offsetq]
>> +.nextblk:
>> + mova m1, [expq+offset1q]
>> +%ifidn %1, mmx
>> + PMINUB_MMX m0, m1, m2
>> +%else ; mmxext/sse/sse2
>> + pminub m0, m1
>>
>
> memory arg
>
>> +%endif
>> + sub offset1q, 256
>> + cmp offset1q, offsetq
>>
>
> It is usually possible to arrange your pointers such that a loop ends with
> an offset of 0, and then you can take the flags from the add/sub instead of
> a separate cmp.
>
Or check for underflow, i.e. use jns:
sub offset1q, 256
js next
top:
...
sub offset1q, 256
jns top
next:
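
To illustrate Loren's point about ending a loop at offset 0, here is a scalar C sketch of the inner block loop for a single exponent. It is only a model under stated assumptions: the function name is hypothetical, and the stride of 256 is taken from the "sal reuse_blksq, 8" in the patch. The base pointer is moved past the last block and a negative offset is walked up to zero, so in assembly the flags from the add would drive the branch with no separate cmp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_STRIDE 256 /* assumed from "sal reuse_blksq, 8" in the patch */

/* Hypothetical scalar sketch: minimum of one exponent over block 0 and
 * the following reuse_blks blocks, with the loop offset ending at 0. */
static uint8_t exponent_min_one(const uint8_t *exp, ptrdiff_t reuse_blks)
{
    assert(reuse_blks > 0);

    /* Point just past the last block and start from a negative offset. */
    const uint8_t *end = exp + (reuse_blks + 1) * BLOCK_STRIDE;
    ptrdiff_t off = -reuse_blks * BLOCK_STRIDE;
    uint8_t min = exp[0];

    do {
        if (end[off] < min)
            min = end[off];
        off += BLOCK_STRIDE; /* in asm, this add sets ZF for the branch */
    } while (off != 0);

    return min;
}
```

The first iteration reads block 1 and the last reads block reuse_blks, so the result matches the compare-against-offsetq loop in the patch for that one exponent.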
>> + jne .nextblk
>> + mova [expq+offsetq], m0
>> + sub offsetq, mmsize
>> + jge .nextexp
>>
>
Use an unsigned condition code if you can. It fuses on more CPUs and does not
depend on the overflow flag.
jae .nextexp
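
A rough C model of why jae works here, assuming the byte count is a nonzero multiple of the step (as it is once nb_coefs has been aligned). The function name is hypothetical. The subtraction borrows exactly when the offset is smaller than the step in the unsigned sense, which is the carry flag that jae tests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: count how many chunks a "sub / jae" style loop
 * visits. size must be a nonzero multiple of step. */
static size_t count_chunks_unsigned(size_t size, size_t step)
{
    assert(step > 0 && size >= step && size % step == 0);

    size_t off = size - step; /* mirrors the initial "sub offsetq, mmsize" */
    size_t visited = 0;

    for (;;) {
        visited++;       /* the chunk at off would be processed here */
        if (off < step)  /* the sub would borrow: CF=1, jae falls through */
            break;
        off -= step;     /* no borrow: CF=0, jae would be taken */
    }
    return visited;
}
```

With size 64 and step 16 the loop visits offsets 48, 32, 16 and 0, then stops.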
>> +.end:
>> + REP_RET
>> +%endmacro
>> +
>> +INIT_MMX
>> +AC3_EXPONENT_MIN mmx
>> +AC3_EXPONENT_MIN sse_mmxext
>>
>
> mmx2 is a subset of sse; nothing should ever be tagged with both. In this
> case, you're not using sse.
>
>> +%macro PMINUB_MMX 3 ; dst, src, tmp
>> + mova %3, %1
>> + pcmpgtb %1, %2
>> + pand %2, %1
>> + pandn %1, %3
>> + por %1, %2
>> +%endmacro
>>
>
> I think you can simplify that using psubusb.
>
> --Loren Merritt
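
One possible reading of the psubusb hint is the identity min(a, b) = a - subus(a, b), where subus is the unsigned saturating subtract that psubusb performs on each byte. The scalar C sketch below models a single byte lane. The function names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one byte lane of psubusb: unsigned saturating subtract. */
static uint8_t subus8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* min(a, b) = a - subus(a, b).
 * If a > b, subus gives a - b and the result is b; otherwise it gives 0
 * and the result is a. */
static uint8_t min_via_subus(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - subus8(a, b));
}

/* Returns 1 if the identity holds for all 65536 byte pairs. */
static int min_via_subus_is_exact(void)
{
    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            uint8_t expected = (uint8_t)(a < b ? a : b);
            if (min_via_subus((uint8_t)a, (uint8_t)b) != expected)
                return 0;
        }
    }
    return 1;
}
```

In MMX terms that would be a copy into the temporary, a psubusb and a psubb, and it leaves the source operand unmodified. It also holds over the full unsigned 0..255 range, whereas pcmpgtb is a signed compare.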