[FFmpeg-devel] [PATCH 2/2] vf_colorspace: x86-64 SIMD (SSE2) optimizations.

Ronald S. Bultje rsbultje at gmail.com
Tue Apr 12 22:45:30 CEST 2016


Hi,

On Thu, Apr 7, 2016 at 10:05 AM, Kieran Kunhya <kierank at obe.tv> wrote:
>
> On Wed, 6 Apr 2016 at 19:10 Ronald S. Bultje <rsbultje at gmail.com> wrote:
>
>> +pw_1: times 8 dw 1
>> +pw_2: times 8 dw 2
>> +pw_4: times 8 dw 4
>> +pw_8: times 8 dw 8
>> +pw_16: times 8 dw 16
>> +pw_64: times 8 dw 64
>> +pw_128: times 8 dw 128
>> +pw_256: times 8 dw 256
>> +pw_512: times 8 dw 512
>> +pw_1023: times 8 dw 1023
>> +pw_1024: times 8 dw 1024
>> +pw_2048: times 8 dw 2048
>> +pw_4095: times 8 dw 4095
>> +pw_8192: times 8 dw 8192
>> +pw_16384: times 8 dw 16384
>> +
>> +pd_1: times 4 dd 1
>> +pd_2: times 4 dd 2
>> +pd_128: times 4 dd 128
>> +pd_512: times 4 dd 512
>> +pd_2048: times 4 dd 2048
>> +pd_8192: times 4 dd 8192
>> +pd_32768: times 4 dd 32768
>> +pd_131072: times 4 dd 131072
>>
>>
> Don't we have these defined somewhere?
>

Yes, and I agree it should be fixed by introducing a constants.c or so in
lavfi. I'll work on that separately, since it probably touches half the
assembly files (rodata section only, of course) in lavfi.

>> +    pmaxsw          m6, m11
>> +    pmaxsw          m8, m11
>> +    pminsw          m6, [pw_ %+ %%maxval]
>> +    pminsw          m8, [pw_ %+ %%maxval]
>>
>
> CLIPW
>

Fixed everywhere.
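
For anyone following along, the pmaxsw/pminsw pair quoted above clamps each signed 16-bit lane between a lower and an upper bound, which is the pattern a CLIPW-style macro wraps. Below is a rough scalar C model of that per-lane behaviour, not the actual macro or the patch code. The name clipw_model and the LANES constant are made up for illustration, and the lower bound of 0 and upper bound of 1023 in the usage note are assumptions (I am guessing m11 holds zeros and %%maxval is 1023 for 10-bit).

```c
#include <stdint.h>

#define LANES 8  /* hypothetical name: an XMM register holds 8 signed 16-bit words */

/* Hypothetical scalar model of the instruction pair
 *     pmaxsw v, lo
 *     pminsw v, hi
 * applied independently to every 16-bit lane. */
static void clipw_model(int16_t v[LANES],
                        const int16_t lo[LANES],
                        const int16_t hi[LANES])
{
    for (int i = 0; i < LANES; i++) {
        int16_t x = v[i];
        if (x < lo[i])
            x = lo[i];  /* pmaxsw: signed maximum with the lower bound */
        if (x > hi[i])
            x = hi[i];  /* pminsw: signed minimum with the upper bound */
        v[i] = x;
    }
}
```

With lo set to all zeros and hi set to all 1023, an input lane of -5 comes out as 0 and a lane of 2000 comes out as 1023, while in-range lanes pass through unchanged.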


> Otherwise seems ok.
>

Pushed both patches, thanks for review!

Ronald

