[FFmpeg-devel] [PATCH] x86/swr: add ff_float_to_int32_a_avx2

James Almer jamrial at gmail.com
Fri Nov 7 19:06:18 CET 2014


On 07/11/14 2:56 PM, Michael Niedermayer wrote:
> On Fri, Nov 07, 2014 at 01:19:22PM -0300, James Almer wrote:
>> On 07/11/14 6:05 AM, Christophe Gisquet wrote:
>>> Hi,
>>>
>>> 2014-11-06 23:04 GMT+01:00 James Almer <jamrial at gmail.com>:
>>>> No, the function checks for alignment and jumps to a branch that uses movdqu if needed.
>>>> ff_int32_to_float_a_avx also uses ymm regs and this same macro.
>>>
>>> OK, so nothing new here, same 32-bytes alignment.
>>>
>>>> when "mulps m0, m1, [mem]" would work just as well regardless of alignment.
>>>
>>> It does with AVX? That shows I never used it...
>>
>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf
>> Chapter 14.9. It's pretty convenient, but taking advantage of it would require uglifying macros
>> or making the avx functions separate.
>>
>>>
>>>> If you use "cmpps m0, m1, 5" it will work for non-VEX coding, but error out otherwise
>>>> since x86inc.asm turns that into "vcmpps m0, m1, 5" instead of "vcmpps m0, m0, m1, 5"
>>>>
>>>> With aliases like cmpnltps it doesn't even add the "v" prefix.
>>>
>>> OK. As you are the number one developer for AVX, it's up to you
>>> whether that would need a more generic fix :-)
>>>
>>> Someone actually used to AVX should comment, but this patch looks so
>>> simple that you should apply it tomorrow if nobody objects.
>>>
>>
>> I was waiting for Michael to comment since he's the swr maintainer. Otherwise I'd have pushed 
>> it after your ok.
> 
> please push patches which Christophe reviewed
> no need to wait for me

Pushed then. Thanks to both.
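
For reference, the alignment check mentioned in the quoted discussion (take the
aligned path when the buffers allow it, otherwise fall back to unaligned loads)
can be sketched in plain C. This is only an illustration, not the swr assembly:
the function names, the 32-byte test and the plain truncating conversion are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: ymm-sized aligned accesses need 32-byte alignment. */
static int is_aligned32(const void *p)
{
    return ((uintptr_t)p & 31) == 0;
}

/* Stand-in for the branch that would use aligned loads/stores in asm. */
static void convert_aligned(int32_t *dst, const float *src, size_t len)
{
    assert(is_aligned32(dst) && is_aligned32(src));
    for (size_t i = 0; i < len; i++)
        dst[i] = (int32_t)src[i]; /* inputs assumed to fit in int32_t */
}

/* Stand-in for the branch that would use unaligned loads/stores in asm. */
static void convert_unaligned(int32_t *dst, const float *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (int32_t)src[i]; /* inputs assumed to fit in int32_t */
}

/*
 * Hypothetical entry point: check both pointers once, then pick a branch.
 * Returns 1 if the aligned branch was taken, 0 otherwise.
 */
int float_to_int32_dispatch(int32_t *dst, const float *src, size_t len)
{
    if (is_aligned32(dst) && is_aligned32(src)) {
        convert_aligned(dst, src, len);
        return 1;
    }
    convert_unaligned(dst, src, len);
    return 0;
}
```

Both branches produce the same result; only the kind of memory access differs,
which is why a single entry point can serve callers regardless of alignment.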

