[FFmpeg-devel] [PATCH] cpu: add a function for querying maximum required data alignment

James Almer jamrial at gmail.com
Sat Sep 2 21:48:40 EEST 2017


On 9/2/2017 3:29 PM, Clément Bœsch wrote:
> On Sat, Sep 02, 2017 at 02:07:01PM -0300, James Almer wrote:
> [...]
>> +size_t av_cpu_max_align(void)
>> +{
>> +    int av_unused flags = av_get_cpu_flags();
>> +
>> +#if ARCH_ARM || ARCH_AARCH64
>> +    if (flags & AV_CPU_FLAG_NEON)
>> +        return 16;
>> +#elif ARCH_PPC
>> +    if (flags & AV_CPU_FLAG_ALTIVEC)
>> +        return 16;
> 
>> +#elif ARCH_X86
>> +    if (flags & AV_CPU_FLAG_AVX)
>> +        return 32;
>> +    if (flags & AV_CPU_FLAG_SSE)
>> +        return 16;
>> +#endif
> 
> mmh, will this really work in FFmpeg? I think we differ in how flag
> dependencies are handled. Typically, having SSE2 set doesn't imply that
> SSE is set as well. I think you may want to extend the mask.

Mmh, you're right, I forgot we have av_parse_cpu_caps().

What do I do then? Define two masks with all the CPU flags that would
apply for each alignment value?
For 32: AVX through AVX2, plus FMA3/FMA4 and the slow variants. For 16:
SSE through SSE4, plus XOP and the slow variants?
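
To make the question concrete, here is a minimal sketch of the two-mask
idea. It is not the patch itself: the DEMO_FLAG_* names and bit values are
hypothetical stand-ins for the real AV_CPU_FLAG_* constants, demo_max_align()
takes the flags as a parameter instead of calling av_get_cpu_flags(), and the
fallback of 8 is an assumption, since the quoted hunk does not show what the
function returns when no flag matches.

```c
#include <stddef.h>

/* Hypothetical stand-ins for AV_CPU_FLAG_*; the bit values are
 * illustrative only and do not match libavutil/cpu.h. */
enum {
    DEMO_FLAG_SSE      = 1 << 0,
    DEMO_FLAG_SSE2     = 1 << 1,
    DEMO_FLAG_SSE2SLOW = 1 << 2,
    DEMO_FLAG_SSE3     = 1 << 3,
    DEMO_FLAG_SSE3SLOW = 1 << 4,
    DEMO_FLAG_SSSE3    = 1 << 5,
    DEMO_FLAG_SSE4     = 1 << 6,
    DEMO_FLAG_SSE42    = 1 << 7,
    DEMO_FLAG_XOP      = 1 << 8,
    DEMO_FLAG_AVX      = 1 << 9,
    DEMO_FLAG_AVXSLOW  = 1 << 10,
    DEMO_FLAG_FMA3     = 1 << 11,
    DEMO_FLAG_FMA4     = 1 << 12,
    DEMO_FLAG_AVX2     = 1 << 13
};

/* Any flag in this mask means 32-byte alignment may be required. */
#define DEMO_MASK_ALIGN32 (DEMO_FLAG_AVX  | DEMO_FLAG_AVXSLOW | \
                           DEMO_FLAG_AVX2 | DEMO_FLAG_FMA3    | \
                           DEMO_FLAG_FMA4)

/* Any flag in this mask means 16-byte alignment may be required. */
#define DEMO_MASK_ALIGN16 (DEMO_FLAG_SSE   | DEMO_FLAG_SSE2  | \
                           DEMO_FLAG_SSE2SLOW | DEMO_FLAG_SSE3 | \
                           DEMO_FLAG_SSE3SLOW | DEMO_FLAG_SSSE3 | \
                           DEMO_FLAG_SSE4  | DEMO_FLAG_SSE42 | \
                           DEMO_FLAG_XOP)

/* Check the widest alignment first. Because each mask lists every
 * relevant flag, a caller that forces only SSE2 (without SSE) still
 * gets 16, and one that forces only AVX2 (without AVX) still gets 32. */
static size_t demo_max_align(int flags)
{
    if (flags & DEMO_MASK_ALIGN32)
        return 32;
    if (flags & DEMO_MASK_ALIGN16)
        return 16;
    return 8; /* assumed fallback, not taken from the patch */
}
```

With masks like these, demo_max_align(DEMO_FLAG_SSE2) gives 16 even though
the SSE bit is clear, which is the case the single-flag checks would miss.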

> 
> [...]