[FFmpeg-devel] [FFmpeg-cvslog] x86/intreadwrite: use intrinsics instead of inline asm for AV_COPY128

James Almer jamrial at gmail.com
Thu Jul 11 20:53:05 EEST 2024


On 7/11/2024 10:54 AM, Martin Storsjö wrote:
> On Thu, 11 Jul 2024, Martin Storsjö wrote:
> 
>> On Thu, 11 Jul 2024, James Almer wrote:
>>
>>> On 7/11/2024 10:08 AM, Martin Storsjö wrote:
>>>> On Wed, 10 Jul 2024, James Almer wrote:
>>>>
>>>>> ffmpeg | branch: master | James Almer <jamrial at gmail.com> | Wed Jul 
>>>>> 10 13:00:20 2024 -0300| [bd1bcb07e0f29c135103a402d71b343a09ad1690] 
>>>>> | committer: James Almer
>>>>>
>>>>> x86/intreadwrite: use intrinsics instead of inline asm for AV_COPY128
>>>>>
>>>>> This has the benefit of removing any SSE -> AVX penalty that may 
>>>>> happen when
>>>>> the compiler emits VEX encoded instructions.
>>>>>
>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>>
>>>>>>
>>> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=bd1bcb07e0f29c135103a402d71b343a09ad1690
>>>>> ---
>>>>>
>>>>> configure                    |  5 ++++-
>>>>> libavutil/x86/intreadwrite.h | 20 +++++++-------------
>>>>> 2 files changed, 11 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/configure b/configure
>>>>> index f84fefeaab..7151ed1de3 100755
>>>>> --- a/configure
>>>>> +++ b/configure
>>>>> @@ -2314,6 +2314,7 @@ HEADERS_LIST="
>>>>>
>>>>> INTRINSICS_LIST="
>>>>>     intrinsics_neon
>>>>> +    intrinsics_sse
>>>>>     intrinsics_sse2
>>>>> "
>>>>>
>>>>> @@ -2744,7 +2745,8 @@ armv6t2_deps="arm"
>>>>> armv8_deps="aarch64"
>>>>> neon_deps_any="aarch64 arm"
>>>>> intrinsics_neon_deps="neon"
>>>>> -intrinsics_sse2_deps="sse2"
>>>>> +intrinsics_sse_deps="sse"
>>>>> +intrinsics_sse2_deps="sse2 intrinsics_sse"
>>>>> vfp_deps="arm"
>>>>> vfpv3_deps="vfp"
>>>>> setend_deps="arm"
>>>>> @@ -6446,6 +6448,7 @@ elif enabled loongarch; then
>>>>> fi
>>>>>
>>>>> check_cc intrinsics_neon arm_neon.h "int16x8_t test = vdupq_n_s16(0)"
>>>>> +check_cc intrinsics_sse immintrin.h "__m128 test = _mm_setzero_ps()"
>>>>> check_cc intrinsics_sse2 emmintrin.h "__m128i test = 
>>> _mm_setzero_si128()"
>>>>>
>>>>> check_ldflags -Wl,--as-needed
>>>>> diff --git a/libavutil/x86/intreadwrite.h 
>>> b/libavutil/x86/intreadwrite.h
>>>>> index 9bbef00dba..6546eb016c 100644
>>>>> --- a/libavutil/x86/intreadwrite.h
>>>>> +++ b/libavutil/x86/intreadwrite.h
>>>>> @@ -22,29 +22,25 @@
>>>>> #define AVUTIL_X86_INTREADWRITE_H
>>>>>
>>>>> #include <stdint.h>
>>>>> +#if HAVE_INTRINSICS_SSE
>>>>> +#include <immintrin.h>
>>>>> +#endif
>>>>
>>>> This change seems to have broken builds for x86 with Clang 16 or 
>>>> newer. (Clang 15 and lower seems to be fine.)
>>>>
>>>> See e.g. 
>>> http://fate.ffmpeg.org/log.cgi?slot=i686-mingw32-clang-trunk&time=20240711035948&log=compile for an example of the error. The issue is that a clang internal intrinsics header contains "_mm_comige_sh(__m128h A,", i.e. a parameter with the name "A" (which toolchain provided headers shouldn't use). This clashes with libavcodec/huffuyv.h, which has a "#define A 3".
>>>>
>>>> This is obviously a Clang intrinsics header bug, but we can't fix 
>>>> the existing Clang 16-18 releases that are out there, so I guess 
>>>> what we 
>>> can
>>>> do is change our "define A" to something more elaborate. (IIRC there 
>>> are
>>>> some similar issues with names with ncurses and/or android headers 
>>> too.)
>>>>
>>>> // Martin
>>>
>>> We also do "#define A AV_OPT_FLAG_AUDIO_PARAM" in options_table.h and 
>>> probably other places, so changing huffyuv.h may not be enough.
>>
>> That's quite possible, but those cases may be including intreadwrite.h 
>> before that, do it's possible it might not trigger there.
>>
>> I'll see how many places need to be changed here.
> 
> huffyuvenc.c seems to be the only file that runs into the issue; it 
> includes put_bits.h (which brings in libavutil/intreadwrite.h) after 
> huffyuv.h.

Can we move the huffyuv.h include right after put_bits.h then? I 
personally prefer that over changing the RGBA defines to workaround a 
compiler bug.


More information about the ffmpeg-devel mailing list