[FFmpeg-devel] [PATCH] x86: use new gcc atomic built-ins if available

Michael Niedermayer michaelni at gmx.at
Mon Oct 27 20:33:10 CET 2014


On Sat, Oct 25, 2014 at 10:32:57PM -0300, James Almer wrote:
> __sync built-ins are considered legacy and will be deprecated.
> These new memory model aware built-ins have been available since GCC 4.7.0
> 
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
> https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/_005f_005fatomic-Builtins.html
> This is an RFC for a couple reasons.
> 
> The first is the memory model parameter. The documentation mentions that the
> __sync functions match the behavior of the new __atomic functions when the
> latter use the full barrier model (__ATOMIC_SEQ_CST), so I went with it for
> consistency's sake. It may however be a good idea to check if any of the more
> relaxed models available for these new functions can be used instead.
> It's worth mentioning that when I tested, gcc-tsan liked the __atomic load and
> store functions a lot more than __sync_synchronize(), regardless of memory
> model.
> 
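
To make the load/store part concrete, here is a minimal sketch of how such wrappers could look. It is an illustration only, not the code from the patch: the sketch_* names and the SKETCH_HAVE_ATOMIC_BUILTINS version test are hypothetical, and a real build would more likely rely on a configure-time check. Everything uses __ATOMIC_SEQ_CST, which is the model the documentation says matches the __sync built-ins.

```c
/* Hypothetical feature test: the __atomic built-ins exist since GCC 4.7.0.
 * Compilers reporting an older GCC version take the __sync fallback. */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
#define SKETCH_HAVE_ATOMIC_BUILTINS 1
#else
#define SKETCH_HAVE_ATOMIC_BUILTINS 0
#endif

/* Atomically read *ptr. */
static inline int sketch_atomic_int_get(volatile int *ptr)
{
#if SKETCH_HAVE_ATOMIC_BUILTINS
    return __atomic_load_n(ptr, __ATOMIC_SEQ_CST);
#else
    /* Legacy path: full barrier, then a plain load. */
    __sync_synchronize();
    return *ptr;
#endif
}

/* Atomically write val to *ptr. */
static inline void sketch_atomic_int_set(volatile int *ptr, int val)
{
#if SKETCH_HAVE_ATOMIC_BUILTINS
    __atomic_store_n(ptr, val, __ATOMIC_SEQ_CST);
#else
    /* Legacy path: plain store, then a full barrier. */
    *ptr = val;
    __sync_synchronize();
#endif
}

/* Atomically add inc to *ptr and return the new value. */
static inline int sketch_atomic_int_add_and_fetch(volatile int *ptr, int inc)
{
#if SKETCH_HAVE_ATOMIC_BUILTINS
    return __atomic_add_fetch(ptr, inc, __ATOMIC_SEQ_CST);
#else
    return __sync_add_and_fetch(ptr, inc);
#endif
}
```

Trying a more relaxed model would then only mean changing the last argument of the __atomic calls (for example __ATOMIC_ACQUIRE for the load and __ATOMIC_RELEASE for the store). Whether that is safe depends on how each caller uses the value.
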
> The second reason is __atomic_compare_exchange_n(), and how it differs from
> __sync_val_compare_and_swap().
> While the latter returns *ptr as it was before the operation, the former
> returns only a success flag and instead copies *ptr to oldval if the result
> of the comparison is false. This means that returning oldval will match the
> old behavior without having to change the wrapper.
> A disassembly example from libavutil/buffer.o however hints that the __atomic
> function may be slower because of it writing oldval.
> 
> __sync_val_compare_and_swap:
>  8e3:	48 89 d8             	mov    rax,rbx
>  8e6:	f0 48 0f b1 16       	lock cmpxchg QWORD PTR [rsi],rdx
>  8eb:	48 85 c0             	test   rax,rax
> 
> __atomic_compare_exchange_n:
>  8f0:	48 8d 4c 24 20       	lea    rcx,[rsp+0x20]
>  [...]
>  90c:	48 89 d8             	mov    rax,rbx
>  90f:	48 89 5c 24 20       	mov    QWORD PTR [rsp+0x20],rbx
>  914:	f0 48 0f b1 16       	lock cmpxchg QWORD PTR [rsi],rdx
>  919:	74 03                	je     91e <av_buffer_pool_get+0x3e>
>  91b:	48 89 01             	mov    QWORD PTR [rcx],rax
>  91e:	48 8b 44 24 20       	mov    rax,QWORD PTR [rsp+0x20]
>  923:	48 85 c0             	test   rax,rax
> 
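
For readers following along, the wrapper being discussed can be sketched roughly as below. This is an assumption-laden illustration, not the patch itself: sketch_ptr_cas and SKETCH_HAVE_ATOMIC_BUILTINS are hypothetical names. It shows why returning oldval reproduces the __sync_val_compare_and_swap() return value, and also where the extra store seen in the disassembly comes from.

```c
/* Hypothetical feature test: the __atomic built-ins exist since GCC 4.7.0. */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
#define SKETCH_HAVE_ATOMIC_BUILTINS 1
#else
#define SKETCH_HAVE_ATOMIC_BUILTINS 0
#endif

/* If *ptr == oldval, store newval in *ptr.
 * Always returns the value *ptr held before the call. */
static inline void *sketch_ptr_cas(void * volatile *ptr,
                                   void *oldval, void *newval)
{
#if SKETCH_HAVE_ATOMIC_BUILTINS
    /* On success, oldval is left alone and already equals the previous *ptr.
     * On failure, the built-in writes the current *ptr into oldval.
     * Either way oldval ends up holding the pre-call contents of *ptr.
     * Taking &oldval is what forces the value through memory. */
    __atomic_compare_exchange_n(ptr, &oldval, newval, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return oldval;
#else
    return __sync_val_compare_and_swap(ptr, oldval, newval);
#endif
}
```

A caller that only needs to know whether the swap happened could use the boolean result of __atomic_compare_exchange_n() directly and ignore oldval, which might let the compiler drop the extra load. That would change the wrapper's interface, though, so it is a separate question from the one asked here.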

> So the question is, do we keep using __sync_val_compare_and_swap as long as
> gcc offers it (which is probably a very long time), or immediately switch to
> __atomic_compare_exchange_n if available?

I'd say we should favor whatever is faster.


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 2
"100% positive feedback" - "All either got their money back or didnt complain"
"Best seller ever, very honest" - "Seller refunded buyer after failed scam"