[FFmpeg-devel] [PATCH] Stop using _explicit atomic operations where not necessary.

James Almer jamrial at gmail.com
Sun Jun 6 15:17:07 EEST 2021


On 6/6/2021 8:39 AM, Andreas Rheinhardt wrote:
> James Almer:
>> On 6/5/2021 11:27 AM, Anton Khirnov wrote:
>>> Memory ordering constraints other than the default (sequentially
>>> consistent) can behave in very unintuitive and unexpected ways, and so
>>> should be avoided unless there is a strong (typically performance)
>>> reason for using them. This holds especially for memory_order_relaxed,
>>> which imposes no ordering constraints.
>>
>> Performance is important for FIFO, though. Could they use a laxer order
>> at least for the fetch_sub and fetch_add cases?
>>
>> Also, commit fed50c4304e, which made cpu_flags an atomic type, states
>> that ThreadSanitizer stopped reporting races on tests/cpu_init after
>> that change, so maybe memory order is not important for it, only atomicity.
> 
> This is true: There can by definition be no data race involving an
> atomic variable that is exclusively accessed via atomic operations: "The
> execution of a program contains a data race if it contains two
> conflicting actions in different threads, at least one of which is *not
> atomic*, and neither happens before the other." (C11, 5.1.2.4.25)
> 
> Therefore fed50c4304eecb352e29ce789cdb96ea84d6162f and
> a2a38b160620d91bc3f895dadc4501c589998b9c fixed data races (the latter
> also a race condition); using sequentially consistent ordering meanwhile
> brings no gain: It may provide synchronization between two threads
> calling e.g. av_get_cpu_flags(), but the only effect of this is that
> tsan might overlook buggy code that actually contains races. It notably
> does not really establish an ordering: If two threads call
> av_cpu_count() concurrently, then it is indetermined which of the two
> threads will call av_log (if it has not already been called), but with
> this patch it is certain that the second thread synchronizes with the
> first thread and sees everything that the first thread has done so far
> (which can mask bugs); and if we look at the test in
> libavutil/tests/cpu_init.c, then all the possibilities that existed with
> memory_order_relaxed also exist with memory_order_seq_cst:
>   - Thread 1 can read first, see that cpu_flags is uninitialized and
> initialize it, followed by thread 2 seeing and reusing the already
> initialized value.
>   - Thread 1 can read first, followed by thread 2 reading the value (with
> both threads seeing that cpu_flags is uninitialized), followed by thread
> 1 storing its value, followed by thread 2 overwriting the value again;
> the order of writes can also be the other way around.
>   - Same as the last one, but with the order of stores reversed.
>   - Exchanging the roles of thread 1 and thread 2 in the above scenarios
> leads to the remaining cases.
> 
>> In any case these cpu and mem functions are hardly a bottleneck
>> anywhere, and rarely called outside of debugging, so this change is ok.
>>
> Mem functions are rarely called outside of debugging?

I was specifically thinking about about av_max_alloc(), but overlooked 
the fact max_alloc_size is read by av_malloc() an co, which are 
obviously not for debug only. Sorry.


More information about the ffmpeg-devel mailing list