[FFmpeg-devel] [PATCH] Don't needlessly reinitialize ff_cos_## tables.

Michael Niedermayer michael at niedermayer.cc
Sun Oct 25 00:45:00 CEST 2015


On Sat, Oct 24, 2015 at 03:10:58PM +0200, wm4 wrote:
> On Sat, 24 Oct 2015 04:13:18 +0200
> Michael Niedermayer <michael at niedermayer.cc> wrote:
> 
> > On Thu, Oct 22, 2015 at 09:48:30PM -0700, Dale Curtis wrote:
> > > On Wed, Oct 21, 2015 at 6:52 PM, Dale Curtis <dalecurtis at chromium.org>
> > > wrote:
> > >   
> > > > On Tue, Oct 20, 2015 at 11:50 PM, Michael Niedermayer <  
> > > > michael at niedermayer.cc> wrote:
> > > >>
> > > >> the last element to be written should be checked, so that if
> > > >> initialization is done by 2 threads at the same time, neither can
> > > >> return from this function without initialization having finished
> > > >>
> > > >> also the race detectors are broken if they complain about cases where
> > > >> a variable that has value a is set to value a, that cannot be part of
> > > >> a race, not even if a is written byte per byte instead of atomically
> > > >> unless theres something iam missing
> > > >> Is this something that can be fixed or disabled on the side of the
> > > >> race detectors?
> > > >> It might reduce false positives in FFmpeg and potentially other
> > > >> tools.
> > > >>  
> > > >
> > > > We can suppress it, which I think is more reasonable then the overhead
> > > > it'd take to make this "race" go away. I notice the sin tables are
> > > > initialized within the fft context so there's no "race." Is there a reason
> > > > the cosine tables aren't done this way?
> > > >  
> > > 
> > > Actually it looks like sin may suffer from the same issues -- it doesn't
> > > make a copy like I was thinking it did. One of the TSAN folk detailed why
> > > suppression isn't a good idea here  
> > 
> > > https://codereview.chromium.org/1412123007/#msg7 -- which roughly says we
> > > can't rely on the compiler to do the right thing here.  
> > 
> > quite independant of the sin/cos discussion here, This statement is
> > IMHO somewhat problematic
> > It may be strictly true in a language lawyer sense but
> > 
> > Lets assume that compilers did add random writes into writable arrays
> > as long as they restore it before the next function call or return.
> > like it seems assumed in the linked discussion
> > 
> > Frame multithreading has one thread decoding a frame and the next
> > using that frame after it has been written, the individual accesses
> > are not protected by mutexes nor could they, it would defeat the
> > idea of parallel decoding ...
> > Now if the compiler could randomly add writes that exist nowhere in
> > the C code as long as the thread itself doesnt access it at that time,
> > then this would break.
> 
> All what matters is if there are memory barriers between the accesses.
> So for example if one thread is done with decoding a frame, and is
> handed off to other threads as reference frame, then the
> synchronization that takes care of this "handing off" will perform the
> required memory barrier. Or am I getting this wrong?

in practice i belive its all fine, this is theoretical but
I think there are multiple issues
one would be simply that thread 1 which writes frame 1 might never
access it again before deallocation thus the compiler could optimize
the whole decode out, free the frame earlier or use the memory for
something else, i dont think this is a real problem but iam not aware
of anything in the C spec that would disallow it.
This is similar to the linked text where tab[] was used by a compiler
to hold a intermediate value which in no way or form was written
into the array by the C code
That is the compiler in that example added a write with a value
where there was no write with such value in the C code once the
compiler proofed that the function itself in its current context
doesnt need the arrays value at that point
in the frame multithreading, the frames too can be similarly
unused by the thread itself so by the same chain of reasoning the
compiler too could add writes as it likes.
I doubt this is a real issue in either case but i dont know

The "proper solution" would be to make the frame volatile but that
cannot be done as it would de-optimize the code

and iam quite sure as this is a complex subject there are also details
that iam missing

memory barries are needed too, and should be done internally by the
pthread calls used to synchronize the progress state

either way, this subject is rather uninterresting ill try to do
something more usefull for FFmpeg than arguing about these 
hypothetical cases ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I am the wisest man alive, for I know one thing, and that is that I know
nothing. -- Socrates
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151025/db53d068/attachment.sig>


More information about the ffmpeg-devel mailing list