[FFmpeg-devel] [PATCH 09/13] avcodec/svq1dec: clear MMX state after MB decode loop

Michael Niedermayer michael at niedermayer.cc
Sun Oct 23 23:33:34 EEST 2016

On Sun, Oct 23, 2016 at 01:02:01PM -0400, Ronald S. Bultje wrote:
> Hi,
> On Sat, Oct 22, 2016 at 11:36 PM, Michael Niedermayer <
> michael at niedermayer.cc> wrote:
> > On Sat, Oct 22, 2016 at 10:10:01PM -0400, Ronald S. Bultje wrote:
> > > Hi,
> > >
> > > general comment about all other dec patches.
> > >
> > > On Sat, Oct 22, 2016 at 3:02 PM, Michael Niedermayer
> > <michael at niedermayer.cc
> > > > wrote:
> > >
> > > > Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> > > > ---
> > > >  libavcodec/svq1dec.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/libavcodec/svq1dec.c b/libavcodec/svq1dec.c
> > > > index 2b72e08..0fe222e 100644
> > > > --- a/libavcodec/svq1dec.c
> > > > +++ b/libavcodec/svq1dec.c
> > > > @@ -744,6 +744,7 @@ static int svq1_decode_frame(AVCodecContext
> > *avctx,
> > > > void *data,
> > > >              }
> > > >          }
> > > >      }
> > > > +    emms_c();
> > >
> > >
> > > This is hideous, you're sprinkling emms_c in various places to make some
> > > stupid test pass. The test is morbidly stupid and there is no general
> > > consensus on patterns to be followed as for where to place emms_c.
> > Someone
> > > who doesn't know any better will litter each new decoder with 10-20 calls
> > > to emms_c just because he found that other decoders do it in
> > undocumented,
> > > unexplained and unclear locations also.
> > >
> > > If you want this to be a "thing", you need to design and document
> > carefully
> > > where emms_c is necessary. Then come up with some system that makes this
> > > work by itself.
> >
> > Your mail sounds quite aggressive to me, did i say something to anger
> > you ? It was certainly not intended
> >
> > About this patchset
> > FFmpeg is broken ATM (both in theory and practice with some libcs),
> > the way MMX code is used is not correct, emms_c()
> > calls are missing and required. The obvious way to fix that
> > (in practice) is adding emms_c() calls where they are missing.
> > I can document why each call is needed, if thats wanted?
> Your representation of facts is strange, to say the least. Let's explore
> two related claims:
> - (A) ffmpeg is broken in practice when calling musl malloc/free functions
> after calling MMX SIMD functions on x86-32 (which crashes).
> - (B) ffmpeg is broken in theory because we don't clear FPU state (as
> required) at the end of MMX SIMD functions.
> Which are you trying to fix? You make it sound like you're fixing (B) (you
> use the plural "libcs" and often use the word "required" or "standard"),

All the crashes with musl are to the best of my knowledge FFmpeg
violating the C standard.
Iam interrested in fixing the cases where we violate the C standard,
with a priority toward the cases that cause practical issues.

musl really isnt involved in this, any C library can trigger these bugs
in FFmpeg hence plural.

> but your patchset only addresses a small subset of (A).

I just tried and fate fully passes on x86_32 with this patchset as
well as x86_64 both with the assert enabled
prior to this patchset there are 310 fate failures on x86_32.
The assert should check the case relevant for the musl case.
If something remains iam interrested in fixing it

> For example, you're
> not clearing FPU state at the end of a SIMD function if it's followed by a
> SIMD function that calls memcpy in the C implementation.

yes, my priority is at fixing the fpu state when calling memory
allocation functions from libc, because this class of issues is what
caused the crashes in practice

> But more importantly, in doing so (regardless whether you're trying to fix
> "A" or "B"), this patchset:
> - has no design (or at least, I can't find it);
> - it litters the code with unstructured calls to emms_c(), which seem
> specifically placed to fix make fate with an assert added in av_malloc/free;

Isnt this the same as with any memleak fix adding a *free(), littering
the code with said *free() like emms_c() litters the code when it is

> - it adds unnecessary (to fix "A") calls to emms_c() on x86-64;

The C standard mandates this, so it is not unnecessary. Also this
class of issues caused crashes with musl in practice.
It may be that some patch is not needed for fixing a practical issue
and only theoretical standard compliance but i doubt that also
even if why does this matter?
We want to fix this, i have certainly better to do than to test if
undefined behavor actually triggers a crash
we have never left undefined behavor intentionally in the codebase
just because we dont have a actual testcase that fails. (or at least
i hope we never did)
Instead quite the opposite we use heaps of ugly macros to do
unaligned and aliassing violating memory accesses to avoid undefined

> - it doesn't address pretty much anything in "B";

> - it does not address any scenario not tested by make fate (e.g. external
> lib encoders/decoders/filters).
> I'm mostly asking about design here. "Litter around emms_c() calls until
> make fate passes with an assert added" is not a good design, as much as it
> may decrease the number of av_assert2_fpu()s triggered by "make fate". Do
> you think we can do better?

I dont know if we can, but I know that I cannot

> You're calling me aggressive, so to address that, let me try to be
> substantive and give suggestions on how this could be done better. Please
> note that this list is incomplete and is meant to start a discussion on
> this subject, it's not like "if you address these specific issues, I'm OK
> with the patchset". I'm happy to think more about this and give more
> suggestions if you want to hear them. My suggestions:
> - define whether we're trying to fix "A" or "B" from above, because it
> affects the design. I'm assuming you're trying to fix A.

> - #define emms_c() do { /* nothing */ } while (0) on x86-64;

emms_c is needed on x86-64 too

> - add a call to emms_c() in frame allocation routines (this should prevent
> the emms_c() insertions in vp56.c and svq1dec.c);

I tried this before submiting the patchset but it didnt fix much and
it has potential performance implications as this allocation is used
by audio codecs too. (many small frames) Also for this to work you must
do the same to teh deallocation / unref functions too which still fixes
only around something around 10-15 or so fate failures out of the 70
that remained when i tried this.
And with checks and special cases for audio this certainly has nothing
to do with being a clean solution.

> - add a routine to signal that frame processing is complete (like the
> INT_MAX check you're trying to add with frame-threading), and call that in
> all decoders/encoders after block processing (which calls SIMD) is
> complete, so we can context-switch (and emms_c()) regardless of the
> presence of frame-mt or not. It also gets rid of the magic value INT_MAX;
> - consider potential issues with external lib en/decoders which don't have
> much of a presence in "make fate";

> - think about libavfilter.

libavfilter is tested in make fate the same as anything else


PS: i am happy to leave this to someone else if thats preferred.

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 1
"Used only once"    - "Some unspecified defect prevented a second use"
"In good condition" - "Can be repaird by experienced expert"
"As is" - "You wouldnt want it even if you were payed for it, if you knew ..."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20161023/d0cafcd1/attachment.sig>

More information about the ffmpeg-devel mailing list