[FFmpeg-devel] [PATCH 09/13] avcodec/svq1dec: clear MMX state after MB decode loop

Ronald S. Bultje rsbultje at gmail.com
Sun Oct 23 20:02:01 EEST 2016


On Sat, Oct 22, 2016 at 11:36 PM, Michael Niedermayer <
michael at niedermayer.cc> wrote:

> On Sat, Oct 22, 2016 at 10:10:01PM -0400, Ronald S. Bultje wrote:
> > Hi,
> >
> > general comment about all other dec patches.
> >
> > On Sat, Oct 22, 2016 at 3:02 PM, Michael Niedermayer
> <michael at niedermayer.cc
> > > wrote:
> >
> > > Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> > > ---
> > >  libavcodec/svq1dec.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/libavcodec/svq1dec.c b/libavcodec/svq1dec.c
> > > index 2b72e08..0fe222e 100644
> > > --- a/libavcodec/svq1dec.c
> > > +++ b/libavcodec/svq1dec.c
> > > @@ -744,6 +744,7 @@ static int svq1_decode_frame(AVCodecContext
> *avctx,
> > > void *data,
> > >              }
> > >          }
> > >      }
> > > +    emms_c();
> >
> >
> > This is hideous, you're sprinkling emms_c in various places to make some
> > stupid test pass. The test is morbidly stupid and there is no general
> > consensus on patterns to be followed as for where to place emms_c.
> Someone
> > who doesn't know any better will litter each new decoder with 10-20 calls
> > to emms_c just because he found that other decoders do it in
> undocumented,
> > unexplained and unclear locations also.
> >
> > If you want this to be a "thing", you need to design and document
> carefully
> > where emms_c is necessary. Then come up with some system that makes this
> > work by itself.
> Your mail sounds quite aggressive to me, did i say something to anger
> you ? It was certainly not intended
> About this patchset
> FFmpeg is broken ATM (both in theory and practice with some libcs),
> the way MMX code is used is not correct, emms_c()
> calls are missing and required. The obvious way to fix that
> (in practice) is adding emms_c() calls where they are missing.
> I can document why each call is needed, if thats wanted?

Your representation of facts is strange, to say the least. Let's explore
two related claims:

- (A) ffmpeg is broken in practice when calling musl malloc/free functions
after calling MMX SIMD functions on x86-32 (which crashes).
- (B) ffmpeg is broken in theory because we don't clear FPU state (as
required) at the end of MMX SIMD functions.

Which are you trying to fix? You make it sound like you're fixing (B) (you
use the plural "libcs" and often use the word "required" or "standard"),
but your patchset only addresses a small subset of (A). For example, you're
not clearing FPU state at the end of a SIMD function if it's followed by a
SIMD function that calls memcpy in the C implementation.

But more importantly, in doing so (regardless whether you're trying to fix
"A" or "B"), this patchset:
- has no design (or at least, I can't find it);
- it litters the code with unstructured calls to emms_c(), which seem
specifically placed to fix make fate with an assert added in av_malloc/free;
- it adds unnecessary (to fix "A") calls to emms_c() on x86-64;
- it doesn't address pretty much anything in "B";
- it does not address any scenario not tested by make fate (e.g. external
lib encoders/decoders/filters).

I'm mostly asking about design here. "Litter around emms_c() calls until
make fate passes with an assert added" is not a good design, as much as it
may decrease the number of av_assert2_fpu()s triggered by "make fate". Do
you think we can do better?

You're calling me aggressive, so to address that, let me try to be
substantive and give suggestions on how this could be done better. Please
note that this list is incomplete and is meant to start a discussion on
this subject, it's not like "if you address these specific issues, I'm OK
with the patchset". I'm happy to think more about this and give more
suggestions if you want to hear them. My suggestions:
- define whether we're trying to fix "A" or "B" from above, because it
affects the design. I'm assuming you're trying to fix A.
- #define emms_c() do { /* nothing */ } while (0) on x86-64;
- add a call to emms_c() in frame allocation routines (this should prevent
the emms_c() insertions in vp56.c and svq1dec.c);
- add a routine to signal that frame processing is complete (like the
INT_MAX check you're trying to add with frame-threading), and call that in
all decoders/encoders after block processing (which calls SIMD) is
complete, so we can context-switch (and emms_c()) regardless of the
presence of frame-mt or not. It also gets rid of the magic value INT_MAX;
- consider potential issues with external lib en/decoders which don't have
much of a presence in "make fate";
- think about libavfilter.

I'm fully aware that this is not a complete design that gets rid of calls
to emms_c() in all decoders. I'm also fully aware that some decoders
already call emms_c() because they signal when one line of blocks is
completed, which is a user-callback (which may need a cleared FPU state).
I'm not asking you to get rid of any of that, I'm just asking you to not
make it exponentially worse by designing this a little bit more carefully.

Thank you,

More information about the ffmpeg-devel mailing list