[FFmpeg-devel] [PATCH] Detect and check for CMOV.

Reimar Döffinger Reimar.Doeffinger at gmx.de
Sun Feb 12 19:07:45 CET 2012


On Sat, Feb 11, 2012 at 09:15:11PM +0100, Michael Niedermayer wrote:
> On Sat, Feb 11, 2012 at 04:07:10PM +0100, Reimar Döffinger wrote:
> > Some MMX-only CPUs do not have support for CMOV.
> > All SSE/MMX2 CPUs should be fine, thus no check was
> > added to those functions.
> > See also https://sourceforge.net/tracker/?func=detail&aid=3358347&group_id=205275&atid=992986
> > 
> > Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> > ---
> >  libavcodec/x86/h264_intrapred_init.c |    3 ++-
> >  libavcodec/x86/h264dsp_mmx.c         |    3 ++-
> >  libavutil/cpu.h                      |    1 +
> >  libavutil/x86/cpu.c                  |    2 ++
> >  4 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/libavcodec/x86/h264_intrapred_init.c b/libavcodec/x86/h264_intrapred_init.c
> > index 540ec87..58740e2 100644
> > --- a/libavcodec/x86/h264_intrapred_init.c
> > +++ b/libavcodec/x86/h264_intrapred_init.c
> > @@ -188,7 +188,8 @@ void ff_h264_pred_init_x86(H264PredContext *h, int codec_id, const int bit_depth
> >                  if (chroma_format_idc == 1)
> >                      h->pred8x8  [PLANE_PRED8x8] = ff_pred8x8_plane_mmx;
> >                  if (codec_id == CODEC_ID_SVQ3) {
> > -                    h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_svq3_mmx;
> > +                    if (mm_flags & AV_CPU_FLAG_CMOV)
> > +                        h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_svq3_mmx;
> >                  } else if (codec_id == CODEC_ID_RV40) {
> >                      h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_rv40_mmx;
> >                  } else {
> > diff --git a/libavcodec/x86/h264dsp_mmx.c b/libavcodec/x86/h264dsp_mmx.c
> > index b337462..063e3de 100644
> > --- a/libavcodec/x86/h264dsp_mmx.c
> > +++ b/libavcodec/x86/h264dsp_mmx.c
> > @@ -361,7 +361,8 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, const int chrom
> >          if (chroma_format_idc == 1)
> >              c->h264_idct_add8       = ff_h264_idct_add8_8_mmx;
> >          c->h264_idct_add16intra     = ff_h264_idct_add16intra_8_mmx;
> > -        c->h264_luma_dc_dequant_idct= ff_h264_luma_dc_dequant_idct_mmx;
> > +        if (mm_flags & AV_CPU_FLAG_CMOV)
> > +            c->h264_luma_dc_dequant_idct= ff_h264_luma_dc_dequant_idct_mmx;
> >  
> >          if (mm_flags & AV_CPU_FLAG_MMX2) {
> >              c->h264_idct_dc_add    = ff_h264_idct_dc_add_8_mmx2;
> > diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> > index 5f7eed2..564d76f 100644
> > --- a/libavutil/cpu.h
> > +++ b/libavutil/cpu.h
> > @@ -38,6 +38,7 @@
> >  #define AV_CPU_FLAG_SSE4         0x0100 ///< Penryn SSE4.1 functions
> >  #define AV_CPU_FLAG_SSE42        0x0200 ///< Nehalem SSE4.2 functions
> >  #define AV_CPU_FLAG_AVX          0x4000 ///< AVX functions: requires OS support even if YMM registers aren't used
> > +#define AV_CPU_FLAG_CMOV        0x10000 ///< supports cmov instruction
> >  #define AV_CPU_FLAG_XOP          0x0400 ///< Bulldozer XOP functions
> >  #define AV_CPU_FLAG_FMA4         0x0800 ///< Bulldozer FMA4 functions
> >  #define AV_CPU_FLAG_IWMMXT       0x0100 ///< XScale IWMMXT
> 
> please use a value more distant from the existing so chances of ABI
> conflicts are decreased
> 
> rest of the patch LGTM if you volunteer to maintain it.
> (maintain here probably means revert in case someone changes the asm
>  so it works without cmov)

I'll try to keep an eye out for such changes.
Though I doubt that many people will care to optimize a function
that is only rarely used in H.264 just for the original
Pentium MMX and some AMD K6 CPUs...
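
For anyone following along without the libavutil/x86/cpu.c hunk in front of
them, here is a rough, standalone sketch of the idea, not the code from the
patch. It assumes a GCC/clang-style <cpuid.h> and relies on CPUID leaf 1
reporting CMOV in bit 15 of EDX. The names SKETCH_CPU_FLAG_CMOV,
edx_has_cmov, sketch_get_cpu_flags, select_dequant and the two dummy
dequant functions are made up for illustration only.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>            /* GCC/clang helper header, x86 only */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Hypothetical flag value, used by this sketch only. */
#define SKETCH_CPU_FLAG_CMOV 0x10000

/* CPUID leaf 1 reports CMOV support in bit 15 of EDX. */
static int edx_has_cmov(uint32_t edx)
{
    return (edx >> 15) & 1;
}

/* Returns SKETCH_CPU_FLAG_CMOV if the host CPU reports CMOV, else 0.
 * On non-x86 builds (or without <cpuid.h>) it always returns 0. */
static int sketch_get_cpu_flags(void)
{
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid() returns 0 when the requested leaf is unsupported. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && edx_has_cmov(edx))
        return SKETCH_CPU_FLAG_CMOV;
#endif
    return 0;
}

/* Dummy stand-ins for a C fallback and an MMX version that uses cmov. */
typedef void (*dequant_fn)(void);
static void dequant_c(void) {}
static void dequant_mmx_cmov(void) {}

/* Only hand out the cmov-using version when the flag is set,
 * otherwise keep the plain C fallback. */
static dequant_fn select_dequant(int cpu_flags)
{
    return (cpu_flags & SKETCH_CPU_FLAG_CMOV) ? dequant_mmx_cmov
                                              : dequant_c;
}
```

A caller would do something like select_dequant(sketch_get_cpu_flags())
once at init time. The point is just that the MMX pointer is left
unassigned on CPUs without CMOV, so the existing C version stays in place.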

