[FFmpeg-devel] Patch: Inline asm fixes for Intel compiler on Windows
Michael Niedermayer
michaelni at gmx.at
Sun Jan 5 06:36:56 CET 2014
On Sun, Jan 05, 2014 at 01:27:26PM +1100, Matt Oliver wrote:
> On 30 December 2013 13:06, Matt Oliver <protogonoi at gmail.com> wrote:
>
> > On 30 December 2013 02:26, Michael Niedermayer <michaelni at gmx.at> wrote:
> >
> >> On Sun, Dec 29, 2013 at 02:19:40PM +0100, Reimar Döffinger wrote:
> >> > On Sun, Dec 29, 2013 at 02:48:52PM +1100, Matt Oliver wrote:
> >> > > is to much to ask. The only other option I can think of is to have
> >> > > different versions. i.e. for those few lines with direct symbol
> >> references
> >> > > have an #ifdef that only changes the code for Intel compiler on
> >> Windows.
> >> > > Windows shared libraries dont need PIC so this shouldn't be a problem.
> >> > > Although the code may become a bit ugly.
> >> >
> >> > One thing I thought of: You could try using named asm constraints, and
> >> > only change the MANGLE to convert to a reference to a named asm
> >> > argument.
> >> > If you are lucky in quite a few cases you'd then only need an ifdef
> >> > around the extra asm constraints.
> >>
> >> that should be possible without ifdef, see XMM_CLOBBERS_ONLY() for
> >> something similar
> >>
> >> [...]
> >> --
> >> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >>
> >> You can kill me, but you cannot change the truth.
> >>
> >
> > Using named asm constraints is a good idea as it seems that Intel compiler
> > supports those. So I will update the patch so that MANGLE remains in use
> > but it will just change to a named asm constraint on Intel+Windows. Combine
> > that with a macro such as XMM_CLOBBERS_ONLY in order to add the necessary
> > asm interfaces and that should work fine.
> >
> > That one thing I did notice though is that if the use of "m" constraints
> > with PIC is an issue then there are several locations in existing code that
> > dont seem to follow that. For instance in libavcodec/x86/cavsdsp.c the
> > macro QPEL_CAVSV1 uses a asm interface for MUL1 (which is variable ff_pw_96
> > and is assigned to %5). However MUL2 (which is variable ff_pw_42) is used
> > as a direct symbol through MANGLE. Since both these variables are defined
> > in the same place and are of the same type (uint64_t) then it appears there
> > are some inconsistencies in existing FFmpeg code. Ill leave all existing
> > instances of MANGLE but i thought id just point this out.
> >
>
> OK so I have redone the patch using named asm constraints and a macro to
> add them to the asm-interface (similar to XMM_CLOBBERS_ONLY as discussed).
>
> There are only 2 locations that are slightly different and these are in
> motion_est.c and vf_fspp.c.
> The inline asm in Intel compiler is rather
> unstable and changing some compilation flags suddenly result in errors from
> what was previously perfectly functional asm. To get compilation to work
hmm
> using standard release options 2 variables were changed from direct symbol
> references to an asm-interface. This should not be a problem as these
> symbols are defined in asm-interfaces already (for instance in vf_fspp.c
> temps[0] is already in an interface and all I did was add temps[1]). So the
> patch does not change anything that didnt already exist.
>
> To ensure everything stayed the same I compiled using gcc 4.8.2 and checked
> the generated disassembly for the part affected. For all changes the
> resulting code is exactly identical so this does not change anything (which
> for the most part it shouldn't anyway as the macros only change when using
> Intel on Windows). So no surprises there but I thought i would be thorough
> just to be sure. So obviously all FATE tests passed.
>
> Let me know how this one looks.
> libavcodec/x86/cabac.h | 3 +
> libavcodec/x86/cavsdsp.c | 2 +
> libavcodec/x86/constants.h | 2 +
> libavcodec/x86/dsputil_mmx.c | 1
> libavcodec/x86/h264_i386.h | 3 +
> libavcodec/x86/idct_sse2_xvid.c | 18 +++++-----
> libavcodec/x86/lpc.c | 3 +
> libavcodec/x86/mathops.h | 2 -
> libavcodec/x86/mlpdsp.c | 5 +-
> libavcodec/x86/motion_est.c | 4 +-
> libavcodec/x86/mpegvideoenc_template.c | 4 +-
> libavcodec/x86/simple_idct.c | 1
> libavcodec/x86/vc1dsp_mmx.c | 6 +++
> libavfilter/libmpcodecs/vf_fspp.c | 58 ++++++++++++++++++---------------
> libavfilter/libmpcodecs/vf_ilpack.c | 8 ++--
> libavfilter/libmpcodecs/vf_pp7.c | 2 -
> libavfilter/libmpcodecs/vf_uspp.c | 2 -
> libavfilter/vf_drawtext.c | 4 +-
> libavutil/mathematics.h | 3 +
> libavutil/x86/asm.h | 43 +++++++++++++++++++++++-
> libpostproc/postprocess_template.c | 7 +++
> libswresample/x86/resample_mmx.h | 2 +
> libswscale/x86/rgb2rgb.c | 8 ++++
> libswscale/x86/rgb2rgb_template.c | 11 ++++++
> libswscale/x86/swscale_template.c | 27 +++++++++++++++
> libswscale/x86/yuv2rgb.c | 18 +++++-----
> libswscale/x86/yuv2rgb_template.c | 21 ++++++++++-
> 27 files changed, 205 insertions(+), 63 deletions(-)
> d26137cd7e89161d06bef0332b959df68afaadbb Fixed-inline-asm-compilation-with-Intel-compiler.patch
> From 6eea177a4761629df5d6f02359fccc4efb116aed Mon Sep 17 00:00:00 2001
> From: Matt Oliver <protogonoi at gmail.com>
> Date: Sun, 5 Jan 2014 13:09:25 +1100
> Subject: [PATCH] Fixed inline asm compilation with Intel compiler,
>
> ---
> libavcodec/x86/cabac.h | 3 +-
> libavcodec/x86/cavsdsp.c | 2 ++
> libavcodec/x86/constants.h | 2 ++
> libavcodec/x86/dsputil_mmx.c | 1 +
> libavcodec/x86/h264_i386.h | 3 +-
> libavcodec/x86/idct_sse2_xvid.c | 18 +++++------
> libavcodec/x86/lpc.c | 3 ++
> libavcodec/x86/mathops.h | 2 +-
> libavcodec/x86/mlpdsp.c | 5 +--
> libavcodec/x86/motion_est.c | 4 ++-
> libavcodec/x86/mpegvideoenc_template.c | 4 +--
> libavcodec/x86/simple_idct.c | 1 +
> libavcodec/x86/vc1dsp_mmx.c | 6 ++++
> libavfilter/libmpcodecs/vf_fspp.c | 58 +++++++++++++++++++---------------
> libavfilter/libmpcodecs/vf_ilpack.c | 8 ++---
> libavfilter/libmpcodecs/vf_pp7.c | 2 +-
> libavfilter/libmpcodecs/vf_uspp.c | 2 +-
> libavfilter/vf_drawtext.c | 4 ++-
> libavutil/mathematics.h | 3 ++
> libavutil/x86/asm.h | 43 ++++++++++++++++++++++++-
> libpostproc/postprocess_template.c | 7 ++++
> libswresample/x86/resample_mmx.h | 2 ++
> libswscale/x86/rgb2rgb.c | 8 +++++
> libswscale/x86/rgb2rgb_template.c | 11 +++++++
> libswscale/x86/swscale_template.c | 27 ++++++++++++++++
> libswscale/x86/yuv2rgb.c | 18 +++++------
> libswscale/x86/yuv2rgb_template.c | 21 ++++++++++--
> 27 files changed, 205 insertions(+), 63 deletions(-)
>
> diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
> index 558d287..fe1eb32 100644
> --- a/libavcodec/x86/cabac.h
> +++ b/libavcodec/x86/cabac.h
> @@ -34,7 +34,8 @@
> # define BROKEN_COMPILER 0
> #endif
>
> -#if HAVE_INLINE_ASM
> +#if HAVE_INLINE_ASM && !( defined(__INTEL_COMPILER) && defined(_MSC_VER) )
that condition occurs alot maybe something like
HAVE_INTEL_GAS or .._ASSEMBLER or something
could be used to simplify these and also test for the actual thing
that causes the problems assuming that is what causes the problem
[...]
> @@ -266,7 +267,7 @@ static inline void sad8_4_mmx(uint8_t *blk1, uint8_t *blk2, int stride, int h)
> "punpckhbw %%mm7, %%mm5 \n\t"
> "paddw %%mm4, %%mm2 \n\t"
> "paddw %%mm5, %%mm3 \n\t"
> - "movq 16+"MANGLE(round_tab)", %%mm5 \n\t"
> + "movq %5, %%mm5 \n\t"
> "paddw %%mm2, %%mm0 \n\t"
> "paddw %%mm3, %%mm1 \n\t"
> "paddw %%mm5, %%mm0 \n\t"
> @@ -290,6 +291,7 @@ static inline void sad8_4_mmx(uint8_t *blk1, uint8_t *blk2, int stride, int h)
> " js 1b \n\t"
> : "+a" (len)
> : "r" (blk1 - len), "r" (blk1 -len + stride), "r" (blk2 - len), "r" ((x86_reg)stride)
> + , "m"(round_tab[2])
a int64 instead of a array could be used
[...]
> diff --git a/libavfilter/libmpcodecs/vf_fspp.c b/libavfilter/libmpcodecs/vf_fspp.c
> index a8a33e2..290057f 100644
> --- a/libavfilter/libmpcodecs/vf_fspp.c
> +++ b/libavfilter/libmpcodecs/vf_fspp.c
changes to libmpcodecs should be sent to mplayer-dev
[...]
> index e4315ef..226b415 100644
> --- a/libswscale/x86/yuv2rgb.c
> +++ b/libswscale/x86/yuv2rgb.c
> @@ -48,15 +48,6 @@ DECLARE_ASM_CONST(8, uint64_t, pb_e0) = 0xe0e0e0e0e0e0e0e0ULL;
> DECLARE_ASM_CONST(8, uint64_t, pb_03) = 0x0303030303030303ULL;
> DECLARE_ASM_CONST(8, uint64_t, pb_07) = 0x0707070707070707ULL;
>
> -//MMX versions
> -#if HAVE_MMX_INLINE
> -#undef RENAME
> -#undef COMPILE_TEMPLATE_MMXEXT
> -#define COMPILE_TEMPLATE_MMXEXT 0
> -#define RENAME(a) a ## _mmx
> -#include "yuv2rgb_template.c"
> -#endif /* HAVE_MMX_INLINE */
> -
> // MMXEXT versions
> #if HAVE_MMXEXT_INLINE
> #undef RENAME
> @@ -66,6 +57,15 @@ DECLARE_ASM_CONST(8, uint64_t, pb_07) = 0x0707070707070707ULL;
> #include "yuv2rgb_template.c"
> #endif /* HAVE_MMXEXT_INLINE */
>
> +//MMX versions
> +#if HAVE_MMX_INLINE
> +#undef RENAME
> +#undef COMPILE_TEMPLATE_MMXEXT
> +#define COMPILE_TEMPLATE_MMXEXT 0
> +#define RENAME(a) a ## _mmx
> +#include "yuv2rgb_template.c"
> +#endif /* HAVE_MMX_INLINE */
> +
> #endif /* HAVE_INLINE_ASM */
>
> av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)
why ?
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
It is what and why we do it that matters, not just one of them.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20140105/29907358/attachment.asc>
More information about the ffmpeg-devel
mailing list