[FFmpeg-devel] [PATCH] Detect and check for CMOV.
Michael Niedermayer
michaelni at gmx.at
Sat Feb 11 21:15:11 CET 2012
On Sat, Feb 11, 2012 at 04:07:10PM +0100, Reimar Döffinger wrote:
> Some MMX-only CPUs do not have support for CMOV.
> All SSE/MMX2 CPUs should be fine, thus no check was
> added to those functions.
> See also https://sourceforge.net/tracker/?func=detail&aid=3358347&group_id=205275&atid=992986
>
> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> ---
> libavcodec/x86/h264_intrapred_init.c | 3 ++-
> libavcodec/x86/h264dsp_mmx.c | 3 ++-
> libavutil/cpu.h | 1 +
> libavutil/x86/cpu.c | 2 ++
> 4 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/x86/h264_intrapred_init.c b/libavcodec/x86/h264_intrapred_init.c
> index 540ec87..58740e2 100644
> --- a/libavcodec/x86/h264_intrapred_init.c
> +++ b/libavcodec/x86/h264_intrapred_init.c
> @@ -188,7 +188,8 @@ void ff_h264_pred_init_x86(H264PredContext *h, int codec_id, const int bit_depth
> if (chroma_format_idc == 1)
> h->pred8x8 [PLANE_PRED8x8] = ff_pred8x8_plane_mmx;
> if (codec_id == CODEC_ID_SVQ3) {
> - h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_svq3_mmx;
> + if (mm_flags & AV_CPU_FLAG_CMOV)
> + h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_svq3_mmx;
> } else if (codec_id == CODEC_ID_RV40) {
> h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_rv40_mmx;
> } else {
> diff --git a/libavcodec/x86/h264dsp_mmx.c b/libavcodec/x86/h264dsp_mmx.c
> index b337462..063e3de 100644
> --- a/libavcodec/x86/h264dsp_mmx.c
> +++ b/libavcodec/x86/h264dsp_mmx.c
> @@ -361,7 +361,8 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, const int chrom
> if (chroma_format_idc == 1)
> c->h264_idct_add8 = ff_h264_idct_add8_8_mmx;
> c->h264_idct_add16intra = ff_h264_idct_add16intra_8_mmx;
> - c->h264_luma_dc_dequant_idct= ff_h264_luma_dc_dequant_idct_mmx;
> + if (mm_flags & AV_CPU_FLAG_CMOV)
> + c->h264_luma_dc_dequant_idct= ff_h264_luma_dc_dequant_idct_mmx;
>
> if (mm_flags & AV_CPU_FLAG_MMX2) {
> c->h264_idct_dc_add = ff_h264_idct_dc_add_8_mmx2;
> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> index 5f7eed2..564d76f 100644
> --- a/libavutil/cpu.h
> +++ b/libavutil/cpu.h
> @@ -38,6 +38,7 @@
> #define AV_CPU_FLAG_SSE4 0x0100 ///< Penryn SSE4.1 functions
> #define AV_CPU_FLAG_SSE42 0x0200 ///< Nehalem SSE4.2 functions
> #define AV_CPU_FLAG_AVX 0x4000 ///< AVX functions: requires OS support even if YMM registers aren't used
> +#define AV_CPU_FLAG_CMOV 0x10000 ///< supports cmov instruction
> #define AV_CPU_FLAG_XOP 0x0400 ///< Bulldozer XOP functions
> #define AV_CPU_FLAG_FMA4 0x0800 ///< Bulldozer FMA4 functions
> #define AV_CPU_FLAG_IWMMXT 0x0100 ///< XScale IWMMXT
Please use a value more distant from the existing ones, so that the
chance of ABI conflicts is reduced.
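
As a rough illustration of the point about the flag value, the sketch below checks whether a candidate flag is a single bit that does not overlap the x86 flags visible in the quoted cpu.h hunk. The helper flag_is_free() and the table x86_flags_in_hunk are hypothetical names made up for this example; they are not part of libavutil, and the table only covers the values quoted above, not the full header.

```c
#include <stddef.h>
#include <stdint.h>

/* x86 flag values copied from the quoted cpu.h hunk (not the full header). */
#define AV_CPU_FLAG_SSE4   0x0100
#define AV_CPU_FLAG_SSE42  0x0200
#define AV_CPU_FLAG_XOP    0x0400
#define AV_CPU_FLAG_FMA4   0x0800
#define AV_CPU_FLAG_AVX    0x4000

/* Hypothetical table for this sketch only. */
static const uint32_t x86_flags_in_hunk[] = {
    AV_CPU_FLAG_SSE4, AV_CPU_FLAG_SSE42, AV_CPU_FLAG_XOP,
    AV_CPU_FLAG_FMA4, AV_CPU_FLAG_AVX,
};

/* Hypothetical helper: returns 1 if candidate is exactly one bit and
 * that bit is not set in any of the n flags in used[], else 0. */
static int flag_is_free(uint32_t candidate, const uint32_t *used, size_t n)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < n; i++)
        mask |= used[i];                      /* union of all taken bits */
    if (candidate == 0 || (candidate & (candidate - 1)) != 0)
        return 0;                             /* not a single-bit value  */
    return (candidate & mask) == 0;           /* no overlap with taken   */
}
```

Passing this check only shows that a value is unused among the flags listed here. It cannot show that another tree will not claim the same bit later, which is presumably why a value further away from the current ones is requested.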
The rest of the patch LGTM if you volunteer to maintain it.
(Maintaining here probably means reverting it in case someone changes
the asm so that it works without cmov.)
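
For readers without the full patch at hand, here is a minimal sketch of the gating pattern it uses: keep the default function pointer unless the CPU reports both MMX and CMOV. The libavutil/x86/cpu.c hunk is not quoted in this mail, so the detection part is an assumption based on the x86 CPUID convention (leaf 1, EDX bit 15 for CMOV and bit 23 for MMX). The names flags_from_cpuid_edx(), select_idct(), idct_c() and idct_mmx_cmov() are hypothetical stand-ins, not FFmpeg functions.

```c
#include <stdint.h>

#define AV_CPU_FLAG_MMX   0x0001   /* value as in libavutil/cpu.h        */
#define AV_CPU_FLAG_CMOV  0x10000  /* value proposed in the quoted patch */

/* CPUID leaf 1 feature bits in EDX. */
#define CPUID_EDX_CMOV (1u << 15)
#define CPUID_EDX_MMX  (1u << 23)

/* Hypothetical: map a raw CPUID.1:EDX value to AV_CPU_FLAG_* bits. */
static int flags_from_cpuid_edx(uint32_t edx)
{
    int flags = 0;
    if (edx & CPUID_EDX_MMX)
        flags |= AV_CPU_FLAG_MMX;
    if (edx & CPUID_EDX_CMOV)
        flags |= AV_CPU_FLAG_CMOV;
    return flags;
}

typedef void (*idct_fn)(int16_t *block);

/* Empty stand-ins for a C fallback and an MMX routine that uses cmov. */
static void idct_c(int16_t *block)        { (void)block; }
static void idct_mmx_cmov(int16_t *block) { (void)block; }

/* Install the asm version only when both MMX and CMOV are reported;
 * otherwise the C fallback stays in place. */
static idct_fn select_idct(int mm_flags)
{
    idct_fn fn = idct_c;
    if ((mm_flags & AV_CPU_FLAG_MMX) && (mm_flags & AV_CPU_FLAG_CMOV))
        fn = idct_mmx_cmov;
    return fn;
}
```

If the asm were later rewritten to avoid cmov, the extra condition in select_idct() would become unnecessary, which matches the "revert" scenario described above.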
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data