[MPlayer-dev-eng] [PATCH] Make mp3lib SIMD optimizations work on AMD64, the Finale
Zuxy Meng
zuxy.meng at gmail.com
Wed Jun 6 11:11:59 CEST 2007
Hi,
2007/6/6, Guillaume Poirier <gpoirier at mplayerhq.hu>:
> Hi,
>
> Zuxy Meng wrote:
>
> >> The last patch will deal with Makefile and macros.
> >
> > The attached patch modifies macros and the Makefile, effectively
> > turning everything on for AMD64. The result is 47% faster decoding on
> > a K8.
>
> Are you sure you attached the right patch? Your patch breaks
> linking of MPlayer on a 64-bit Core 2:
>
> cc -o mplayer mplayer.o m_property.o mp_fifo.o mp_msg.o mixer.o
> parser-mpcmd.o subopt-helper.o command.o asxparser.o codec-cfg.o
> cpudetect.o edl.o find_sub.o m_config.o m_option.o m_struct.o
> mpcommon.o parser-cfg.o playtree.o playtreeparser.o spudec.o sub_cc.o
> subreader.o vobsub.o unrarlib.o libvo/libvo.a libao2/libao2.a
> input/libinput.a vidix/libvidix.a libmpcodecs/libmpcodecs.a
> libaf/libaf.a libmpdemux/libmpdemux.a stream/stream.a
> libswscale/libswscale.a libvo/libosd.a libavformat/libavformat.a
> libavcodec/libavcodec.a libavutil/libavutil.a
> libpostproc/libpostproc.a mp3lib/libmp3.a liba52/liba52.a
> libmpeg2/libmpeg2.a libfaad2/libfaad2.a tremor/libvorbisidec.a
> dvdread/libdvdread.a libdvdcss/libdvdcss.a libass/libass.a
> osdep/libosdep.a -lXext -lX11 -lpthread -lXv -lXinerama -lGL -ldl
> -lvga -lSDL -laudio -lXt -ldl -lartsc -lpthread -lgmodule-2.0 -ldl
> -lgthread-2.0 -lglib-2.0 -lesd -laudiofile -lm -ljack -L/usr/lib
> -L/usr/lib -L/usr/lib -Wl,-z,noexecstack -lncurses -lpng -lz -ljpeg
> -lasound -ldl -lpthread -lfreetype -lz -lfontconfig -lz -lmad -lspeex
> -lpthread -ldl -rdynamic -lm
> mp3lib/libmp3.a(sr1.o): In function `MP3_Init':
> sr1.c:(.text+0x1db6): undefined reference to `dct64_MMX'
> collect2: ld returned 1 exit status
>
>
> > Index: mp3lib/Makefile
> > ===================================================================
> > --- mp3lib/Makefile (revision 23483)
> > +++ mp3lib/Makefile (working copy)
> > @@ -3,18 +3,23 @@
> > LIBNAME_COMMON = libmp3.a
> >
> > SRCS_COMMON = sr1.c
> > +ifeq ($(TARGET_ARCH_X86),yes)
> > +SRCS_COMMON-$(TARGET_MMX) += decode_MMX.c
> > +SRCS_COMMON-$(TARGET_SSE) += dct64_sse.c
> > ifeq ($(TARGET_ARCH_X86_32),yes)
> > SRCS_COMMON += decode_i586.c
> > -SRCS_COMMON-$(TARGET_MMX) += decode_MMX.c dct64_MMX.c
> > +SRCS_COMMON-$(TARGET_MMX) += dct64_MMX.c
>
>
> As far as I can see, you need to move dct64_MMX.c into the
> TARGET_ARCH_X86 section; its code is used in 32-bit as well as 64-bit mode.
Oops...dct64_MMX shouldn't be referenced under AMD64. Corrected in the
attached patch.
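
To illustrate the kind of guard I mean, here is a minimal sketch. It is not the actual sr1.c code: the macro DEMO_ARCH_X86_32, the function names and the prototype are all hypothetical stand-ins. The point is only that a symbol built for 32-bit x86 alone must also be referenced for 32-bit x86 alone, otherwise the linker fails as shown above.

```c
#include <stddef.h>

/* Hypothetical prototype; the real dct64 routines in mp3lib may differ. */
typedef void (*dct64_fn)(short *out0, short *out1, const float *samples);

/* Portable fallback that is always compiled. The body is a placeholder. */
static void dct64_generic_demo(short *out0, short *out1, const float *samples)
{
    (void)out0; (void)out1; (void)samples;
}

#ifdef DEMO_ARCH_X86_32
/* Stand-in for a routine that is only built on 32-bit x86. */
static void dct64_mmx_demo(short *out0, short *out1, const float *samples)
{
    (void)out0; (void)out1; (void)samples;
}
#endif

/* Pick an implementation. The 32-bit-only symbol is named only inside
 * the same guard that controls whether it is compiled at all. */
static dct64_fn select_dct64_demo(int cpu_has_mmx)
{
#ifdef DEMO_ARCH_X86_32
    if (cpu_has_mmx)
        return dct64_mmx_demo;
#else
    (void)cpu_has_mmx;
#endif
    return dct64_generic_demo;
}
```

Built without DEMO_ARCH_X86_32 (the assumed 64-bit case), select_dct64_demo() returns the generic routine whatever the CPU flags say, and no 32-bit-only symbol is left for the linker to resolve.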
> Also, the speed-up isn't as large here, though I'm not too
> confident in the reliability of my numbers, since -benchmark seems to
> be broken, so I used -speed 100 instead.
>
> Before:
> 0m8.264s
>
> After:
> 0m7.715s
I used the test.c under mp3lib for benchmarking.
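
For reference, the two timings quoted above can be turned into a percentage like this. This is just a small helper sketch with a name I made up (speedup_percent); it defines "N% faster" as the ratio of old to new wall-clock time minus one, which is an assumption about how the figures should be compared.

```c
/* Percentage speed-up from two wall-clock times in seconds.
 * "Faster" is taken as (before / after - 1) * 100; this definition
 * is an assumption, not something fixed by either benchmark. */
static double speedup_percent(double before_s, double after_s)
{
    return (before_s / after_s - 1.0) * 100.0;
}
```

With the quoted numbers, speedup_percent(8.264, 7.715) comes out at roughly 7%.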
--
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: decode_MMX.c.diff
URL: <http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/attachments/20070606/b38f8126/attachment.txt>