[FFmpeg-devel] [PATCH] split-radix FFT
Måns Rullgård
Tue Aug 5 21:49:46 CEST 2008
Loren Merritt <lorenm at u.washington.edu> writes:
> On Fri, 25 Jul 2008, Loren Merritt wrote:
>
>> $subject, vaguely based on djbfft.
>> Changed from djb:
>> * added simd.
>> * removed the hand-scheduled pentium-pro code. gcc's output from
>> simple C is better on all cpus I have access to.
>> * removed the distinction between fft and ifft. they're just
> >> permutations of each other, so the difference belongs in revtab[] and
>> not in the code.
>> * removed the distinction between pass() and pass_big(). C can
>> always use the memory-efficient version, and simd never does because
>> the shuffles are too costly.
>> * made an entirely different pass_big(), to avoid store->load aliasing.
>
> yasm version.
>
> Not nasm compatible. In particular, I depend on the assembler to
> optimize away reg*0 in an address, which nasm does in some cases but
> not if there were 3 registers before the zero culling. I could
> work around this at a cost of about 10 lines of code.
>
> Doesn't distinguish HAVE_YASM from HAVE_MMX.
>
> Doesn't support mingw64. There's no barrier in principle, I just don't
> have a win64 box, so I could never write that version of the
> calling-convention macros.
Does FFmpeg build and run on mingw64 at all?
> From afb93d1dec538e4c886b48479339f3af0742818a Mon Sep 17 00:00:00 2001
> From: Loren Merritt <pengvado at akuvian.org>
> Date: Sat, 2 Aug 2008 02:13:09 -0600
> Subject: [PATCH] yasm buildsystem
>
> ---
> common.mak | 7 +++++++
> configure | 31 +++++++++++++++++++++++++++++++
> 2 files changed, 38 insertions(+), 0 deletions(-)
>
> diff --git a/common.mak b/common.mak
> index 93176c5..17519d9 100644
> --- a/common.mak
> +++ b/common.mak
> @@ -8,6 +8,7 @@ ifndef SUBDIR
> vpath %.c $(SRC_DIR)
> vpath %.h $(SRC_DIR)
> vpath %.S $(SRC_DIR)
> +vpath %.asm $(SRC_DIR)
>
> ifeq ($(SRC_DIR),$(SRC_PATH_BARE))
> BUILD_ROOT_REL = .
OK
> @@ -26,6 +27,9 @@ CFLAGS := -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE \
> %.o: %.S
> $(CC) $(CFLAGS) $(LIBOBJFLAGS) -c -o $@ $<
>
> +%.o: %.asm
> + $(YASM) $(YASMFLAGS) -I $(<D)/ -o $@ $<
> +
> %.ho: %.h
> $(CC) $(CFLAGS) $(LIBOBJFLAGS) -Wno-unused -c -o $@ -x c $<
>
I'd rather not tie as generic a filename pattern as *.asm to an
x86-only assembler. Does it work if you rewrite the rule as
$(SUBDIR)i386/%.o: $(SUBDIR)i386/%.asm
and move it to the RULES macro further down the file? That's the best
solution I can think of right now.
> @@ -38,6 +42,9 @@ CFLAGS := -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE \
> %.d: %.cpp
> $(DEPEND_CMD) > $@
>
> +%.d: %.asm
> + $(YASM) $(YASMFLAGS) -I $(<D)/ -M -o $(@:%.d=%.o) $< > $@
> +
> %.o: %.d
>
> %$(EXESUF): %.c
Ditto.
> diff --git a/configure b/configure
> index 564ff03..d11a331 100755
> --- a/configure
> +++ b/configure
> @@ -444,6 +444,13 @@ int foo(void){ asm volatile($asm); }
> EOF
> }
>
> +check_yasm(){
> + log check_yasm "$@"
> + echo "$1" > $TMPS
> + shift 1
> + check_cmd $yasm $YASMFLAGS "$@" -o $TMPO $TMPS
> +}
> +
The test file should be written to the log using log_file $TMPS.
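Something along these lines, perhaps. This is only a sketch: log, log_file
and check_cmd below are simplified stand-ins written for illustration, not
the actual helpers from configure, and the default temp file names are
assumptions.

```shell
# Simplified stand-ins for configure's helpers (assumptions, not the real code).
logfile=${logfile:-config.err}
TMPS=${TMPS:-/tmp/yasm-test-$$.asm}
TMPO=${TMPO:-/tmp/yasm-test-$$.o}

log(){
    echo "$@" >> "$logfile"
}

# Copy the contents of a file into the log, bracketed by markers.
log_file(){
    log BEGIN "$1"
    cat "$1" >> "$logfile"
    log END "$1"
}

# Log a command line, run it, and send its output to the log.
check_cmd(){
    log "$@"
    "$@" >> "$logfile" 2>&1
}

check_yasm(){
    log check_yasm "$@"
    echo "$1" > "$TMPS"
    log_file "$TMPS"          # record the test source before assembling it
    shift 1
    check_cmd $yasm $YASMFLAGS "$@" -o "$TMPO" "$TMPS"
}
```

With that, a failed check leaves both the test source and the assembler's
error output in the log.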
> check_ld(){
> log check_ld "$@"
> check_cc || return
> @@ -927,6 +934,7 @@ shlibdir_default="$libdir_default"
>
> # toolchain
> cc="gcc"
> +yasm="yasm"
> ar="ar"
> nm="nm"
> ranlib="ranlib"
> @@ -1089,6 +1097,7 @@ echo "# $0 $@" > $logfile
> set >> $logfile
>
> cc="${cross_prefix}${cc}"
> +yasm="${cross_prefix}${yasm}"
> ar="${cross_prefix}${ar}"
> nm="${cross_prefix}${nm}"
> ranlib="${cross_prefix}${ranlib}"
OK
> @@ -1179,6 +1188,8 @@ enable $arch
> enabled_any x86_32 x86_64 && enable x86
> enabled sparc64 && enable sparc
>
> +objformat="elf"
> +
This should go in a more natural place. Next to the other
toolchain-related variables seems good.
> # OS specific
> case $target_os in
> beos|haiku|zeta)
> @@ -1243,6 +1254,7 @@ case $target_os in
> SLIBNAME_WITH_VERSION='$(SLIBPREF)$(FULLNAME).$(LIBVERSION)$(SLIBSUF)'
> SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME).$(LIBMAJOR)$(SLIBSUF)'
> FFSERVERLDFLAGS=-Wl,-bind_at_load
> + objformat="macho"
> ;;
> mingw32*)
> target_os=mingw32
> @@ -1269,6 +1281,7 @@ case $target_os in
> install -m 644 $(SUBDIR)$(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.lib) "$(SHLIBDIR)/$(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.lib)"'
> SLIB_UNINSTALL_EXTRA_CMD='rm -f "$(SHLIBDIR)/$(SLIBNAME:$(SLIBSUF)=.lib)"'
> SHFLAGS='-shared -Wl,--output-def,$$(@:$(SLIBSUF)=.def) -Wl,--enable-runtime-pseudo-reloc -Wl,--enable-auto-image-base'
> + objformat="win32"
> ;;
> cygwin*)
> target_os=cygwin
> @@ -1285,12 +1298,14 @@ case $target_os in
> SLIBNAME_WITH_VERSION='$(SLIBPREF)$(FULLNAME)-$(LIBVERSION)$(SLIBSUF)'
> SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)'
> SHFLAGS='-shared -Wl,--enable-auto-image-base'
> + objformat="win32"
> ;;
> *-dos|freedos|opendos)
> disable ffplay ffserver vhook
> disable $INDEV_LIST $OUTDEV_LIST
> network_extralibs="-lsocket"
> EXESUF=".exe"
> + objformat="win32"
> ;;
> linux)
> LDLATEFLAGS="-Wl,--as-needed $LDLATEFLAGS"
Is "win32" a commonly used name for that object file format?
> @@ -1534,6 +1549,20 @@ EOF
> enabled mmx2 && check_asm mmx2 '"movss %xmm0, %xmm0"'
>
> check_asm bswap '"bswap %%eax" ::: "%eax"'
> +
> + if test $arch = x86_64; then
Use "enabled x86_64" here.
> + YASMFLAGS="-f ${objformat}64 -DARCH_X86_64"
Is "macho64" a valid object format name? I doubt "win3264" is...
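As far as I know yasm accepts elf64, macho64 and win64 as object format
names ("yasm -f help" should list what a given build supports), so the
64-bit name cannot always be formed by appending "64". A rough sketch of an
explicit mapping follows; yasm_objformat is a hypothetical helper name, not
something that exists in configure.

```shell
# Hypothetical helper: map $objformat plus the target arch to a yasm -f name.
# Usage: yasm_objformat <objformat> <arch>
yasm_objformat(){
    fmt=$1
    arch=$2
    case "$arch" in
        x86_64)
            case "$fmt" in
                elf)   echo elf64   ;;
                macho) echo macho64 ;;
                win32) echo win64   ;;   # not "win3264"
                *)     return 1     ;;   # unknown format: let the caller decide
            esac
            ;;
        *)
            # 32-bit targets use the format name unchanged
            echo "$fmt"
            ;;
    esac
}
```

The caller would then write something like
YASMFLAGS="-f $(yasm_objformat $objformat $arch) -DARCH_X86_64".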
> + enabled shared && YASMFLAGS="$YASMFLAGS -DPIC"
Use "append YASMFLAGS -DPIC" instead.
> + else
> + YASMFLAGS="-f $objformat -DARCH_X86_32"
> + fi
> + if test $objformat = elf; then
> + enabled debug && YASMFLAGS="$YASMFLAGS -g dwarf2"
> + else
> + YASMFLAGS="$YASMFLAGS -DPREFIX"
> + fi
Use append here as well.
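Roughly like this. The append and enabled functions below are simplified
stand-ins so the fragment runs on its own (the real configure helpers may do
more), and the initial variable values are placeholders.

```shell
# Simplified stand-ins for configure's helpers (assumptions for illustration).
append(){
    var=$1
    shift
    eval "$var=\"\$$var $*\""
}

enabled(){
    eval test "x\$$1" = "xyes"
}

# Placeholder state; configure would have set these earlier.
YASMFLAGS="-f elf"
objformat=elf
debug=yes

if test "$objformat" = elf; then
    enabled debug && append YASMFLAGS -g dwarf2
else
    append YASMFLAGS -DPREFIX
fi

echo "$YASMFLAGS"    # prints: -f elf -g dwarf2
```

This keeps the flag handling in one place instead of repeating the
VAR="$VAR ..." pattern.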
> + # FIXME: just disable yasm? but both the exe name and the enablement flag are naturally $yasm ...
Then rename one.
> + check_yasm "pabsw xmm0, xmm0" || disable mmx2
Isn't this a little harsh?
> fi
>
> # check for assembler specific support
> @@ -2028,6 +2057,7 @@ echo "INCDIR=\$(DESTDIR)$incdir" >> config.mak
> echo "BINDIR=\$(DESTDIR)$bindir" >> config.mak
> echo "MANDIR=\$(DESTDIR)$mandir" >> config.mak
> echo "CC=$cc" >> config.mak
> +echo "YASM=$yasm" >> config.mak
> echo "AR=$ar" >> config.mak
> echo "RANLIB=$ranlib" >> config.mak
> echo "LN_S=$ln_s" >> config.mak
> @@ -2040,6 +2070,7 @@ echo "VHOOKCFLAGS=$VHOOKCFLAGS" >> config.mak
> echo "LDFLAGS=$LDFLAGS" >> config.mak
> echo "FFSERVERLDFLAGS=$FFSERVERLDFLAGS" >> config.mak
> echo "SHFLAGS=$SHFLAGS" >> config.mak
> +echo "YASMFLAGS=$YASMFLAGS" >> config.mak
> echo "VHOOKSHFLAGS=$VHOOKSHFLAGS" >> config.mak
> echo "VHOOKLIBS=$VHOOKLIBS" >> config.mak
> echo "LIBOBJFLAGS=$LIBOBJFLAGS" >> config.mak
OK
--
Måns Rullgård
mans at mansr.com