[FFmpeg-devel] [PATCH] configure: replace arch loongson with arch extra list loongson

周晓勇 zhouxiaoyong at loongson.cn
Thu May 7 04:25:33 CEST 2015




> -----Original Message-----
> From: "Michael Niedermayer" <michael at niedermayer.cc>
> Sent: Wednesday, May 6, 2015
> To: "FFmpeg development discussions and patches" <ffmpeg-devel at ffmpeg.org>
> Cc: gaoxiang <gaoxiang at loongson.cn>, "孟小甫" <mengxiaofu at loongson.cn>
> Subject: Re: [FFmpeg-devel] [PATCH] configure: replace arch loongson with arch extra list loongson
> 
> On Wed, May 06, 2015 at 02:38:21PM +0800, 周晓勇 wrote:
> > From a5031b4c4b97f790a40603cff9a1f45cbb043005 Mon Sep 17 00:00:00 2001
> > From: ZhouXiaoyong <zhouxiaoyong at loongson.cn>
> > Date: Wed, 6 May 2015 14:05:21 +0800
> > Subject: [PATCH] configure: replace arch loongson with arch extra list loongson
> > 
> > fate pass when do configure without --cc='ccache gcc' option:
> > ./configure --enable-gpl --enable-pthreads --samples=/home/loongson/fate/
> >  --enable-nonfree --enable-version3 --assert-level=2 --cpu=loongson3a
> >  --enable-loongson3
> 
> with this ARCH_MIPS64 is disabled, is this intended ?
> 
ARCH_MIPS64 is only used in libavutil/mips/intreadwrite.h, for AV_RN32. My intent was not to disturb other MIPS64 machines: Loongson's optimizations may not be compatible with other MIPS64 CPUs until they have been tested there, and I have no MIPS64 machine other than Loongson-3 to test on.
In my personal git development branch I have also optimized the other functions for Loongson-3, such as AV_WN32, AV_RN64, AV_WN64, AV_COPY32, AV_COPY64, AV_SWAP64, AV_ZERO32 and AV_ZERO64.
However, the speedup is smaller than anticipated. I will do more testing to make sure the optimized intreadwrite code really is faster, and then send you the patch.
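
For context, AV_RN32 and AV_WN32 read and write a native-endian 32-bit value at an address that may be unaligned. The following is only a minimal portable sketch of that operation, assuming nothing about the Loongson-specific code; the names sketch_rn32 and sketch_wn32 are hypothetical and are not FFmpeg's implementation:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not FFmpeg's macro): read a native-endian 32-bit
 * value from an address that may be unaligned. memcpy avoids undefined
 * behaviour from dereferencing a misaligned uint32_t pointer. */
static inline uint32_t sketch_rn32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical helper: write a native-endian 32-bit value to an address
 * that may be unaligned. */
static inline void sketch_wn32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof(v));
}
```

An architecture-specific version would replace the memcpy with whatever unaligned load/store the CPU offers, which is why such code should be tested on each target before it is enabled there.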

> why is "--enable-loongson3" needed when "--cpu=loongson3a" is already
> specified ?
> 
It is not needed; I only added it to make sure the SIMD optimizations are enabled.

> and fate still fails
> time ./configure --enable-gpl --enable-pthreads --samples=/home/loongson/fate/  --enable-nonfree --enable-version3 --assert-level=2 --cpu=loongson3a --enable-loongson3
> real    4m48.779s
> user    4m13.918s
> sys     0m40.020s
> 
> time make -j4
> real    19m31.114s
> user    57m52.785s
> sys     2m52.359s
> 
> make -j5 fate-vsynth1-rv10 fate-vsynth1-svq1 fate-amrwb-23k85 fate-dss-lp fate-lavf-avi
> 
> --- ./tests/ref/fate/dss-lp     2015-05-06 01:16:58.238387245 +0800
> +++ tests/data/fate/dss-lp      2015-05-06 20:15:23.060689405 +0800
> @@ -1,31 +1,31 @@
>  #tb 0: 1/8000
> -0,          0,          0,      240,      480, 0xf1107658
> -0,        240,        240,      240,      480, 0x50dee179
> -0,        480,        480,      240,      480, 0x40090802
> -0,        720,        720,      240,      480, 0x3ef9f6ff
> -0,        960,        960,      240,      480, 0x5b7df231
> -0,       1200,       1200,      240,      480, 0xe266efd1
> -0,       1440,       1440,      240,      480, 0xfbe6e658
> -0,       1680,       1680,      240,      480, 0xde84f311
> -0,       1920,       1920,      240,      480, 0x5854ec2f
> -0,       2160,       2160,      240,      480, 0x4901cdea
> -0,       2400,       2400,      240,      480, 0x03f3e619
> -0,       2640,       2640,      240,      480, 0x47abfe87
> -0,       2880,       2880,      240,      480, 0x69dddf34
> -0,       3120,       3120,      240,      480, 0x1cfeee2c
> -0,       3360,       3360,      240,      480, 0x1860ef1c
> -0,       3600,       3600,      240,      480, 0x8f86e8ed
> -0,       3840,       3840,      240,      480, 0x307deaf8
> -0,       4080,       4080,      240,      480, 0xeca7eca0
> -0,       4320,       4320,      240,      480, 0x1835ee1c
> -0,       4560,       4560,      240,      480, 0x6676ed66
> -0,       4800,       4800,      240,      480, 0x49c2fd04
> -0,       5040,       5040,      240,      480, 0xc463db75
> -0,       5280,       5280,      240,      480, 0x1931ed7d
> -0,       5520,       5520,      240,      480, 0xc99ff886
> -0,       5760,       5760,      240,      480, 0xcd3ae8de
> -0,       6000,       6000,      240,      480, 0x2294ecfa
> -0,       6240,       6240,      240,      480, 0xcf5ef14b
> -0,       6480,       6480,      240,      480, 0x6325d4fe
> -0,       6720,       6720,      240,      480, 0x3790dcf2
> -0,       6960,       6960,      240,      480, 0x0fbee6c0
> +0,          0,          0,      240,      480, 0x4f3de452
> +0,        240,        240,      240,      480, 0x55d1f9da
> +0,        480,        480,      240,      480, 0xe887e1f6
> +0,        720,        720,      240,      480, 0xc353f768
> +0,        960,        960,      240,      480, 0x34adebcc
> +0,       1200,       1200,      240,      480, 0x7d67dfa2
> +0,       1440,       1440,      240,      480, 0xc7a4f1f4
> +0,       1680,       1680,      240,      480, 0x549cf083
> +0,       1920,       1920,      240,      480, 0x468dead7
> +0,       2160,       2160,      240,      480, 0x7e6af748
> +0,       2400,       2400,      240,      480, 0x02f20456
> +0,       2640,       2640,      240,      480, 0xb9d5eb37
> +0,       2880,       2880,      240,      480, 0x008cee35
> +0,       3120,       3120,      240,      480, 0xdd13f6c0
> +0,       3360,       3360,      240,      480, 0xaa0df718
> +0,       3600,       3600,      240,      480, 0x0a84ee9c
> +0,       3840,       3840,      240,      480, 0xaccfed94
> +0,       4080,       4080,      240,      480, 0x65c7f1bf
> +0,       4320,       4320,      240,      480, 0xda8cebed
> +0,       4560,       4560,      240,      480, 0x0ea4f747
> +0,       4800,       4800,      240,      480, 0x0feee8a6
> +0,       5040,       5040,      240,      480, 0x65d0de7d
> +0,       5280,       5280,      240,      480, 0xc986f146
> +0,       5520,       5520,      240,      480, 0x7886f3f5
> +0,       5760,       5760,      240,      480, 0x39a6eda8
> +0,       6000,       6000,      240,      480, 0x636af0b0
> +0,       6240,       6240,      240,      480, 0xdd2bfec3
> +0,       6480,       6480,      240,      480, 0x1baddcc4
> +0,       6720,       6720,      240,      480, 0x12cbef82
> +0,       6960,       6960,      240,      480, 0xbd11ee44
> Test dss-lp failed. Look at tests/data/fate/dss-lp.err for details.
> make: *** [fate-dss-lp] Error 1
> make: *** Waiting for unfinished jobs....
> stddev:32798.91 PSNR:  6.01 MAXDIFF:46621 bytes:   327680/   327680
> stddev: |32798.91 - 0| >= 2
> Test amrwb-23k85 failed. Look at tests/data/fate/amrwb-23k85.err for details.
> make: *** [fate-amrwb-23k85] Error 1
> 
I am working on it.

> 
> also without explicitly specifying loongson:
> 
> ./configure --enable-gpl --enable-pthreads --samples=/home/loongson/fate/ --enable-version3 --assert-level=2
> ...
> ./libavutil/libm.h:162:76: error: static declaration of ‘round’ follows non-static declaration
>  static av_always_inline av_const double round(double x)
>                                                                             ^
> ./libavutil/libm.h:169:75: error: static declaration of ‘roundf’ follows non-static declaration
>  static av_always_inline av_const float roundf(float x)
>                                                                            ^
> ./libavutil/libm.h:176:76: error: static declaration of ‘trunc’ follows non-static declaration
>  static av_always_inline av_const double trunc(double x)
>                                                                             ^
> ./libavutil/libm.h:183:75: error: static declaration of ‘truncf’ follows non-static declaration
>  static av_always_inline av_const float truncf(float x)
>                                                                            ^
> make: *** [libavdevice/alldevices.o] Error 1
> 
> detection of round() failed with this:
> 
> /usr/bin/ld: /tmp/ffconf.S1BUH3UB.o: linking mips:isa32r2 module with previous mips:4000 modules
> /usr/bin/ld: failed to merge target specific data of file /tmp/ffconf.S1BUH3UB.o
> /tmp/ffconf.S1BUH3UB.o: In function `foo':
> ffconf.HgZd30xA.c:(.text+0x3c): undefined reference to `round'
> collect2: error: ld returned 1 exit status
> 
I will send a patch soon.
> 
> 
> > 
> > Signed-off-by: ZhouXiaoyong <zhouxiaoyong at loongson.cn>
> > ---
> >  configure | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/configure b/configure
> > index d3f23c8..0f79874 100755
> > --- a/configure
> > +++ b/configure
> > @@ -1577,6 +1577,9 @@ ARCH_EXT_LIST_MIPS="
> >      mipsdspr1
> >      mipsdspr2
> >      msa
> > +"
> > +
> > +ARCH_EXT_LIST_LOONGSON="
> >      loongson3
> 
> why would this be in a separate list ?
> the various ARM variants are also not in separate lists
> 
Loongson has developed more useful MMI (MultiMedia Instructions); imgtec may call them vector instructions. In the long term we will fill ARCH_EXT_LIST_LOONGSON or ARCH_EXT_LIST_LOONGSON_SIMD with flags, much like MMX, AVX, SSE...
You may not know that the various Loongson-3 CPU cores have more instructions than MIPS64R2, so a separate list is better.
By the way, ARCH_EXT_LIST_ARM already exists, doesn't it?

ARCH_EXT_LIST_ARM="
    armv5te
    armv6
    armv6t2
    armv8
    neon
    vfp
    vfpv3
    setend
"


