[FFmpeg-devel] [PATCH] ppc: replace vec_ld(0), vec_ld(1) by VEC_LD2() which has fewer loads

Pavel Koshevoy pkoshevoy at gmail.com
Sat Nov 15 05:00:31 CET 2014


On 11/14/14 07:34, Michael Niedermayer wrote:
> On Fri, Nov 14, 2014 at 06:45:55AM -0700, Pavel Koshevoy wrote:
>> On Nov 13, 2014 4:15 PM, "Michael Niedermayer" <michaelni at gmx.at> wrote:
>>> On Fri, Nov 07, 2014 at 03:12:19PM +0100, Michael Niedermayer wrote:
>>>> This needs to be benchmarked; I do not have PPC hardware.
>>>> On big endian this is more similar to how the code was before
>>>> 79e0255956bc8fcdb143f39b2e45db77144ac017
>>>> Signed-off-by: Michael Niedermayer <michaelni at gmx.at>
>>> ping
>>>
>>> can someone with an AltiVec PPC please benchmark this,
>>> or do all the PPC people want the code to be slow and unoptimized?
>>> I am also happy to benchmark it myself if someone provides a PPC, or an
>>> account on an AltiVec PPC that is reasonably idle so that benchmarking
>>> is possible with some accuracy
>>>
>> I can do it over the weekend, I have a ppc G4 800MHz iMac.  I'll need
>> instructions on what to do for benchmarking.
> A patch that adds benchmarking is below.
> Applying it and then decoding some MPEG-2, for example with
>   -v 99 -i matrixbench_mpeg2.mpg -f null -
>
> should print some timing values.
> I can't say for sure though, as this does not work under qemu;
> under qemu I just get 0.
>
>
> diff --git a/libavcodec/mpegvideo_motion.c b/libavcodec/mpegvideo_motion.c
> index e7a585d..94b140d 100644
> --- a/libavcodec/mpegvideo_motion.c
> +++ b/libavcodec/mpegvideo_motion.c
> @@ -976,6 +976,7 @@ void ff_mpv_motion(MpegEncContext *s,
>                      op_pixels_func (*pix_op)[4],
>                      qpel_mc_func (*qpix_op)[16])
>   {
> +    START_TIMER
>   #if !CONFIG_SMALL
>       if (s->out_format == FMT_MPEG1)
>           mpv_motion_internal(s, dest_y, dest_cb, dest_cr, dir,
> @@ -984,4 +985,5 @@ void ff_mpv_motion(MpegEncContext *s,
>   #endif
>           mpv_motion_internal(s, dest_y, dest_cb, dest_cr, dir,
>                               ref_picture, pix_op, qpix_op, 0);
> +    STOP_TIMER("MC")
>   }
>
>
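For reading the timer lines in the attached logs, here is a rough C sketch of the bookkeeping behind a START_TIMER / STOP_TIMER pair as I understand it. BenchStat, bench_add and bench_mean are made-up names for illustration, and the outlier rule is an assumption from memory of libavutil/timer.h, not a copy of it. The sketch also leaves out how the elapsed time is read and how "UNITS" is scaled. What the logs do confirm is the reporting cadence: runs + skips is a power of two on every line (for example 127 + 1 = 128).

```c
#include <stdint.h>

/* Hypothetical accumulator for one timed code region. */
typedef struct BenchStat {
    uint64_t sum;   /* total elapsed time of accepted samples */
    int      runs;  /* number of accepted samples */
    int      skips; /* number of samples rejected as outliers */
} BenchStat;

/*
 * Record one elapsed-time sample.
 * Assumption: a sample counts as an outlier when it is at least 8x the
 * running mean and also at least 2000 units; the first two samples are
 * always accepted.
 * Returns 1 when runs + skips is a power of two, i.e. a report is due.
 */
static int bench_add(BenchStat *s, uint64_t elapsed)
{
    if (s->runs < 2 || elapsed < 8 * s->sum / s->runs || elapsed < 2000) {
        s->sum += elapsed;
        s->runs++;
    } else {
        s->skips++;
    }
    int total = s->runs + s->skips;
    return (total & (total - 1)) == 0;
}

/* Mean of the accepted samples, or 0 if nothing has been recorded yet. */
static uint64_t bench_mean(const BenchStat *s)
{
    return s->runs ? s->sum / (uint64_t)s->runs : 0;
}
```

Under these assumptions the early lines of each log are dominated by a handful of slow first calls, and the value only becomes meaningful once many runs have been averaged, so the last line of each log is the one to compare.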

git am wouldn't apply the patches for me (I saved the message from
Thunderbird to an .eml file and tried to feed that to git am), so I had
to trim them and apply them manually with patch -p1.  The patch for
util_altivec.h still wouldn't apply, so I edited that file by hand.

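I ran both builds twice and captured the output from the second run of
each build; both logs are attached.  By the looks of it there is no
meaningful difference in performance: the last MC timer reading is
681 UNITS in the first log and 668 in the second, and both runs report
64 fps.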
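
To put a number on the gap between the two attached logs, a small C sketch along these lines can pull the fields out of a timer line and compute the relative change. TimerLine, parse_timer_line and percent_change are hypothetical helper names, and the line format is taken from the logs in this message only.

```c
#include <stdio.h>

/* One parsed "<units> UNITS in MC, <runs> runs, <skips> skips" line. */
typedef struct TimerLine {
    unsigned long long units;
    int runs;
    int skips;
} TimerLine;

/*
 * Returns 1 if the line is an MC timer line and all three fields were read.
 * Anything after "skips" (such as leftover progress text) is ignored.
 * On failure *out may be partially written and should not be used.
 */
static int parse_timer_line(const char *line, TimerLine *out)
{
    return sscanf(line, "%llu UNITS in MC, %d runs, %d skips",
                  &out->units, &out->runs, &out->skips) == 3;
}

/* Relative change from a to b in percent; negative means b is lower. */
static double percent_change(double a, double b)
{
    return (b - a) / a * 100.0;
}
```

Fed the final timer line of each log, percent_change(681, 668) comes out at about -1.9, which is within the spread of the later readings inside each single run (672 to 688 in the first log, 660 to 674 in the second).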

If you'd like me to try something else -- I can try again tomorrow.

     Pavel.

-------------- next part --------------
$ ./ffmpeg -v 99 -i ~/Movies/matrixbench_mpeg2.mpg -f null - > /tmp/vec_ld.txt
ffmpeg version N-67669-g53ab784 Copyright (c) 2000-2014 the FFmpeg developers
  built on Nov 14 2014 20:14:18 with gcc 4.2.1 (GCC) (Apple Inc. build 5577)
  configuration: --prefix=/Developer/ppc --disable-debug --disable-shared --enable-swscale --enable-avfilter --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-libtheora --enable-libschroedinger --enable-libopenjpeg --enable-libmodplug --enable-libvpx --enable-libspeex --enable-pthreads --enable-gpl --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-postproc --enable-libx264 --enable-libxvid --enable-libass --enable-gnutls --enable-runtime-cpudetect --extra-cflags=-I/opt/local/include --extra-ldflags='-headerpad_max_install_names -L/opt/local/lib'
  libavutil      54. 11.100 / 54. 11.100
  libavcodec     56. 12.100 / 56. 12.100
  libavformat    56. 12.103 / 56. 12.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '99'.
Reading option '-i' ... matched as input file with argument '/Users/pavel/Movies/matrixbench_mpeg2.mpg'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'null'.
Reading option '-' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 99.
Successfully parsed a group of options.
Parsing a group of options: input file /Users/pavel/Movies/matrixbench_mpeg2.mpg.
Successfully parsed a group of options.
Opening an input file: /Users/pavel/Movies/matrixbench_mpeg2.mpg.
[mpeg @ 0x2808010] Format mpeg probed with size=2048 and score=26
[mpeg @ 0x2808010] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0
[mpeg @ 0x2808010] probing stream 1 pp:2500
[mpeg @ 0x2808010] Probe with size=2012, packets=1 detected mpegvideo with score=25
[mpeg @ 0x2808010] probed stream 1
[mpeg @ 0x2808010] max_analyze_duration 5000000 reached at 5000000 microseconds
[NULL @ 0x2809810] start time for stream 0 is not set in estimate_timings_from_pts
[mpeg @ 0x2808010] After avformat_find_stream_info() pos: 0 bytes read:4247696 seeks:3 frames:333
Input #0, mpeg, from '/Users/pavel/Movies/matrixbench_mpeg2.mpg':
  Duration: 00:03:07.66, start: 0.220000, bitrate: 5633 kb/s
    Stream #0:0[0x1bf], 0, 1/90000: Data: dvd_nav_packet, 1/90000
    Stream #0:1[0x1e0], 127, 1/90000: Video: mpeg2video (Main), yuv420p(tv, bt470bg/bt470m/bt470m, left), 720x576 [SAR 16:15 DAR 4:3], 1/50, max. 11421 kb/s, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:2[0x1c0], 206, 1/90000: Audio: mp2, 48000 Hz, stereo, s16p, 384 kb/s
Successfully opened the file.
Parsing a group of options: output file -.
Applying option f (force format) with argument null.
Successfully parsed a group of options.
Opening an output file: -.
Successfully opened the file.
detected 1 logical cores
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'video_size' to value '720x576'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'pix_fmt' to value '0'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'pixel_aspect' to value '16/15'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:1 @ 0x2226b90] w:720 h:576 pixfmt:yuv420p tb:1/90000 fr:25/1 sar:16/15 sws_param:flags=2
[AVFilterGraph @ 0x2227c10] query_formats: 3 queried, 2 merged, 0 already done, 0 delayed
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'time_base' to value '1/48000'
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'sample_rate' to value '48000'
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'sample_fmt' to value 's16p'
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'channel_layout' to value '0x3'
[graph 1 input from stream 0:2 @ 0x2227590] tb:1/48000 samplefmt:s16p samplerate:48000 chlayout:0x3
[audio format for output stream 0:1 @ 0x2227850] Setting 'sample_fmts' to value 's16'
[audio format for output stream 0:1 @ 0x2227850] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_anull_0' and the filter 'audio format for output stream 0:1'
[AVFilterGraph @ 0x2227210] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto-inserted resampler 0 @ 0x2227e00] ch:2 chl:stereo fmt:s16p r:48000Hz -> ch:2 chl:stereo fmt:s16 r:48000Hz
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf56.12.103
    Stream #0:0, 0, 1/25: Video: rawvideo (I420 / 0x30323449), yuv420p(left), 720x576 [SAR 16:15 DAR 4:3], 1/25, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder         : Lavc56.12.100 rawvideo
    Stream #0:1, 0, 1/48000: Audio: pcm_s16be, 48000 Hz, stereo, s16, 1536 kb/s
    Metadata:
      encoder         : Lavc56.12.100 pcm_s16be
Stream mapping:
  Stream #0:1 -> #0:0 (mpeg2video (native) -> rawvideo (native))
  Stream #0:2 -> #0:1 (mp2 (native) -> pcm_s16be (native))
Press [q] to stop, [?] for help
12340 UNITS in MC, 1 runs, 0 skips
7080 UNITS in MC, 2 runs, 0 skips
3947 UNITS in MC, 4 runs, 0 skips
2325 UNITS in MC, 8 runs, 0 skips
1489 UNITS in MC, 16 runs, 0 skips
1094 UNITS in MC, 32 runs, 0 skips
885 UNITS in MC, 64 runs, 0 skips
792 UNITS in MC, 127 runs, 1 skips
732 UNITS in MC, 255 runs, 1 skips
726 UNITS in MC, 511 runs, 1 skips
725 UNITS in MC, 1022 runs, 2 skips
[null @ 0x2864e10] Encoder did not produce proper pts, making some up.
698 UNITS in MC, 2045 runs, 3 skips
685 UNITS in MC, 4090 runs, 6 skips
684 UNITS in MC, 8183 runs, 9 skips
677 UNITS in MC, 16370 runs, 14 skips
683 UNITS in MC, 32745 runs, 23 skips
688 UNITS in MC, 65499 runs, 37 skips
687 UNITS in MC, 131009 runs, 63 skips
681 UNITS in MC, 262023 runs, 121 skips
684 UNITS in MC, 524043 runs, 245 skips
677 UNITS in MC, 1048097 runs, 479 skips
672 UNITS in MC, 2096195 runs, 957 skips
681 UNITS in MC, 4192359 runs, 1945 skips
[output stream 0:0 @ 0x2226ed0] EOF on sink link output stream 0:0:default.
[output stream 0:1 @ 0x22278e0] EOF on sink link output stream 0:1:default.
No more output streams to write to, finishing.
frame= 4690 fps= 64 q=0.0 Lsize=N/A time=00:03:07.65 bitrate=N/A    
video:293kB audio:35186kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (/Users/pavel/Movies/matrixbench_mpeg2.mpg):
  Input stream #0:0 (data): 0 packets read (0 bytes); 
  Input stream #0:1 (video): 4690 packets read (119884725 bytes); 4690 frames decoded; 
  Input stream #0:2 (audio): 7819 packets read (9007488 bytes); 7819 frames decoded (9007488 samples); 
  Total: 12509 packets (128892213 bytes) demuxed
Output file #0 (pipe:):
  Output stream #0:0 (video): 0 frames encoded; 4690 packets muxed (300160 bytes); 
  Output stream #0:1 (audio): 7819 frames encoded (9007488 samples); 7819 packets muxed (36029952 bytes); 
  Total: 12509 packets (36330112 bytes) muxed
12510 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x2228500] Statistics: 136396944 bytes read, 3 seeks
-------------- next part --------------
$ ./ffmpeg -v 99 -i ~/Movies/matrixbench_mpeg2.mpg -f null -
ffmpeg version N-67669-g53ab784 Copyright (c) 2000-2014 the FFmpeg developers
  built on Nov 14 2014 20:14:18 with gcc 4.2.1 (GCC) (Apple Inc. build 5577)
  configuration: --prefix=/Developer/ppc --disable-debug --disable-shared --enable-swscale --enable-avfilter --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-libtheora --enable-libschroedinger --enable-libopenjpeg --enable-libmodplug --enable-libvpx --enable-libspeex --enable-pthreads --enable-gpl --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-postproc --enable-libx264 --enable-libxvid --enable-libass --enable-gnutls --enable-runtime-cpudetect --extra-cflags=-I/opt/local/include --extra-ldflags='-headerpad_max_install_names -L/opt/local/lib'
  libavutil      54. 11.100 / 54. 11.100
  libavcodec     56. 12.100 / 56. 12.100
  libavformat    56. 12.103 / 56. 12.103
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '99'.
Reading option '-i' ... matched as input file with argument '/Users/pavel/Movies/matrixbench_mpeg2.mpg'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'null'.
Reading option '-' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 99.
Successfully parsed a group of options.
Parsing a group of options: input file /Users/pavel/Movies/matrixbench_mpeg2.mpg.
Successfully parsed a group of options.
Opening an input file: /Users/pavel/Movies/matrixbench_mpeg2.mpg.
[mpeg @ 0x2808010] Format mpeg probed with size=2048 and score=26
[mpeg @ 0x2808010] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0
[mpeg @ 0x2808010] probing stream 1 pp:2500
[mpeg @ 0x2808010] Probe with size=2012, packets=1 detected mpegvideo with score=25
[mpeg @ 0x2808010] probed stream 1
[mpeg @ 0x2808010] max_analyze_duration 5000000 reached at 5000000 microseconds
[NULL @ 0x2809810] start time for stream 0 is not set in estimate_timings_from_pts
[mpeg @ 0x2808010] After avformat_find_stream_info() pos: 0 bytes read:4247696 seeks:3 frames:333
Input #0, mpeg, from '/Users/pavel/Movies/matrixbench_mpeg2.mpg':
  Duration: 00:03:07.66, start: 0.220000, bitrate: 5633 kb/s
    Stream #0:0[0x1bf], 0, 1/90000: Data: dvd_nav_packet, 1/90000
    Stream #0:1[0x1e0], 127, 1/90000: Video: mpeg2video (Main), yuv420p(tv, bt470bg/bt470m/bt470m, left), 720x576 [SAR 16:15 DAR 4:3], 1/50, max. 11421 kb/s, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:2[0x1c0], 206, 1/90000: Audio: mp2, 48000 Hz, stereo, s16p, 384 kb/s
Successfully opened the file.
Parsing a group of options: output file -.
Applying option f (force format) with argument null.
Successfully parsed a group of options.
Opening an output file: -.
Successfully opened the file.
detected 1 logical cores
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'video_size' to value '720x576'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'pix_fmt' to value '0'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'pixel_aspect' to value '16/15'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:1 @ 0x2226b90] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:1 @ 0x2226b90] w:720 h:576 pixfmt:yuv420p tb:1/90000 fr:25/1 sar:16/15 sws_param:flags=2
[AVFilterGraph @ 0x2227c10] query_formats: 3 queried, 2 merged, 0 already done, 0 delayed
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'time_base' to value '1/48000'
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'sample_rate' to value '48000'
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'sample_fmt' to value 's16p'
[graph 1 input from stream 0:2 @ 0x2227590] Setting 'channel_layout' to value '0x3'
[graph 1 input from stream 0:2 @ 0x2227590] tb:1/48000 samplefmt:s16p samplerate:48000 chlayout:0x3
[audio format for output stream 0:1 @ 0x2227850] Setting 'sample_fmts' to value 's16'
[audio format for output stream 0:1 @ 0x2227850] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed_anull_0' and the filter 'audio format for output stream 0:1'
[AVFilterGraph @ 0x2227210] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto-inserted resampler 0 @ 0x2227e00] ch:2 chl:stereo fmt:s16p r:48000Hz -> ch:2 chl:stereo fmt:s16 r:48000Hz
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf56.12.103
    Stream #0:0, 0, 1/25: Video: rawvideo (I420 / 0x30323449), yuv420p(left), 720x576 [SAR 16:15 DAR 4:3], 1/25, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder         : Lavc56.12.100 rawvideo
    Stream #0:1, 0, 1/48000: Audio: pcm_s16be, 48000 Hz, stereo, s16, 1536 kb/s
    Metadata:
      encoder         : Lavc56.12.100 pcm_s16be
Stream mapping:
  Stream #0:1 -> #0:0 (mpeg2video (native) -> rawvideo (native))
  Stream #0:2 -> #0:1 (mp2 (native) -> pcm_s16be (native))
Press [q] to stop, [?] for help
12660 UNITS in MC, 1 runs, 0 skips
7285 UNITS in MC, 2 runs, 0 skips
4090 UNITS in MC, 4 runs, 0 skips
2411 UNITS in MC, 8 runs, 0 skips
1553 UNITS in MC, 16 runs, 0 skips
1094 UNITS in MC, 32 runs, 0 skips
894 UNITS in MC, 64 runs, 0 skips
780 UNITS in MC, 128 runs, 0 skips
754 UNITS in MC, 255 runs, 1 skips
732 UNITS in MC, 511 runs, 1 skips
733 UNITS in MC, 1023 runs, 1 skips
[null @ 0x2864e10] Encoder did not produce proper pts, making some up.
694 UNITS in MC, 2046 runs, 2 skips
672 UNITS in MC, 4093 runs, 3 skips
671 UNITS in MC, 8186 runs, 6 skips
662 UNITS in MC, 16376 runs, 8 skips
669 UNITS in MC, 32753 runs, 15 skips
674 UNITS in MC, 65508 runs, 28 skips
673 UNITS in MC, 131016 runs, 56 skips
667 UNITS in MC, 262026 runs, 118 skips
671 UNITS in MC, 524044 runs, 244 skips
665 UNITS in MC, 1048075 runs, 501 skips
660 UNITS in MC, 2096165 runs, 987 skips
668 UNITS in MC, 4192326 runs, 1978 skips
[output stream 0:0 @ 0x2226ed0] EOF on sink link output stream 0:0:default.
[output stream 0:1 @ 0x22278e0] EOF on sink link output stream 0:1:default.
No more output streams to write to, finishing.
frame= 4690 fps= 64 q=0.0 Lsize=N/A time=00:03:07.65 bitrate=N/A    
video:293kB audio:35186kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (/Users/pavel/Movies/matrixbench_mpeg2.mpg):
  Input stream #0:0 (data): 0 packets read (0 bytes); 
  Input stream #0:1 (video): 4690 packets read (119884725 bytes); 4690 frames decoded; 
  Input stream #0:2 (audio): 7819 packets read (9007488 bytes); 7819 frames decoded (9007488 samples); 
  Total: 12509 packets (128892213 bytes) demuxed
Output file #0 (pipe:):
  Output stream #0:0 (video): 0 frames encoded; 4690 packets muxed (300160 bytes); 
  Output stream #0:1 (audio): 7819 frames encoded (9007488 samples); 7819 packets muxed (36029952 bytes); 
  Total: 12509 packets (36330112 bytes) muxed
12510 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x2228500] Statistics: 136396944 bytes read, 3 seeks

