[FFmpeg-devel] [PATCH] VP8 coeff decoding optimizations

Pascal Massimino pascal.massimino
Tue Aug 3 00:14:25 CEST 2010


On Mon, Aug 2, 2010 at 1:21 PM, Jason Garrett-Glaser
<darkshikari at gmail.com> wrote:

> 2010/8/2 Måns Rullgård <mans at mansr.com>:
> > Jason Garrett-Glaser <darkshikari at gmail.com> writes:
> >
> >> On Mon, Aug 2, 2010 at 5:38 AM, Pascal Massimino
> >> <pascal.massimino at gmail.com> wrote:
> >>> Jason,
> >>>
> >>> On Mon, Aug 2, 2010 at 1:32 AM, Jason Garrett-Glaser
> >>> <darkshikari at gmail.com> wrote:
> >>>
> >>>> Attached are two mutually exclusive VP8 optimization patches.
> >>>>
> >>>> Approach in #1 (test.diff): simplify addressing by eliminating
> >>>> vp8_coeff_band
> >>>> Advantage: one less dereference, seems to be slightly faster, but
> >>>> might depend on the mood of gcc
> >>>>
> >>>
> >>> +1 here. Seems to be a tad faster than test3.diff (gcc 4.2.4 x86-64):
> >>>
> >>> current (timing decode_mb_coeffs()):
> >>> 47533 dezicycles in dec, 131005 runs, 67 skips
> >>> 47594 dezicycles in dec, 130977 runs, 95 skips
> >>> 47681 dezicycles in dec, 131003 runs, 69 skips
> >>> 47503 dezicycles in dec, 130997 runs, 75 skips
> >>>
> >>> test.diff
> >>> 46065 dezicycles in dec, 131004 runs, 68 skips
> >>> 46009 dezicycles in dec, 130996 runs, 76 skips
> >>> 46119 dezicycles in dec, 131035 runs, 37 skips
> >>> 46226 dezicycles in dec, 131000 runs, 72 skips
> >>>
> >>> test3.diff:
> >>> 46255 dezicycles in dec, 131003 runs, 69 skips
> >>> 46156 dezicycles in dec, 131009 runs, 63 skips
> >>> 46263 dezicycles in dec, 131017 runs, 55 skips
> >>
> >> Anyone want to bench on another arch (ARM)?
> >
> > Cortex-A8, gcc 4.3.3-cs2009q1:
> >
> > no patch
> > 1789 dezicycles in dec, 131059 runs, 13 skips
> > 1786 dezicycles in dec, 131069 runs, 3 skips
> > 1786 dezicycles in dec, 131069 runs, 3 skips
> >
> > test.diff
> > 1728 dezicycles in dec, 131065 runs, 7 skips
> > 1726 dezicycles in dec, 131064 runs, 8 skips
> > 1728 dezicycles in dec, 131067 runs, 5 skips
> >
> > test3.diff
> > 1780 dezicycles in dec, 131061 runs, 11 skips
> > 1780 dezicycles in dec, 131065 runs, 7 skips
> > 1784 dezicycles in dec, 131069 runs, 3 skips
> >
> > --
> > Måns Rullgård
> > mans at mansr.com
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at mplayerhq.hu
> > https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel
> >
>
> I guess we now know which is better.  Any thoughts on whether I should
> improve the init code to be less stupid, or does that not matter?
>

Not sure it really matters. That's a TODO(later) at best...
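
For anyone following along without the patch open, here is a rough sketch of
the idea behind approach #1 as described above: instead of mapping each
coefficient position to a band at decode time, the per-band probabilities are
copied out once at init so the hot loop can index by position directly. This
is not the code from test.diff. The type and function names (BandProbs,
PosProbs, expand_probs, lookup_by_band, lookup_by_pos) are made up for
illustration, and only one block type is shown. The position-to-band table
itself is the one defined by the VP8 format.

```c
#include <stdint.h>
#include <string.h>

#define NUM_BANDS 8   /* coefficient bands in VP8 */
#define NUM_POS   16  /* coefficient positions in a 4x4 block */
#define NUM_CTX   3   /* contexts per band */
#define NUM_PROBS 11  /* token probabilities per context */

/* Position-to-band mapping from the VP8 format. */
static const uint8_t coeff_band[NUM_POS] = {
    0, 1, 2, 3, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7
};

/* Hypothetical layouts for a single block type. */
typedef uint8_t BandProbs[NUM_BANDS][NUM_CTX][NUM_PROBS];
typedef uint8_t PosProbs[NUM_POS][NUM_CTX][NUM_PROBS];

/* Init-time expansion: every position gets a copy of its band's
 * probabilities. This is the extra work done once up front. */
static void expand_probs(PosProbs dst, BandProbs src)
{
    for (int pos = 0; pos < NUM_POS; pos++)
        memcpy(dst[pos], src[coeff_band[pos]], sizeof(dst[pos]));
}

/* Band-indexed lookup: one extra table load per coefficient. */
static const uint8_t *lookup_by_band(BandProbs p, int pos, int ctx)
{
    return p[coeff_band[pos]][ctx];
}

/* Position-indexed lookup: no band table in the hot path. */
static const uint8_t *lookup_by_pos(PosProbs p, int pos, int ctx)
{
    return p[pos][ctx];
}
```

The trade-off in this sketch is a table twice the size (16 entries instead of
8 per block type) and a copy at init, in exchange for one less dependent load
per coefficient. Whether the init copy is worth optimizing depends on how
often the probabilities are updated.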


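To put rough numbers on the timings quoted above, the runs can be averaged
and compared against the unpatched baseline. The helper names (mean,
percent_saved) and array names are mine, and the values are copied from the
quoted benchmark output. A plain mean over three or four runs is only a crude
summary, so treat the result as indicative.

```c
#include <assert.h>
#include <stddef.h>

/* dezicycles per decode_mb_coeffs() call, copied from the thread */
static const double x86_current[] = { 47533, 47594, 47681, 47503 };
static const double x86_test[]    = { 46065, 46009, 46119, 46226 };
static const double x86_test3[]   = { 46255, 46156, 46263 };
static const double arm_current[] = { 1789, 1786, 1786 };
static const double arm_test[]    = { 1728, 1726, 1728 };
static const double arm_test3[]   = { 1780, 1780, 1784 };

/* Arithmetic mean of n samples (n must be non-zero). */
static double mean(const double *v, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Percentage of the baseline time saved by the patched build. */
static double percent_saved(double baseline, double patched)
{
    return 100.0 * (baseline - patched) / baseline;
}
```

With these figures, percent_saved(mean(x86_current, 4), mean(x86_test, 4))
comes out at about 3.1%, against about 2.8% for test3.diff. On the Cortex-A8
numbers, test.diff saves about 3.3% and test3.diff about 0.3%.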
>
> Dark Shikari
>


