[FFmpeg-devel] [RFC] use ff_avc_find_startcode in ff_find_start_code

Michael Niedermayer michaelni
Tue Feb 19 15:22:21 CET 2008


On Tue, Feb 19, 2008 at 01:30:53PM +0100, Reimar Döffinger wrote:
> Hello,
> On Tue, Feb 19, 2008 at 10:22:09AM +0100, Reimar Döffinger wrote:
> > On Tue, Feb 19, 2008 at 03:57:07AM +0100, Michael Niedermayer wrote:
> > > On Mon, Feb 18, 2008 at 11:07:38PM +0100, Reimar Döffinger wrote:
> > > > First I wonder if that ff_find_start_code function is
> > > > not quite buggy anyway, or is it intentional that it searches for 
> > > > 00 00 01 00 in the part involving the state but for 00 00 01 ?? in the
> > > > later code? If so, could somebody document the code?
> > > > Anyway, this is a quite ugly patch that makes the function use
> > > > ff_avc_find_startcode (since that is in lavf, it can't be used as is of
> > > > course).
> > > > It probably also breaks the original use of ff_avc_find_startcode,
> > > > though I found the current behaviour a bit strange as well, and this
> > > > function is undocumented, too.
> > > > This causes at least a 6% speedup when decoding
> > > > http://samples.mplayerhq.hu/GXF/THX_Science_FLT_1920.gxf (I only tested
> > > > with MPlayer though).
> > > 
> > > current code:
> > > 7054 dezicycles in findsc, 262131 runs, 13 skips
> > > 7082 dezicycles in findsc, 262134 runs, 10 skips
> > > 
> > > your code:
> > > 11371 dezicycles in findsc, 262119 runs, 25 skips
> > > 11624 dezicycles in findsc, 262115 runs, 29 skips
> > > 
> > > gcc: 4.3.0 20080127 (experimental)
> > > 800mhz duron
> > > ffmpeg -v 9 -i matrixbench_mpeg2.mpg -vcodec copy -an -y test.avi
> > 
> > Please use MPlayer, for some reason it gives these numbers:
> > current code:
> > 1039 dezicycles in test, 4096 runs, 0 skips
> > 1031 dezicycles in test, 8192 runs, 0 skips
> > 1022 dezicycles in test, 16384 runs, 0 skips
> > 
> > my code:
> > 623 dezicycles in test, 4096 runs, 0 skips
> > 624 dezicycles in test, 8192 runs, 0 skips
> > 631 dezicycles in test, 16384 runs, 0 skips
> 
> I think I found the reason for the discrepancies, the current code seems
> about 25% faster with the parser, whereas the decoder is about the same
> amount slower...
> Can someone help me find out why, or should we just use two different
> implementations?

As I've already said, the decoder only searches 3 bytes, and what you print
above is a difference of 40 CPU cycles. First, I don't understand what causes
that. Second, 40 cycles per slice, with 36 slices in 576 lines at 25 fps, is
36k cycles per second. That is just 0.0072% on a 500 MHz system. This has
absolutely no chance of causing any measurable difference.
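
The estimate above can be checked with a few lines of C. This is only a
sketch of the arithmetic: the function names are made up for illustration,
the 40 cycles, 36 slices, 25 fps and 500 MHz figures are the ones quoted in
this mail, and it assumes a "dezicycle" in the timer output is one tenth of
a CPU cycle.

```c
#include <stdio.h>

/* Assumption: a "dezicycle" in the timer output is one tenth of a CPU cycle. */
static double dezicycles_to_cycles(double dezicycles)
{
    return dezicycles / 10.0;
}

/* Extra cycles spent per second for a fixed extra cost per slice. */
static double extra_cycles_per_second(double cycles_per_slice,
                                      int slices_per_frame, int fps)
{
    return cycles_per_slice * slices_per_frame * fps;
}

/* The same cost as a percentage of the CPU's cycles per second. */
static double cpu_share_percent(double cycles_per_second, double cpu_hz)
{
    return 100.0 * cycles_per_second / cpu_hz;
}

/* Hypothetical helper that prints the estimate for the quoted figures. */
static void print_estimate(void)
{
    double extra = extra_cycles_per_second(40.0, 36, 25);

    /* 1031 and 624 are two of the dezicycle readings quoted above. */
    printf("measured gap: about %.1f cycles\n",
           dezicycles_to_cycles(1031.0 - 624.0));
    printf("%.0f cycles/s = %.4f%% of 500 MHz\n",
           extra, cpu_share_percent(extra, 500e6));
}
```

Calling print_estimate() from a main() prints both figures; changing the
arguments shows how little the result moves for other frame rates or clock
speeds.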

Also, the 3-minute matrixbench_mpeg2 has 3*60*25*36 = 162000 slices, so
your numbers of 16384 runs look very strange (not to mention that 40 cycles
in code run 16384 times do not matter ...).
Still, it would be interesting to know why there's a 40-cycle difference ...
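
The slice count can be spelled out the same way. Again this is just a
sketch with made-up function names. It assumes a 3-minute clip at 25 fps
with 36 slices per frame, as stated above, and compares that with the 16384
runs in the quoted timer output.

```c
#include <stdio.h>

/* Slices expected in a clip, given its length and frame layout. */
static long expected_slices(long seconds, long fps, long slices_per_frame)
{
    return seconds * fps * slices_per_frame;
}

/* Hypothetical helper: compare the expectation with the reported run count. */
static void print_slice_check(void)
{
    long expected = expected_slices(3 * 60, 25, 36);
    long reported = 16384; /* run count from the quoted timer output */

    printf("expected %ld slices, timer reported %ld runs (%.1f%%)\n",
           expected, reported, 100.0 * reported / expected);
}
```

Calling print_slice_check() from a main() shows how far the reported run
count is from the number of slices in the whole clip.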

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle