[FFmpeg-devel] [PATCH] run level decode function for wma & wmapro

Sascha Sommer saschasommer
Sat Jun 20 09:56:19 CEST 2009


Hi,

> > > > Attached patch changes the code to use floats so that it can be
> > > > shared with wmapro (that reuses the coefficient decoding buffer as
> > > > output buffer)
> > >
> > > what effect does this have on speed on
> > > 1. normal desktop cpus
> > > 2. fpu less cpus (limited to ones on which wma is useable of course)
> > >
> > > [...]
> >
> > No idea about FPU-less CPUs. On my 1.6 GHz Pentium M, with gcc 4.2.1 and the
> > tc316_3.wmv sample, I get the following:
> >
> > int
> > 29150 dezicycles in rl_decode, 1 runs, 0 skips
> > 21955 dezicycles in rl_decode, 2 runs, 0 skips
> > 177420 dezicycles in rescale, 1 runs, 0 skips
> > 164600 dezicycles in rescale, 2 runs, 0 skips
> > 24440 dezicycles in rl_decode, 4 runs, 0 skips
> > 104967 dezicycles in rescale, 4 runs, 0 skips
> > 25856 dezicycles in rl_decode, 8 runs, 0 skips
> > 73722 dezicycles in rescale, 8 runs, 0 skips
> > 52556 dezicycles in rl_decode, 16 runs, 0 skips
> > 115481 dezicycles in rescale, 16 runs, 0 skips
> > 58356 dezicycles in rl_decode, 32 runs, 0 skips
> > 123329 dezicycles in rescale, 32 runs, 0 skips
> > 57448 dezicycles in rl_decode, 64 runs, 0 skips
> > 111245 dezicycles in rescale, 64 runs, 0 skips
> > 50067 dezicycles in rl_decode, 128 runs, 0 skips
> > 97719 dezicycles in rescale, 128 runs, 0 skips
> > 45674 dezicycles in rl_decode, 256 runs, 0 skips
> > 95783 dezicycles in rescale, 256 runs, 0 skips
> >
> > float
> > 20210 dezicycles in rl_decode, 1 runs, 0 skips
> > 14300 dezicycles in rl_decode, 2 runs, 0 skips
> > 145320 dezicycles in rescale, 1 runs, 0 skips
> > 137050 dezicycles in rescale, 2 runs, 0 skips
> > 20035 dezicycles in rl_decode, 4 runs, 0 skips
> > 87932 dezicycles in rescale, 4 runs, 0 skips
> > 22455 dezicycles in rl_decode, 8 runs, 0 skips
> > 62375 dezicycles in rescale, 8 runs, 0 skips
> > 44031 dezicycles in rl_decode, 16 runs, 0 skips
> > 94960 dezicycles in rescale, 16 runs, 0 skips
> > 49922 dezicycles in rl_decode, 32 runs, 0 skips
> > 102565 dezicycles in rescale, 32 runs, 0 skips
> > 49965 dezicycles in rl_decode, 64 runs, 0 skips
> > 93084 dezicycles in rescale, 64 runs, 0 skips
> > 44103 dezicycles in rl_decode, 128 runs, 0 skips
> > 82231 dezicycles in rescale, 128 runs, 0 skips
> > 40307 dezicycles in rl_decode, 256 runs, 0 skips
> > 81798 dezicycles in rescale, 256 runs, 0 skips
> >
> > int16
> > 20620 dezicycles in rl_decode, 1 runs, 0 skips
> > 15905 dezicycles in rl_decode, 2 runs, 0 skips
> > 147660 dezicycles in rescale, 1 runs, 0 skips
> > 140955 dezicycles in rescale, 2 runs, 0 skips
> > 25125 dezicycles in rl_decode, 4 runs, 0 skips
> > 91125 dezicycles in rescale, 4 runs, 0 skips
> > 24475 dezicycles in rl_decode, 8 runs, 0 skips
> > 64755 dezicycles in rescale, 8 runs, 0 skips
> > 43356 dezicycles in rl_decode, 16 runs, 0 skips
> > 91768 dezicycles in rescale, 16 runs, 0 skips
> > 47933 dezicycles in rl_decode, 32 runs, 0 skips
> > 96890 dezicycles in rescale, 32 runs, 0 skips
> > 47413 dezicycles in rl_decode, 64 runs, 0 skips
> > 87231 dezicycles in rescale, 64 runs, 0 skips
> > 41867 dezicycles in rl_decode, 128 runs, 0 skips
> > 76848 dezicycles in rescale, 128 runs, 0 skips
> > 38301 dezicycles in rl_decode, 256 runs, 0 skips
> > 75213 dezicycles in rescale, 256 runs, 0 skips
> >
> > The current code is the fastest, but 16 bits are not enough for wmapro.
>
> could you change things to a COEF_TYPE or something that can be
> changed at compile time?
>
> [...]

Done. See the attached patch.
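
For readers without the attachment, here is a rough sketch of the idea of a compile-time coefficient type. It is not the code from the patch. The typedef coef_t, the function rl_expand and the default of float are assumptions made for illustration only, and the real decoder reads its run/level pairs from VLC codes in the bitstream rather than from plain arrays.

```c
#include <assert.h>

/* Hypothetical compile-time switch: build with -DCOEF_TYPE=int16_t (plus
 * <stdint.h>) or -DCOEF_TYPE=int to try another type. The float default
 * is an assumption for this sketch, not taken from the patch. */
#ifndef COEF_TYPE
#define COEF_TYPE float
#endif

typedef COEF_TYPE coef_t;

/* Expand (run, level) pairs into a coefficient buffer: each pair skips
 * `run` zero coefficients and then stores `level`.
 * Returns the position after the last stored coefficient, or -1 if a
 * pair would land outside the buffer. */
static int rl_expand(const int *runs, const int *levels, int nb_pairs,
                     coef_t *out, int out_size)
{
    int pos = 0;

    assert(out && out_size >= 0);

    /* Coefficients skipped by a run must read back as zero. */
    for (int i = 0; i < out_size; i++)
        out[i] = 0;

    for (int i = 0; i < nb_pairs; i++) {
        pos += runs[i];            /* skip the zero run */
        if (pos >= out_size)
            return -1;             /* pair does not fit in the buffer */
        out[pos++] = (coef_t)levels[i];
    }
    return pos;
}
```

With runs {0, 2, 1} and levels {5, -3, 7}, this fills an 8-entry buffer as 5 0 0 -3 0 7 0 0. Because the element type is fixed when compiling, the same function body can serve a 16 bit integer buffer or a float buffer that is later reused as the output buffer.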

Regards

Sascha


-------------- next part --------------
A non-text attachment was scrubbed...
Name: shared_wma_rl_decode_try2_float2.patch
Type: text/x-diff
Size: 4281 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090620/2fde6065/attachment.patch>


