[Mplayer-dev-eng] libavcodec speed

Felix Buenemann atmosfear at users.sourceforge.net
Fri Aug 3 06:58:17 CEST 2001


On Friday,  3. August 2001 02:31, you wrote:
> Hi,
>
> Just made some tricky benchmarks to check what is still slow in it.
> I have a really fast system now (today upgraded to celeron2-800 at 1066, 133 MHz
> FSB) but it's still not fast enough. I thought it was a memory bandwidth/speed
> problem, but how does divx4 solve it then?
>
> Now, here are some results.
> I tested 3 variants: 1 with full divx decoding (perfect picture), 1 with
> disabled motion compensation code (buggy shit :)) and one with disabled DCT
> too (parsing only -> empty picture)
>
> File:  /3d/divx/sample.light.it.up.avi
> VIDEO:  [DIV3]  640x352  24bpp  23.98 fps  823.5 kbps (100.5 kbyte/s)
> full      : 10.272s 10.290s 10.257s
> no MC     : 8.087s 8.083s 8.090s     =>  78.7%
> no MC+DCT : 3.322s 3.327s 3.314s     =>  32.34%
> MC: 21.3%  DCT: 46.4%  PARSER: 32.3%
>
> File:  /3d/divx/405divx_sm_v2[1].avi
> VIDEO:  [DIV3]  356x240  24bpp  30.00 fps  343.0 kbps (41.9 kbyte/s)
> full      : 10.208s 10.129s 10.137s
> no MC     : 7.485s 7.461s 7.493s     => 73.8%
> no MC+DCT : 3.972s 3.988s 3.980s     => 39.25%
> MC: 26.2%  DCT: 34.55%  PARSER: 39.25%
>
> File:  /3d/divx/Coyote.Ugly.Sample-highbitrate-atmos.avi
> VIDEO:  [DIV3]  640x480  24bpp  25.00 fps  1867.1 kbps (227.9 kbyte/s)
> full      : 15.425s 15.409s 15.312s
> no MC     : 13.056s 13.065s 13.029s  => 84.8%
> no MC+DCT : 5.731s 5.759s 5.774s     => 37.3%
> MC: 15.2%  DCT: 47.5%  PARSER: 37.3%
>
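For reference, the MC/DCT/PARSER lines above follow from the three timings by simple subtraction: each disabled stage accounts for the time that disappears when it is switched off. A minimal sketch of that arithmetic, where the struct and function names are made up for illustration and are not part of libavcodec:

```c
/* Hypothetical helper, not libavcodec code: splits the full decode time
 * into per-stage shares, in percent of the full run. */
typedef struct {
    double mc;     /* motion compensation share */
    double dct;    /* DCT share                 */
    double parser; /* bitstream parsing share   */
} Breakdown;

static Breakdown decode_breakdown(double t_full,      /* full decoding        */
                                  double t_no_mc,     /* MC disabled          */
                                  double t_no_mc_dct) /* MC and DCT disabled  */
{
    Breakdown b;
    /* Time saved by disabling MC is attributed to MC. */
    b.mc = 100.0 * (t_full - t_no_mc) / t_full;
    /* Time saved by additionally disabling DCT is attributed to DCT. */
    b.dct = 100.0 * (t_no_mc - t_no_mc_dct) / t_full;
    /* Whatever remains is parsing (plus any fixed overhead). */
    b.parser = 100.0 * t_no_mc_dct / t_full;
    return b;
}
```

Feeding it the first run of the first file, decode_breakdown(10.272, 8.087, 3.322), gives roughly the 21.3 / 46.4 / 32.3 split quoted above. Note that this assumes the stages do not overlap and that disabling one stage does not change the cost of the others (cache effects, for instance), which is only approximately true.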
> As you can see, MC (the biggest unaligned memory user) uses the
> least CPU time. DCT is too slow, and the parser is slow too.
> I think the parser could be faster if compiled with the Intel C compiler, or
> optimized at a high level, just like I did with opendivx's postprocess.c.
> I think the parser should be much faster than it is now. Maybe we should add
> some inline assembly getbit functions.
> But why is DCT so slow? Is it so hard to calculate? Or is it unoptimized?
It is unoptimized, IMHO, because the i386 dir contains an asm forward DCT,
which is used for MPEG encoding but not decoding. We need an optimized
inverse discrete cosine transform (IDCT) in that place. An optimized IDCT
version of Intel's application note 922, which is also used for the FDCT in
libavcodec, can be found at http://elecard.com/peter/idct.shtml
The AT&T-syntax version of it is, IMHO, included in libmpeg2.
>
> What about using 'prefetch' instruction? It could help at some point.
>
>
> A'rpi / Astral & ESP-team
>
-- 
Best Regards,
   Felix



