[FFmpeg-devel] Possible cache use improvements

Uoti Urpala uoti.urpala
Sun Aug 10 20:34:25 CEST 2008

I wondered why using direct rendering in MPlayer gave no performance
improvement, even when testing with a large mpeg2 video where the
relative extra cost of copying should be significant. I determined that
the most probable cause was less efficient cache use by libavcodec in
the direct rendering case. When *not* using direct rendering, libavcodec
draws each macroblock row of B frames into the same area, then gives the
slice to MPlayer which copies it with fastmemcpy to its final
destination. When using direct rendering, libavcodec decodes each
macroblock row directly to its real position in the frame. It seems that
always decoding a row to the same area and then copying with fastmemcpy
to the final destination is better than decoding directly to the final
destination (which is unlikely to be in cache).
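
To make the two paths concrete, here is a minimal sketch in C. It is not
the libavcodec code: decode_mb_row() is a hypothetical stand-in for the
real row decoder, all other names are made up for illustration, and plain
memcpy stands in for MPlayer's fastmemcpy. It only shows the access
pattern, with every row decoded into one small reused buffer and then
copied out, and does not measure anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MB_SIZE = 16 };  /* height of a luma macroblock row in MPEG-2 */

/* Hypothetical stand-in for the real decoder: writes MB_SIZE lines of
 * `width` bytes at `dst`, using a pattern derived from the row index. */
static void decode_mb_row(uint8_t *dst, size_t stride, size_t width, int mb_row)
{
    for (int y = 0; y < MB_SIZE; y++)
        for (size_t x = 0; x < width; x++)
            dst[(size_t)y * stride + x] = (uint8_t)(mb_row * 31 + y * 7 + (int)x);
}

/* Direct rendering path: each row is decoded in place, so the decoder
 * keeps touching new (probably uncached) parts of the frame. */
static void decode_frame_direct(uint8_t *frame, size_t stride, size_t width,
                                int mb_rows)
{
    for (int r = 0; r < mb_rows; r++)
        decode_mb_row(frame + (size_t)r * MB_SIZE * stride, stride, width, r);
}

/* Scratch path: each row is decoded into the same small buffer (which
 * stays hot in cache), then copied line by line to its real position.
 * `scratch` must hold at least width * MB_SIZE bytes. */
static void decode_frame_via_scratch(uint8_t *frame, size_t stride, size_t width,
                                     int mb_rows, uint8_t *scratch)
{
    assert(scratch != NULL);
    for (int r = 0; r < mb_rows; r++) {
        uint8_t *dst = frame + (size_t)r * MB_SIZE * stride;

        decode_mb_row(scratch, width, width, r);
        for (int y = 0; y < MB_SIZE; y++)
            memcpy(dst + (size_t)y * stride, scratch + (size_t)y * width, width);
    }
}
```

Both functions fill the frame with identical contents; only the order
and locality of the writes differ. Whether the extra copy pays off will
depend on the CPU, the cache sizes and the copy routine used.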

The attached patch gave a more than 5% performance improvement in the
_overall_ CPU use of MPlayer in the test case (Athlon XP, 1280x720 mpeg2
video, vo xv, direct rendering). It changes mpeg2 to always decode a row
to the same place and then copy it with fastmemcpy to the real
destination; this avoids cache misses during the decoding and avoids
placing the picture being written in the cache (unlikely to be accessed
again before it's evicted). This version causes some image corruption
because it uses the top row of the image instead of a proper scratch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpeg2cache.diff
Type: text/x-patch
Size: 2452 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080810/c4fec5a8/attachment.bin>
