[MPlayer-dev-eng] xvidix threading
Howard Chu
hyc at highlandsun.com
Sun Sep 16 16:39:12 CEST 2007
(Resending after subscribing...) I've been experimenting with my Dvico FusionHDTV tuner card and
running into performance problems. Standard-definition channels play fine, but a full 1920x1080i
channel is more than the machine can keep up with when using xvidix/radeon for output.
oprofile showed that the bulk of the execution time is in memcpy, from vidix_draw_slice:
CPU: AMD64 processors, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 50000
samples  %        image name  app name  symbol name
179887   40.7671  mplayer     mplayer   fast_memcpy
49657    11.2536  mplayer     mplayer   swScale_MMX2
29557     6.6984  mplayer     mplayer   mmxext_idct
20101     4.5554  mplayer     mplayer   hcscale_MMX2
15602     3.5358  mplayer     mplayer   slice_intra_DCT
15397     3.4894  mplayer     mplayer   MC_put_o_16_mmxext
14941     3.3860  mplayer     mplayer   mpeg2_slice
11475     2.6005  mplayer     mplayer   motion_fr_frame_420
9799      2.2207  mplayer     mplayer   get_non_intra_block
CPU: AMD64 processors, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 50000
samples  %        image name  app name  symbol name
-------------------------------------------------------------------------------
  142      0.4817  mplayer     mplayer   vidix_draw_slice_420
  29337   99.5183  mplayer     mplayer   fast_memcpy
29479     22.3475  mplayer     mplayer   fast_memcpy_SSE
  29479   100.000  mplayer     mplayer   fast_memcpy_SSE [self]
-------------------------------------------------------------------------------
  73       0.9778  mplayer     mplayer   mpeg2_slice
  7393    99.0222  mplayer     mplayer   slice_non_intra_DCT
7466       5.6598  mplayer     mplayer   get_non_intra_block
  7466    100.000  mplayer     mplayer   get_non_intra_block [self]
-------------------------------------------------------------------------------
I tried scaling the picture down to 960x540, which brought CPU usage down to around 95%, but it
still maxed out occasionally. Since I have a dual-core processor and the other core was idle, I
tried moving the draw_slice function into a separate thread. This turned out to work pretty well:
instead of maxing out one core at 100%, I can now play higher resolutions without any frame loss
by using both cores. (I also tried the -dr option, but it only worked at a few image sizes. Not
sure what the story is there.)
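
To illustrate the idea of handing the slice copy to a second thread, here is a minimal sketch using
POSIX threads. It is not the attached patch and it does not use MPlayer or vidix APIs: slice_job,
slice_worker, and the worker_* functions are hypothetical names invented for this example, and the
single-slot queue is an assumption chosen for brevity. The calling (decoder) thread submits one
plane slice at a time and a worker thread performs the stride-aware memcpy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical job: copy one plane slice into the output buffer. */
typedef struct {
    const uint8_t *src;
    int src_stride;      /* bytes between source rows */
    uint8_t *dst;
    int dst_stride;      /* bytes between destination rows */
    int width;           /* bytes to copy per row */
    int height;          /* number of rows */
} slice_job;

/* Hypothetical worker with a single-slot job queue. */
typedef struct {
    pthread_t tid;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    slice_job job;
    int pending;         /* 1 while a job is queued or being copied */
    int quit;
} slice_worker;

static void copy_slice(const slice_job *j)
{
    for (int y = 0; y < j->height; y++)
        memcpy(j->dst + (size_t)y * j->dst_stride,
               j->src + (size_t)y * j->src_stride, (size_t)j->width);
}

static void *worker_main(void *arg)
{
    slice_worker *w = arg;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->pending && !w->quit)
            pthread_cond_wait(&w->cond, &w->lock);
        if (!w->pending)                 /* quit requested, nothing queued */
            break;
        slice_job j = w->job;
        pthread_mutex_unlock(&w->lock);  /* copy without holding the lock */
        copy_slice(&j);
        pthread_mutex_lock(&w->lock);
        w->pending = 0;
        pthread_cond_broadcast(&w->cond);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static int worker_start(slice_worker *w)
{
    memset(w, 0, sizeof(*w));
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Queue one slice; blocks only while the previous slice is still in flight. */
static void worker_submit(slice_worker *w, const slice_job *j)
{
    assert(j->width <= j->src_stride && j->width <= j->dst_stride);
    pthread_mutex_lock(&w->lock);
    while (w->pending)
        pthread_cond_wait(&w->cond, &w->lock);
    w->job = *j;
    w->pending = 1;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Wait until the queued slice has been copied. */
static void worker_flush(slice_worker *w)
{
    pthread_mutex_lock(&w->lock);
    while (w->pending)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

static void worker_stop(slice_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = 1;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->tid, NULL);
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->lock);
}
```

In this sketch the caller would call worker_submit() where it previously copied the slice itself,
and worker_flush() before reusing the source buffer or flipping the page, since the source data
has to stay valid until the copy finishes. Build with -pthread.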
The attached diff (against current svn) is just for your consideration; for real use it would need
a command-line switch to toggle it on and off. By the way, it seems to have some sync problems when
used without the scale filter.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: vidixdif.txt
URL: <http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/attachments/20070916/4c04af73/attachment.txt>