[MPlayer-dev-eng] [PATCH] xvidix threading
Howard Chu
hyc at highlandsun.com
Tue Sep 18 20:09:36 CEST 2007
(Resending after some cleanup...) I've been messing around with my Dvico
FusionHDTV tuner card and running into performance problems. Playing
standard-definition channels is easy enough, but a full 1920x1080i channel
was just too much with xvidix/radeon for output. oprofile showed that the
bulk of the execution time is in memcpy, called from vidix_draw_slice:
CPU: AMD64 processors, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 50000
samples  %        image name  app name  symbol name
179887   40.7671  mplayer     mplayer   fast_memcpy
49657    11.2536  mplayer     mplayer   swScale_MMX2
29557     6.6984  mplayer     mplayer   mmxext_idct
20101     4.5554  mplayer     mplayer   hcscale_MMX2
15602     3.5358  mplayer     mplayer   slice_intra_DCT
15397     3.4894  mplayer     mplayer   MC_put_o_16_mmxext
14941     3.3860  mplayer     mplayer   mpeg2_slice
11475     2.6005  mplayer     mplayer   motion_fr_frame_420
9799      2.2207  mplayer     mplayer   get_non_intra_block
CPU: AMD64 processors, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 50000
samples  %        image name  app name  symbol name
-------------------------------------------------------------------------------
  142     0.4817  mplayer     mplayer   vidix_draw_slice_420
  29337  99.5183  mplayer     mplayer   fast_memcpy
29479    22.3475  mplayer     mplayer   fast_memcpy_SSE
  29479  100.000  mplayer     mplayer   fast_memcpy_SSE [self]
-------------------------------------------------------------------------------
  73      0.9778  mplayer     mplayer   mpeg2_slice
  7393   99.0222  mplayer     mplayer   slice_non_intra_DCT
7466      5.6598  mplayer     mplayer   get_non_intra_block
  7466   100.000  mplayer     mplayer   get_non_intra_block [self]
-------------------------------------------------------------------------------
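For readers unfamiliar with what a slice copy involves, here is a minimal
sketch of the kind of work the profile attributes to fast_memcpy. It is not
the actual vidix_draw_slice code: copy_plane, copy_slice_420 and
bytes_per_frame_420 are hypothetical names, and the sketch assumes a planar
4:2:0 layout in which the two chroma planes are half the luma width and
height.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not MPlayer code): copy `height` rows of `width`
 * bytes between two planes whose strides may differ. */
void copy_plane(uint8_t *dst, size_t dst_stride,
                const uint8_t *src, size_t src_stride,
                size_t width, size_t height)
{
    assert(width <= dst_stride && width <= src_stride);
    for (size_t row = 0; row < height; row++) {
        memcpy(dst, src, width);   /* one memcpy per row */
        dst += dst_stride;
        src += src_stride;
    }
}

/* Copy one slice of a planar 4:2:0 image: plane 0 is full-resolution luma,
 * planes 1 and 2 are chroma at half width and half height (assumed layout). */
void copy_slice_420(uint8_t *dst[3], const size_t dst_stride[3],
                    const uint8_t *src[3], const size_t src_stride[3],
                    size_t width, size_t height)
{
    copy_plane(dst[0], dst_stride[0], src[0], src_stride[0], width, height);
    copy_plane(dst[1], dst_stride[1], src[1], src_stride[1],
               width / 2, height / 2);
    copy_plane(dst[2], dst_stride[2], src[2], src_stride[2],
               width / 2, height / 2);
}

/* Bytes moved per frame: w*h luma plus two (w/2)*(h/2) chroma planes. */
size_t bytes_per_frame_420(size_t w, size_t h)
{
    return w * h + 2 * (w / 2) * (h / 2);
}
```

Under these assumptions a 1920x1080 frame is 3110400 bytes per copy, while
960x540 is 777600 bytes, a quarter as much.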
I tried scaling it down to 960x540, which was enough to bring the CPU usage
down to around 95% or so, but it was still maxing out occasionally. Since
I've got a dual core processor and the other core was idle, I tried moving
the draw_slice function into a separate thread. This turned out to work
pretty well; instead of maxing out at 100% of one core I can now manage
higher resolutions without any frame losses (using both cores). (I also
tried to use the -dr option but it only worked at a few choices of image
sizes. Not sure what the story is there.)
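The general idea of handing the copy to a second thread can be sketched with
POSIX threads as below. This is not the attached patch: the copy_worker type,
its functions and the queue depth of 8 are all hypothetical choices made for
illustration. The decoder thread queues copy jobs and carries on, and a worker
thread performs the memcpy calls on the other core.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_LEN 8   /* arbitrary depth chosen for this sketch */

typedef struct { uint8_t *dst; const uint8_t *src; size_t len; } copy_job;

typedef struct {
    copy_job jobs[QUEUE_LEN];
    size_t head, tail, count;   /* ring buffer state */
    size_t pending;             /* jobs queued or still being copied */
    int quit;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full, idle;
    pthread_t thread;
} copy_worker;

static void *copy_worker_main(void *arg)
{
    copy_worker *w = arg;
    for (;;) {
        pthread_mutex_lock(&w->lock);
        while (w->count == 0 && !w->quit)
            pthread_cond_wait(&w->not_empty, &w->lock);
        if (w->count == 0) {    /* quit requested and queue drained */
            pthread_mutex_unlock(&w->lock);
            return NULL;
        }
        copy_job job = w->jobs[w->head];
        w->head = (w->head + 1) % QUEUE_LEN;
        w->count--;
        pthread_cond_signal(&w->not_full);
        pthread_mutex_unlock(&w->lock);

        memcpy(job.dst, job.src, job.len);   /* done off the decoder thread */

        pthread_mutex_lock(&w->lock);
        if (--w->pending == 0)
            pthread_cond_broadcast(&w->idle);
        pthread_mutex_unlock(&w->lock);
    }
}

/* Returns 0 on success, like pthread_create. */
int copy_worker_start(copy_worker *w)
{
    memset(w, 0, sizeof *w);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->not_empty, NULL);
    pthread_cond_init(&w->not_full, NULL);
    pthread_cond_init(&w->idle, NULL);
    return pthread_create(&w->thread, NULL, copy_worker_main, w);
}

/* Called from the decoder thread; blocks only while the queue is full. */
void copy_worker_submit(copy_worker *w, uint8_t *dst,
                        const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    pthread_mutex_lock(&w->lock);
    while (w->count == QUEUE_LEN)
        pthread_cond_wait(&w->not_full, &w->lock);
    w->jobs[w->tail] = (copy_job){ dst, src, len };
    w->tail = (w->tail + 1) % QUEUE_LEN;
    w->count++;
    w->pending++;
    pthread_cond_signal(&w->not_empty);
    pthread_mutex_unlock(&w->lock);
}

/* Block until every submitted copy has finished. */
void copy_worker_wait(copy_worker *w)
{
    pthread_mutex_lock(&w->lock);
    while (w->pending != 0)
        pthread_cond_wait(&w->idle, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

/* Finish any queued jobs, then join the worker and release resources. */
void copy_worker_stop(copy_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = 1;
    pthread_cond_signal(&w->not_empty);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_mutex_destroy(&w->lock);
    pthread_cond_destroy(&w->not_empty);
    pthread_cond_destroy(&w->not_full);
    pthread_cond_destroy(&w->idle);
}
```

In a design like this the source buffer has to stay untouched until its job
has been copied, so the producer would call copy_worker_wait() before reusing
a buffer or presenting the frame.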
The attached diff (against current svn) is just for your consideration; for
real use it would need a command-line switch to toggle it on/off. By the
way, it seems to have some sync problems when used without the scale filter.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dif.txt
URL: <http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/attachments/20070918/879ad0e7/attachment.txt>