[MPlayer-dev-eng] [PATCH] EOSD support for VDPAU
Grigori G
greg at chown.ath.cx
Sun Feb 22 17:04:06 CET 2009
Reimar Döffinger wrote:
> Do you have some actual benchmark numbers? Also trying to find a surface that
> matches closely in size instead of just matching somehow might greatly
> improve this without being much effort.
> I'd also like to get some idea what the performance is like compared to
> the vo gl method - if you do not want to implement the method just some
> -benchmark numbers for
>
> - gl with and without subtitles on
> - vdpau with and without subtitles and with subtitles and the POT code
> removed
>
> Would be good enough for now. Disabling vsync might give more useful
> numbers.
>
Some quick benchmarks on one particular 2-minute sample
(each result is the best of three runs):
VDPAU
No subs:
BENCHMARKs: VC: 13.399s VO: 2.019s A: 0.000s Sys: 0.179s = 15.597s
BENCHMARK%: VC: 85.9062% VO: 12.9450% A: 0.0000% Sys: 1.1488% = 100.0000%
Subs, with POT code:
BENCHMARKs: VC: 13.776s VO: 3.658s A: 0.000s Sys: 0.199s = 17.632s
BENCHMARK%: VC: 78.1288% VO: 20.7450% A: 0.0000% Sys: 1.1262% = 100.0000%
Subs, POT code removed:
BENCHMARKs: VC: 13.391s VO: 3.708s A: 0.000s Sys: 0.187s = 17.286s
BENCHMARK%: VC: 77.4659% VO: 21.4533% A: 0.0000% Sys: 1.0808% = 100.0000%
GL
No subs:
BENCHMARKs: VC: 13.868s VO: 2.068s A: 0.000s Sys: 0.193s = 16.129s
BENCHMARK%: VC: 85.9857% VO: 12.8202% A: 0.0000% Sys: 1.1941% = 100.0000%
Subs:
BENCHMARKs: VC: 14.093s VO: 3.467s A: 0.000s Sys: 0.199s = 17.758s
BENCHMARK%: VC: 79.3581% VO: 19.5226% A: 0.0000% Sys: 1.1193% = 100.0000%
So the POT code seems to make no difference in most cases (VO: 3.658s
with it, 3.708s without). With more complex styling, animations and so
on, it helps, but not by much (a few percent speedup). I have removed it
now.
Anyway, I'm not satisfied with the performance in some cases, like in
http://samples.mplayerhq.hu/Matroska/subtitles/090128_gszs02.mkv -- a
huge "wall of text" is translated there, hundreds of glyphs and almost
2000 bitmaps to render. The current code badly chokes on this.
Doing a few simple tests I found out that
a) generate_eosd takes about 2/3 of the time spent on subtitle rendering
(on average)
b) putbits_native is *fast*. Skip it and there's no slowdown at all.
c) as already said, allocating surfaces is quite expensive
Thus I think an approach like in vo_gl might be a lot better. Something like
a) allocating a big surface at output_surface resolution (or more, POT
might be a better idea here)
b) fitting glyphs into this surface with a simple greedy algorithm
c) if it runs out of free space, go back to a)
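
To make steps a) to c) concrete, here is a rough sketch of one possible greedy scheme (row-by-row "shelf" packing). It is not what vo_gl or the patch actually does. The names struct atlas, atlas_reset and atlas_place are made up for illustration, and the code only computes positions; it makes no VDPAU calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one big surface that is filled row by row. */
struct atlas {
    int width, height;  /* size of the big surface */
    int cur_x, cur_y;   /* top-left corner of the next free slot */
    int row_h;          /* height of the tallest bitmap in the current row */
};

/* Step a): start with an empty surface of the given size. */
void atlas_reset(struct atlas *a, int width, int height)
{
    assert(width > 0 && height > 0);
    a->width  = width;
    a->height = height;
    a->cur_x  = a->cur_y = a->row_h = 0;
}

/* Step b): greedily place a w x h bitmap in the current row, opening a new
 * row when the current one is full. Returns true and stores the position in
 * *x, *y on success. Returns false, leaving the atlas unchanged, when the
 * bitmap does not fit; the caller then handles step c). */
bool atlas_place(struct atlas *a, int w, int h, int *x, int *y)
{
    int nx = a->cur_x, ny = a->cur_y, nrow = a->row_h;

    assert(w > 0 && h > 0);
    if (nx + w > a->width) {  /* current row is full: move down one row */
        ny  += nrow;
        nx   = 0;
        nrow = 0;
    }
    if (w > a->width || ny + h > a->height)
        return false;         /* out of free space */

    *x = nx;
    *y = ny;
    a->cur_x = nx + w;
    a->cur_y = ny;
    a->row_h = h > nrow ? h : nrow;
    return true;
}
```

When atlas_place() returns false, the caller would allocate another (or a bigger) surface and call atlas_reset() for it. Rows waste some space when bitmap heights vary a lot, but placing a bitmap costs only a few comparisons.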
> Also I think that a dynamically allocated array for EOSD should be just
> a few modified lines, as I indicated below, so unless I missed something
> there is no excuse to not implement it the first time :-)
> Apart from that (and some additional comments below) the code does look quite good to me.
>
Yes, probably...
> if (!found) {
> j = eosd_max_surfaces;
> eosd_surfaces = realloc(eosd_surfaces, eosd_max_surfaces * sizeof(*eosd_surfaces));
> eosd_targets = realloc(eosd_targets, eosd_max_surfaces * sizeof(*eosd_targets));
> }
But is it really a good idea to realloc for each additional
surface/target that is required? That'll be a lot of reallocs sometimes.
With the new approach this will not be a problem...
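
For reference, the usual way to keep the number of reallocs small is to grow the array geometrically instead of by one entry per request. Below is a minimal sketch of that idea. struct surface_slot, struct slot_array and slot_array_push are hypothetical names, not taken from the patch, and the initial capacity of 8 is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one EOSD surface entry. */
struct surface_slot {
    int w, h;
};

struct slot_array {
    struct surface_slot *items;
    size_t count, capacity;
};

/* Append one zeroed slot and return a pointer to it. The capacity doubles
 * whenever the array is full, so n appends trigger only about log2(n)
 * reallocs. Returns NULL and leaves the array untouched if realloc fails. */
struct surface_slot *slot_array_push(struct slot_array *arr)
{
    struct surface_slot *s;

    assert(arr != NULL);
    if (arr->count == arr->capacity) {
        size_t new_cap = arr->capacity ? arr->capacity * 2 : 8;
        struct surface_slot *p = realloc(arr->items, new_cap * sizeof(*p));
        if (!p)
            return NULL;
        arr->items    = p;
        arr->capacity = new_cap;
    }
    s = &arr->items[arr->count++];
    s->w = s->h = 0;
    return s;
}
```

Starting from a zero-initialized struct slot_array, 100 appends end with a capacity of 128 after five reallocs (8, 16, 32, 64, 128).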
>
>> + vdp_st = vdp_bitmap_surface_putbits_native(eosd_targets[eosd_render_count].surface,
>> + (const void *) &(i->bitmap), &(i->stride), &destRect);
>
> Those () are not needed.
>
OK.
>
> I don't mind, but this check should never be possible, so IMO you might
> as well remove it or use an assert instead.
>
It's a leftover from code re-arrangement. :)
Grigori