[FFmpeg-devel] MPEG-2 FLAGS2_FAST benchmarks

Kieran Kunhya kierank at obe.tv
Sun Jun 14 03:33:14 EEST 2020


Hi,


> Problem 1
> You have a mean of around 2100 and a stdev of between about 55 and 80, so if
> by statistically significant you mean 2 stdev, then with the mean you have
> you would declare every optimization of less than 6% to be statistically
> insignificant.
> So by what you say here, it seems to me you would have to suggest that
> every optimization which provides 6% or less overall speedup be removed.
> I doubt many will agree with that.
>

Logical fallacy.
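
For reference, the roughly 6% figure in the quoted text can be reproduced
from the numbers given there (mean of about 2100, stdev of 55 to 80, a
2 stdev criterion). The sketch below is only that arithmetic; the function
name and the `runs` parameter are illustrative assumptions, and treating the
threshold as stdev / sqrt(runs) for an averaged measurement is a
simplification, not a description of how the original benchmark was analysed.

```python
import math

def threshold_percent(stdev, mean, runs=1, k=2.0):
    """Relative difference (in %) equal to k standard errors of a mean of `runs` samples.

    With runs=1 this is simply k standard deviations of a single measurement.
    """
    return 100.0 * k * stdev / (math.sqrt(runs) * mean)

if __name__ == "__main__":
    mean = 2100.0                 # approximate mean from the quoted text
    for stdev in (55.0, 80.0):    # stdev range from the quoted text
        print(f"stdev={stdev:.0f}: {threshold_percent(stdev, mean):.2f}%")
    # prints:
    # stdev=55: 5.24%
    # stdev=80: 7.62%
```

Under the same simplification, averaging more runs lowers the threshold: with
runs=100 the two values above shrink by a factor of ten.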


> Problem 2
> We do not measure speed this way because it is neither reliable nor
> practical. Just look at this, especially the difference and the variation:
> ./ffmpeg -threads 1 -i ~/videos/matrixbench_mpeg2.mpg -f null -
>    8941 decicycles in non-intra, 2097003 runs,    149 skips    speed=53.2x
>    8941 decicycles in non-intra, 2097013 runs,    139 skips    speed=54.1x
>    8942 decicycles in non-intra, 2097038 runs,    114 skips    speed=54.1x
>    8970 decicycles in non-intra, 2097037 runs,    115 skips    speed=54x
>
> ./ffmpeg -threads 1 -flags2 fast -i ~/videos/matrixbench_mpeg2.mpg -f null -
>    8718 decicycles in non-intra, 2097020 runs,    132 skips    speed=54.6x
>    8701 decicycles in non-intra, 2097044 runs,    108 skips    speed=54.6x
>    8718 decicycles in non-intra, 2097034 runs,    118 skips    speed=54.5x
>    8702 decicycles in non-intra, 2097029 runs,    123 skips    speed=54.5x
>
> This difference is statistically significant; I can say this without the
> need to check.
> Tested on an AMD Ryzen 9 3950X, but I expect you will see similar results on
> most CPUs.
>

Did you remove the branch for FLAGS2_FAST and the large amount of inlined
code when making this measurement?
I also note that you just ignore mb_block_count in FLAGS2_FAST mode so this
is not a fair comparison.
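
For scale, the quoted runs can be reduced to relative changes. This is a
minimal sketch that uses only the numbers copied from the quoted output; the
variable and function names are illustrative, and it says nothing about
whether the two configurations are a fair comparison.

```python
from statistics import mean

# Values copied from the quoted runs (4 runs per configuration).
default_decicycles = [8941, 8941, 8942, 8970]
fast_decicycles = [8718, 8701, 8718, 8702]
default_speed = [53.2, 54.1, 54.1, 54.0]
fast_speed = [54.6, 54.6, 54.5, 54.5]

def relative_change(base, new):
    """Change of mean(new) relative to mean(base), in percent."""
    return 100.0 * (mean(new) - mean(base)) / mean(base)

if __name__ == "__main__":
    saved = mean(default_decicycles) - mean(fast_decicycles)
    print(f"non-intra: {saved:.2f} decicycles saved "
          f"({relative_change(default_decicycles, fast_decicycles):.2f}%)")
    print(f"overall speed: {relative_change(default_speed, fast_speed):.2f}%")
    # prints:
    # non-intra: 238.75 decicycles saved (-2.67%)
    # overall speed: 1.30%
```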


> Problem 3
> You don't search for useless code; this is not an "I tested 100
> optimizations and found these not worth it".
> You search for an argument to remove specific pieces of my code.
> That's seriously not making sense to me. Really not.


I am not going to even try to respond to that as it clearly extends into
issues larger than FFmpeg.


> Problem 4
> Try H264 using good old time:
> time ./ffmpeg -thread_type slice -i fate-suite//h264/bbc2.sample.h264 -f null -
> real    0m0,252s
> real    0m0,254s
> real    0m0,254s
> real    0m0,255s
>
> time ./ffmpeg -flags2 fast -thread_type slice -i fate-suite//h264/bbc2.sample.h264 -f null -
> real    0m0,217s
> real    0m0,220s
> real    0m0,218s
> real    0m0,217s
>
> Here, even with a crude way of measuring, we can see a clear and strong
> difference.
>

That's H.264, not MPEG-2, so it is not relevant to this discussion. Is that
file even capable of using slice threading to decode?

Going back to the original point about MPEG-2: if a user chooses
FLAGS2_FAST, they expect it to make a major difference.
I had tested MPEG-2 expecting the difference to be much more significant, but
it is not. They do not expect a gain of 200 decicycles on a single function
(obtained by making up your own dequant).

Kieran

