[FFmpeg-devel] [PATCH 08/11] avcodec/v210enc: add AVX-512 10-bit line pack function

Martin Vignali martin.vignali at gmail.com
Mon Nov 13 20:57:12 EET 2017


2017-11-10 22:13 GMT+01:00 James Darnley <jdarnley at obe.tv>:

> On 2017-11-10 14:32, James Darnley wrote:
> > I mentioned previously that using ZMM registers will cause the CPU to
> > reduce its frequency.
> >
> > Gramner said on IRC that a user should spend at least 20-30% of overall
> > runtime in AVX-512/ZMM code for it to be a net gain in speed.
> > From ffmpeg-devel IRC on 2017-10-26:
> >> https://lists.ffmpeg.org/pipermail/ffmpeg-devel-irc/2017-October/004622.html
> >> [18:49:26 CEST] <Gramner> J_Darnley: be aware that using zmm registers
> >> induces significant frequency drops which reduces performance of everything
> >> else, so if you want to use 512-bit vectors you better go all in on it to
> >> make up for it. you probably want to spend at least 20-30% of overall
> >> runtime in avx-512 code
> >> [18:50:00 CEST] <Gramner> the alternative is to stay in 256-bit mode
> >> and just leverage new instructions and opmasks
> >
> > This means any cycles you might save by using longer registers, fewer
> > instructions, better instructions, whatever, will be lost because the
> > frequency drops, so everything takes longer to execute overall.
>
> Some details about this can be found in one of Intel's documents: Intel®
> 64 and IA-32 Architectures Optimization Reference Manual
> Order Number: 248966-038
> October 2017
> > https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf
> Specifically section 15.26 "SKYLAKE SERVER POWER MANAGEMENT"
>
> Earlier on the ffmpeg-devel IRC channel I posted a link to Cloudflare's
> blog in which they discuss the effects of running just a few (my words)
> AVX-512/ZMM instructions.
> > https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/
>
> In the worst cases on some of the new processors the frequency drop can
> be 1 GHz.  In Cloudflare's case, spending just about 2.5% of time in a
> cryptography function using AVX-512 was causing a 10% drop in their
> overall performance (requests served per second).
>
> After seeing this and the discussion on IRC I won't commit any of the
> function patches.  The functions are not very impressive and are likely
> to make everything else slower.
>
> The IRC log should appear at the link below.
> > https://lists.ffmpeg.org/pipermail/ffmpeg-devel-irc/2017-November/004651.html
>

Thanks for the detailed explanations.
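
For reference, the break-even argument quoted above can be put into a rough
back-of-the-envelope model. This is only a sketch under assumptions of my own:
the whole program is taken to run at the reduced clock, and the values
s = 2.0 (cycle speedup of the vectorized code) and r = 0.85 (reduced clock
divided by nominal clock) are hypothetical, not measurements of the v210enc
functions or of any particular CPU.

```python
def net_speedup(f, s, r):
    """Overall speedup of a program after switching part of it to AVX-512.

    f: fraction of the baseline runtime spent in the code being vectorized
    s: factor by which that code needs fewer cycles with ZMM registers
    r: reduced clock / nominal clock, assumed here to apply to everything
    """
    # Cycles needed relative to baseline: untouched part plus faster part.
    relative_cycles = (1.0 - f) + f / s
    # Time is cycles divided by frequency, so speedup is r / relative_cycles.
    return r / relative_cycles


def break_even_fraction(s, r):
    """Smallest f for which net_speedup(f, s, r) reaches 1.0."""
    # Solve (1 - f) + f / s == r for f.
    return (1.0 - r) / (1.0 - 1.0 / s)


if __name__ == "__main__":
    s, r = 2.0, 0.85  # hypothetical values, see the note above
    print("break-even fraction:", break_even_fraction(s, r))
    print("net speedup at f = 0.025:", net_speedup(0.025, s, r))
```

With those assumed values the break-even fraction falls in the same range as
the 20-30% rule of thumb, and a program that spends only a few percent of its
time in the ZMM code ends up slower overall. Real behaviour depends on how long
the core stays at the lower frequency after the last 512-bit instruction, which
this model ignores.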

Martin

