[FFmpeg-devel] [PATCH 0/8] AVX-512 support (v.1)

James Darnley jdarnley at obe.tv
Mon Oct 30 15:08:27 EET 2017

This patch set adds the support needed for AVX-512 functions to be written.
While not immediately useful, this does let people start writing them.

Presently it lumps a "more manageable" set of sub-features into the overall
AVX-512 flag.
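
To illustrate what lumping sub-features into one flag can look like, here is a
minimal sketch in C.  It is not the code from this patch set.  It assumes the
subset is F, CD, BW, DQ and VL (the cover letter does not list them), and the
macro and function names are hypothetical.  The bit positions are those of
CPUID leaf 7 (sub-leaf 0) EBX and of XCR0.  The caller is expected to obtain
both values itself, for example with the cpuid and xgetbv instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 7, sub-leaf 0, EBX feature bits. */
#define CPUID7_EBX_AVX512F  (UINT32_C(1) << 16)
#define CPUID7_EBX_AVX512DQ (UINT32_C(1) << 17)
#define CPUID7_EBX_AVX512CD (UINT32_C(1) << 28)
#define CPUID7_EBX_AVX512BW (UINT32_C(1) << 30)
#define CPUID7_EBX_AVX512VL (UINT32_C(1) << 31)

/* ASSUMPTION: the sub-features folded into the single flag. */
#define AVX512_SUBSET (CPUID7_EBX_AVX512F  | CPUID7_EBX_AVX512DQ | \
                       CPUID7_EBX_AVX512CD | CPUID7_EBX_AVX512BW | \
                       CPUID7_EBX_AVX512VL)

/* XCR0 bits the OS must enable: SSE (1), AVX (2), opmask (5),
 * ZMM_Hi256 (6) and Hi16_ZMM (7). */
#define XCR0_AVX512_STATE UINT64_C(0xe6)

/* Hypothetical helper: true only when every sub-feature in the subset is
 * reported by the CPU and the OS saves the full ZMM and opmask state. */
static bool cpu_has_avx512_subset(uint32_t cpuid7_ebx, uint64_t xcr0)
{
    if ((xcr0 & XCR0_AVX512_STATE) != XCR0_AVX512_STATE)
        return false;
    return (cpuid7_ebx & AVX512_SUBSET) == AVX512_SUBSET;
}
```

With a single flag, a processor that implements only some of the sub-features
is treated as having no AVX-512 at all, which keeps the dispatch code simple.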

Just to be clear: current processors severely limit the performance when executing
instructions on ZMM registers (512-bit).  Switching between the states also
takes some time.  Therefore functions that use them should be tested carefully
in "real-world" conditions to ensure that overall performance doesn't drop.

However, as Gramner points out in his x86inc commit, it provides an additional
16 registers for a total of 32 SIMD registers.  These can all be used in XMM
(128-bit) and YMM (256-bit) forms.  New instructions can also be used on these
smaller registers.

There are 2 commits here that I don't intend to be applied (now).  The first is
the alignment increase reported by avutil.  The second is the v210enc function:
it passes checkasm but it is not any faster.  It is there to show that all the
previous commits work correctly, namely: configure checks, cpuid detection,
x86inc changes, checkasm.

P.S.  I forgot to reword the commit message of "x86inc: reduce difference to
x264 upstream" to state what it does and why.  The smartalign directive is
documented here: http://www.nasm.us/xdoc/2.13.01/html/nasmdoc5.html#section-5.2

Henrik Gramner (1):
  x86inc: AVX-512 support

James Darnley (7):
  configure: test whether x86 assembler supports AVX-512
  avutil: add AVX-512 flags
  avutil: detect when AVX-512 is available
  avutil: add alignment needed for AVX-512
  x86inc: reduce difference to x264 upstream
  checkasm: support for AVX-512 functions
  avcodec/v210enc: add AVX-512 10-bit line pack function

 configure                     |   5 ++
 libavcodec/x86/v210enc.asm    |   5 ++
 libavcodec/x86/v210enc_init.c |   7 ++
 libavutil/cpu.c               |   6 +-
 libavutil/cpu.h               |   1 +
 libavutil/tests/cpu.c         |   1 +
 libavutil/x86/cpu.c           |  11 +++
 libavutil/x86/cpu.h           |   2 +
 libavutil/x86/x86inc.asm      | 188 ++++++++++++++++++++++++++++++++++--------
 tests/checkasm/checkasm.c     |   1 +
 10 files changed, 191 insertions(+), 36 deletions(-)

