[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

Reimar Döffinger Reimar.Doeffinger at gmx.de
Tue Jan 12 20:29:39 EET 2021



> On 12 Jan 2021, at 13:24, Josh Dekker <josh at itanimul.li> wrote:
> 
> Hi,
> 
> On 2021-01-08 21:36, Reimar.Doeffinger at gmx.de wrote:
>> From: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
>> Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
>> available on aarch64.
>> For a UHD HDR (10 bit) sample video these were consuming the most time
>> and this optimization reduced overall decode time from 19.4s to 16.4s,
>> approximately 15% speedup.
>> Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
>> running on Apple M1.
>> ---
>> libavcodec/aarch64/Makefile               |   2 +
>> libavcodec/aarch64/hevcdsp_idct_neon.S    | 426 ++++++++++++++++++++++
>> libavcodec/aarch64/hevcdsp_init_aarch64.c |  45 +++
>> libavcodec/hevcdsp.c                      |   2 +
>> libavcodec/hevcdsp.h                      |   1 +
>> 5 files changed, 476 insertions(+)
>> create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
>> create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c
>> [...]
> 
> AS	libavcodec/aarch64/hevcdsp_idct_neon.o
> libavcodec/aarch64/hevcdsp_idct_neon.S: Assembler messages:
> libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- `mov v29.4S,v28.4S'

Yes, I noticed that a few days ago; I have now sent the fixed version.
I had only tested with the Apple assembler, assuming the GNU one would behave the same.
Really stupid behaviour by the GNU one, as if the type mattered for a mov instruction; it needlessly complicates macros.
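
For anyone hitting the same error, here is a minimal sketch of the spelling both assemblers should accept. It assumes, as I understand the architecture manual, that the vector MOV is an alias of ORR and is only defined for the byte arrangements (.8b/.16b). The register numbers are simply the ones from the error message above.

```
// Sketch only: v28/v29 are taken from the error message above.
// MOV (vector) is an alias of ORR Vd.T, Vn.T, Vn.T, defined for .8b/.16b only.
mov     v29.16b, v28.16b        // copies all 128 bits of v28 into v29
// mov  v29.4s, v28.4s          // rejected by GNU as: "operand mismatch"
```

A whole-register copy moves the same 128 bits whatever the element size, so the .16b form should be a drop-in replacement.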

> Thanks for porting this, I was in the process of writing HEVC assembly
> (see my set on the ML) and would be interested to rebase this on top of that set.

Sorry, I had not seen that as I’ve only recently started reading the list (well, only my threads to be honest).
Hope I’ve not duplicated/complicated any of your work. I was mostly just
interested in learning something new, otherwise I would have checked first
for related work.

Thanks for the interest,
Reimar

