[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

Josh Dekker josh at itanimul.li
Tue Jan 12 14:24:07 EET 2021


Hi,

On 2021-01-08 21:36, Reimar.Doeffinger at gmx.de wrote:
> From: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> 
> Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
> available on aarch64.
> For a UHD HDR (10 bit) sample video these were consuming the most time
> and this optimization reduced overall decode time from 19.4s to 16.4s,
> approximately 15% speedup.
> Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
> running on Apple M1.
> ---
>   libavcodec/aarch64/Makefile               |   2 +
>   libavcodec/aarch64/hevcdsp_idct_neon.S    | 426 ++++++++++++++++++++++
>   libavcodec/aarch64/hevcdsp_init_aarch64.c |  45 +++
>   libavcodec/hevcdsp.c                      |   2 +
>   libavcodec/hevcdsp.h                      |   1 +
>   5 files changed, 476 insertions(+)
>   create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
>   create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c
> 
> [...]

AS	libavcodec/aarch64/hevcdsp_idct_neon.o
libavcodec/aarch64/hevcdsp_idct_neon.S: Assembler messages:
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- 
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- 
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- 
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- 
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    	mov v29.16b, v28.16b

This doesn't build with the GNU assembler (GNU Binutils for Ubuntu) 2.34
(aarch64): it rejects `mov v29.4S,v28.4S' at line 418. Thanks for porting
this. I was in the process of writing HEVC assembly (see my set on the ML)
and would be interested in rebasing this on top of that set.
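
For reference, a minimal sketch of the change the assembler output above
points to. It assumes the instruction at line 418 is meant to copy the
whole 128-bit register; the patch body is elided here, so this is not
checked against it. A vector-register mov is an alias of ORR, which only
takes the .8b and .16b arrangements, and .16b is the full-width one.

```
// Rejected by GNU as 2.34 (see the log above):
//      mov     v29.4S, v28.4S
// Full-register copy, assuming all 128 bits of v28 are wanted:
        mov     v29.16b, v28.16b
```

A register-to-register copy does not depend on lane size, so under that
assumption the result is bit-identical to what the .4S form was
presumably meant to do.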

-- 
Josh

