[FFmpeg-devel] [PATCH 0/5] Provide neon implementation for me_cmp functions

Hubert Mazur hum at semihalf.com
Tue Aug 16 15:20:11 EEST 2022


Add arm64 NEON implementations for functions from the motion estimation
family. All of them were tested and benchmarked using the checkasm tool.
The rare code paths, e.g. when filter_size % 4 != 0, were also tested.
Instructions were manually deinterleaved to reach the best performance.
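
For readers unfamiliar with these metrics, the following is a minimal
scalar sketch of what the sse* and pix_abs* families compute: a sum of
squared errors and a sum of absolute differences over a block. The names
sse_block_ref and sad_block_ref, and the explicit width parameter, are
hypothetical and chosen for illustration only; this is not the code in
libavcodec and not the NEON code from this series.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference: sum of squared errors over a w x h block.
 * stride is the distance in bytes between consecutive rows. */
static int sse_block_ref(const uint8_t *pix1, const uint8_t *pix2,
                         ptrdiff_t stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int d = pix1[x] - pix2[x];
            sum += d * d;
        }
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Hypothetical reference: sum of absolute differences over a w x h block. */
static int sad_block_ref(const uint8_t *pix1, const uint8_t *pix2,
                         ptrdiff_t stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}
```

A scalar loop like this works for any height, whereas an implementation
that unrolls several rows per iteration needs a separate tail path for
the remaining rows, which is presumably why the rare code paths
mentioned above are worth testing explicitly.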

Hubert Mazur (5):
  lavc/aarch64: Add neon implementation for sse16
  lavc/aarch64: Add neon implementation for sse4
  lavc/aarch64: Add neon implementation for pix_abs16_y2
  lavc/aarch64: Add neon implementation for sse8
  lavc/aarch64: Add neon implementation for pix_abs8

 libavcodec/aarch64/me_cmp_init_aarch64.c |  18 ++
 libavcodec/aarch64/me_cmp_neon.S         | 324 +++++++++++++++++++++++
 2 files changed, 342 insertions(+)

-- 
2.34.1


