[FFmpeg-devel] [PATCH 5/5] aarch64: me_cmp: Don't do uaddlv once per iteration

Martin Storsjö martin at martin.st
Sat Jul 16 12:18:26 EEST 2022


On Fri, 15 Jul 2022, Michael Niedermayer wrote:

> On Fri, Jul 15, 2022 at 10:56:03PM +0300, Martin Storsjö wrote:
>> On Fri, 15 Jul 2022, Swinney, Jonathan wrote:
>>
>>> If the max height is just 16, then this should be fine. I assumed that h
>>> could have a much higher value (>1024), but if that is not the case,
>>> then this is a useful optimization.
>>
>> At least according to the me_cmp.h header, which says:
>>
>> /* Motion estimation:
>>  * h is limited to { width / 2, width, 2 * width },
>>  * but never larger than 16 and never smaller than 2.
>>  * Although currently h < 4 is not used as functions with
>>  * width < 8 are neither used nor implemented. */
>
> These rules were written with support for encoding of all
> standard formats in mind at the time that was written.
> Today it may make sense to extend these rules to cover the
> things which were created since then

Also - if extending this, I would expect that you want other widths too.
Right now, most of the functions seem to be arranged such that [0] is w=16
and [1] is w=8. Of those, the w=8 functions seem to be mostly hardcoded to
assume h=8, while the w=16 functions actually honor the h parameter.

If h>256 ever became relevant, that wouldn't be for the existing w=8 or
w=16 functions, but for newer functions with a larger width too.

So I think this patch, which works for h up to 256, is safe, and if
someone wants to extend the interface later, that can be done.
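
To illustrate where a limit around 256 rows comes from, here is a minimal
C sketch. It is not the NEON code from the patch. It assumes that each
16-bit accumulator lane receives at most one 8-bit absolute difference
(0..255) per row, and that the lanes are only summed once after the loop.
The names sad16_ref and sad16_deferred are hypothetical. Under that
assumption a lane can hold 65535 / 255 = 257 worst-case rows before it
wraps.

```c
#include <stdint.h>
#include <stdlib.h>

#define W 16  /* block width, as in the w=16 functions discussed above */

/* Reference: widen every absolute difference straight into a 32-bit sum. */
static uint32_t sad16_ref(const uint8_t *a, const uint8_t *b,
                          int stride, int h)
{
    uint32_t sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < W; x++)
            sum += (uint32_t)abs(a[x] - b[x]);
        a += stride;
        b += stride;
    }
    return sum;
}

/* Deferred reduction (assumed layout): one 16-bit accumulator per column,
 * reduced to a single sum only once, after the loop. */
static uint32_t sad16_deferred(const uint8_t *a, const uint8_t *b,
                               int stride, int h)
{
    uint16_t lane[W] = {0};
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < W; x++)
            lane[x] += (uint16_t)abs(a[x] - b[x]); /* wraps mod 65536 */
        a += stride;
        b += stride;
    }
    uint32_t sum = 0;
    for (int x = 0; x < W; x++)
        sum += lane[x];  /* the single final reduction */
    return sum;
}
```

With one block all 255 and the other all 0, the two functions agree for
h up to 257 and diverge from h=258 on, since 258 * 255 no longer fits in
16 bits. With the documented maximum of h=16 there is plenty of headroom.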

// Martin

