[FFmpeg-devel] [PATCH 2/2] swscale/aarch64: add vscale specializations

Martin Storsjö martin at martin.st
Sun Apr 17 00:32:46 EEST 2022


On Fri, 15 Apr 2022, Swinney, Jonathan wrote:

> This commit adds new code paths for vscale when filterSize is 2, 4, or 8. By
> using specialized code with unrolling to match the filterSize we can improve
> performance.
>
> | (seconds)   | c6g   |       |       |
> | ------------| ----- | ----- | ----- |
> | filterSize  | 2     | 4     | 8     |
> | original    | 0.581 | 0.974 | 1.744 |
> | optimized   | 0.399 | 0.569 | 1.052 |
> | improvement | 31.1% | 41.6% | 39.7% |
>
> Signed-off-by: Jonathan Swinney <jswinney at amazon.com>
> ---
> libswscale/aarch64/output.S  | 147 +++++++++++++++++++++++++++++++++--
> libswscale/aarch64/swscale.c |  12 +++
> 2 files changed, 153 insertions(+), 6 deletions(-)

I'll have a closer look at the assembly itself at a later time, but first:

The checkasm test in tests/checkasm/sw_scale.c does cover yuv2planeX, but
there's no testing of yuv2plane1; can you extend it to cover that too? Also,
the existing test only exercises filter sizes 1, 4, 8 and 16, so it
apparently needs to be extended to cover size 2 as well.
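
To illustrate the kind of coverage I mean, here is a rough standalone sketch,
not the actual checkasm code. It assumes a simplified scalar vertical scaler
modeled on the C yuv2planeX path, plus a hypothetical unrolled variant for
filterSize 2, and compares the two for sizes 1, 2, 4, 8 and 16. All names
(vscale_ref, vscale_2, vscale_dispatch, check_vscale_sizes) and the dither
values are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TAPS 16
#define WIDTH    64

/* Clamp to the 0..255 range of an 8-bit sample. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Generic reference: one multiply-accumulate per tap per output pixel. */
static void vscale_ref(const int16_t *filter, int filter_size,
                       const int16_t **src, uint8_t *dest, int dst_w,
                       const uint8_t *dither, int offset)
{
    assert(filter_size > 0);
    for (int i = 0; i < dst_w; i++) {
        int val = dither[(i + offset) & 7] << 12;
        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];
        dest[i] = clip_u8(val >> 19);
    }
}

/* Hypothetical specialization: tap loop unrolled for filter_size == 2. */
static void vscale_2(const int16_t *filter, const int16_t **src,
                     uint8_t *dest, int dst_w,
                     const uint8_t *dither, int offset)
{
    const int16_t *s0 = src[0], *s1 = src[1];
    const int f0 = filter[0], f1 = filter[1];
    for (int i = 0; i < dst_w; i++) {
        int val = (dither[(i + offset) & 7] << 12) + s0[i] * f0 + s1[i] * f1;
        dest[i] = clip_u8(val >> 19);
    }
}

/* Pick the specialized path when one exists, else the generic one. */
static void vscale_dispatch(const int16_t *filter, int filter_size,
                            const int16_t **src, uint8_t *dest, int dst_w,
                            const uint8_t *dither, int offset)
{
    if (filter_size == 2)
        vscale_2(filter, src, dest, dst_w, dither, offset);
    else
        vscale_ref(filter, filter_size, src, dest, dst_w, dither, offset);
}

/* Returns the number of mismatching output bytes over all tested sizes. */
static int check_vscale_sizes(void)
{
    static const int sizes[] = { 1, 2, 4, 8, 16 };
    /* Arbitrary example dither row, not taken from FFmpeg. */
    static const uint8_t dither[8] = { 64, 0, 96, 32, 80, 16, 112, 48 };
    static int16_t rows[MAX_TAPS][WIDTH];
    const int16_t *src[MAX_TAPS];
    int16_t filter[MAX_TAPS];
    uint8_t out_ref[WIDTH], out_new[WIDTH];
    uint32_t seed = 1;
    int mismatches = 0;

    /* Fill the input rows with non-negative 15-bit pseudo-random samples. */
    for (int j = 0; j < MAX_TAPS; j++) {
        src[j] = rows[j];
        for (int i = 0; i < WIDTH; i++) {
            seed = seed * 1664525u + 1013904223u;
            rows[j][i] = (int16_t)((seed >> 17) & 0x7FFF);
        }
    }

    for (unsigned s = 0; s < sizeof(sizes) / sizeof(sizes[0]); s++) {
        const int size = sizes[s];
        /* Equal taps summing to 4096, i.e. 12-bit coefficients. */
        for (int j = 0; j < size; j++)
            filter[j] = (int16_t)(4096 / size);

        vscale_ref(filter, size, src, out_ref, WIDTH, dither, 0);
        vscale_dispatch(filter, size, src, out_new, WIDTH, dither, 0);

        for (int i = 0; i < WIDTH; i++)
            mismatches += out_ref[i] != out_new[i];
    }
    return mismatches;
}
```

Calling check_vscale_sizes() should return 0 as long as the specialized path
agrees with the generic one. The point is just that size 2 has to appear in
the list of tested sizes, otherwise the new path is never exercised.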

// Martin


