[FFmpeg-devel] [PATCH v3 2/2] lavc/aarch64: h264, add chroma loop filters for 10bit

Martin Storsjö martin at martin.st
Sat Aug 21 00:11:12 EEST 2021


On Fri, 20 Aug 2021, Mikhail Nitenko wrote:

> Benchmarks:                                             A53     A72
> h264_h_loop_filter_chroma422_10bpp_c:                  282.7   114.2
> h264_h_loop_filter_chroma422_10bpp_neon:               109.5    78.5
> h264_h_loop_filter_chroma_10bpp_c:                     165.0    81.5
> h264_h_loop_filter_chroma_10bpp_neon:                  120.0    76.7
> h264_h_loop_filter_chroma_intra422_10bpp_c:            323.7   124.2
> h264_h_loop_filter_chroma_intra422_10bpp_neon:         155.0   102.7
> h264_h_loop_filter_chroma_intra_10bpp_c:               121.0    49.5
> h264_h_loop_filter_chroma_intra_10bpp_neon:             79.7    53.7
> h264_h_loop_filter_chroma_mbaff422_10bpp_c:            188.5    75.0
> h264_h_loop_filter_chroma_mbaff422_10bpp_neon:         120.0    75.5
> h264_h_loop_filter_chroma_mbaff_intra422_10bpp_c:      116.7    46.0
> h264_h_loop_filter_chroma_mbaff_intra422_10bpp_neon:    79.7    53.7
> h264_h_loop_filter_chroma_mbaff_intra_10bpp_c:          63.0    27.2
> h264_h_loop_filter_chroma_mbaff_intra_10bpp_neon:       48.5    34.0
> h264_v_loop_filter_chroma_10bpp_c:                     258.7   135.5
> h264_v_loop_filter_chroma_10bpp_neon:                   71.2    51.0
> h264_v_loop_filter_chroma_intra_10bpp_c:               158.0    70.7
> h264_v_loop_filter_chroma_intra_10bpp_neon:             48.7    31.5
>
> Signed-off-by: Mikhail Nitenko <mnitenko at gmail.com>
> ---
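
To make the table above easier to read, the C/NEON ratios can be computed with a small helper. This is only a sketch: the struct, the function names and the shortened row labels are made up for illustration, the numbers are copied from the quoted table, and it assumes that lower numbers are better, as is usual for such timing tables.

```c
#include <stdio.h>

/* One row of the quoted table: timings for the C and NEON versions. */
struct bench_row {
    const char *name;          /* shortened label, "h264_" and "_10bpp" dropped */
    double c_a53, neon_a53;
    double c_a72, neon_a72;
};

/* Values copied from the quoted benchmark table. */
static const struct bench_row rows[] = {
    { "h_loop_filter_chroma422",             282.7, 109.5, 114.2,  78.5 },
    { "h_loop_filter_chroma",                165.0, 120.0,  81.5,  76.7 },
    { "h_loop_filter_chroma_intra422",       323.7, 155.0, 124.2, 102.7 },
    { "h_loop_filter_chroma_intra",          121.0,  79.7,  49.5,  53.7 },
    { "h_loop_filter_chroma_mbaff422",       188.5, 120.0,  75.0,  75.5 },
    { "h_loop_filter_chroma_mbaff_intra422", 116.7,  79.7,  46.0,  53.7 },
    { "h_loop_filter_chroma_mbaff_intra",     63.0,  48.5,  27.2,  34.0 },
    { "v_loop_filter_chroma",                258.7,  71.2, 135.5,  51.0 },
    { "v_loop_filter_chroma_intra",          158.0,  48.7,  70.7,  31.5 },
};

/* Ratio of C time to NEON time; above 1.0 means the NEON version is faster. */
static double speedup(double c_time, double neon_time)
{
    return c_time / neon_time;
}

/* Print one line per function with the ratio for both cores. */
static void print_speedups(void)
{
    size_t n = sizeof(rows) / sizeof(rows[0]);
    for (size_t i = 0; i < n; i++)
        printf("%-38s A53 %.2fx  A72 %.2fx\n", rows[i].name,
               speedup(rows[i].c_a53, rows[i].neon_a53),
               speedup(rows[i].c_a72, rows[i].neon_a72));
}
```

Calling print_speedups() from a main() lists the ratios. By these numbers the vertical filters gain the most on both cores, while on the A72 four of the horizontal variants (intra, mbaff422, mbaff_intra422 and mbaff_intra) come out slightly below 1.0.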


> +        uabd            v30.8h, v2.8h,  v0.8h   // abs(q1 - q0)
> +        cmhi            v26.8h, v22.8h, v26.8h  // < alpha
> +        cmhi            v28.8h, v23.8h, v28.8h  // < beta
> +        cmhi            v30.8h, v23.8h, v30.8h  // < beta
> +
> +        and             v26.16b, v26.16b, v28.16b

The operand columns above don't line up with the ones below.

> +        mov             v4.16b, v0.16b
> +        sub             v4.8h,  v4.8h,  v16.8h
> +        and             v26.16b, v26.16b, v30.16b

Inconsistent operand alignment here too.
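
For readers following the quoted hunk, the cmhi/and sequence builds a per-pixel mask that is all ones where the edge should be filtered. A scalar sketch of that condition is below. It is not the code from the patch: the function name is hypothetical, and only the abs(q1 - q0) term is visible in the quoted lines. The other two terms are assumed from the usual H.264 deblocking condition.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Scalar sketch of the filter decision for one pixel position.
 * p1, p0 are the samples on one side of the edge, q0, q1 on the other.
 * Returns 0xFFFF (like an all-ones 16-bit lane from cmhi) when all
 * three comparisons hold, and 0 otherwise.
 * Assumption: the first two comparisons are computed before the quoted
 * excerpt; only "abs(q1 - q0) < beta" appears in it.
 */
static uint16_t filter_mask(int p1, int p0, int q0, int q1,
                            int alpha, int beta)
{
    int filter = abs(p0 - q0) < alpha &&
                 abs(p1 - p0) < beta  &&
                 abs(q1 - q0) < beta;
    return filter ? 0xFFFF : 0;
}
```

The vector code evaluates this for eight 16-bit lanes at once and combines the three comparison results with and.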

> +        shl             v4.8h,  v4.8h,  #2
> +        mov             x8, v26.d[0]
> +        mov             x9, v26.d[1]
> +        sli             v24.8H, v24.8H, #8
> +        uxtl            v24.8H, v24.8B

Inconsistent style: uppercase arrangement specifiers (.8H, .8B) here, lowercase elsewhere.

> +        add             v4.8h,  v4.8h,  v18.8h
> +        adds            x8,  x8,  x9
> +        shl             v24.8h, v24.8h,  #2
> +
> +        b.eq            9f
> +
> +        movi            v31.8h, #3              // (tc0 - 1) << (BIT_DEPTH - 8)) + 1
> +        uqsub           v24.8h, v24.8h, v31.8h
> +        sub             v4.8h , v4.8h,  v2.8h

Stray space before the comma
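
As a side note on the quoted comment, the shl by 2 followed by the saturating subtract of 3 matches the formula in it for 10-bit content. A small sketch of that arithmetic is below. The function names are hypothetical, BIT_DEPTH is fixed at 10 here, and only tc0 >= 0 is considered. How the assembly treats negative tc0 is not visible in the excerpt.

```c
#define BIT_DEPTH 10

/* Formula from the quoted comment, valid here for tc0 >= 1. */
static int tc_reference(int tc0)
{
    return ((tc0 - 1) << (BIT_DEPTH - 8)) + 1;
}

/*
 * What the shl/uqsub pair computes per lane: (tc0 << 2) minus 3,
 * clamped at 0 the way an unsigned saturating subtract clamps.
 */
static unsigned tc_saturating(unsigned tc0)
{
    unsigned widened = tc0 << (BIT_DEPTH - 8);
    return widened > 3 ? widened - 3 : 0;
}
```

For tc0 >= 1 both give 4 * tc0 - 3. For tc0 == 0 the saturating version yields 0 instead of a negative value.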

Content-wise I think the patch is fine, though, so I'll fix those nits and
push it.

// Martin


