[FFmpeg-devel] [PATCH 02/10] lavc/vp8dsp: R-V V put_bilin_h
Rémi Denis-Courmont
remi at remlab.net
Sat May 4 21:06:32 EEST 2024
On Saturday, 4 May 2024 at 21:02:25 EEST, Rémi Denis-Courmont wrote:
> On Saturday, 4 May 2024 at 17:48:31 EEST, uk7b at foxmail.com wrote:
> > From: sunyuechi <sunyuechi at iscas.ac.cn>
> >
> > C908:
> > vp8_put_bilin4_h_c: 373.5
> > vp8_put_bilin4_h_rvv_i32: 158.7
> > vp8_put_bilin8_h_c: 1437.7
> > vp8_put_bilin8_h_rvv_i32: 318.7
> > vp8_put_bilin16_h_c: 2845.7
> > vp8_put_bilin16_h_rvv_i32: 374.7
> > ---
> >
> > libavcodec/riscv/vp8dsp_init.c | 11 +++++++
> > libavcodec/riscv/vp8dsp_rvv.S | 54 ++++++++++++++++++++++++++++++++++
> > 2 files changed, 65 insertions(+)
> >
> > diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
> > index c364de3dc9..32cb4893a4 100644
> > --- a/libavcodec/riscv/vp8dsp_init.c
> > +++ b/libavcodec/riscv/vp8dsp_init.c
> > @@ -34,6 +34,10 @@ VP8_EPEL(16, rvv);
> >
> > VP8_EPEL(8, rvv);
> > VP8_EPEL(4, rvv);
> >
> > +VP8_BILIN(16, rvv);
> > +VP8_BILIN(8, rvv);
> > +VP8_BILIN(4, rvv);
> > +
> >
> > av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
> > {
> > #if HAVE_RVV
> >
> > @@ -47,6 +51,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
> >
> > c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvv;
> > c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvv;
> > c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvv;
> >
> > +
> > + c->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_rvv;
> > + c->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_rvv;
> > + c->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_rvv;
> > + c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv;
> > + c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv;
> > + c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv;
> > }
> >
> > #endif
> > }
> >
> > diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
> > index 063ab7110c..c8d265e516 100644
> > --- a/libavcodec/riscv/vp8dsp_rvv.S
> > +++ b/libavcodec/riscv/vp8dsp_rvv.S
> > @@ -98,3 +98,57 @@ func ff_put_vp8_pixels4_rvv, zve32x
> >
> > vsetivli zero, 4, e8, mf4, ta, ma
> > put_vp8_pixels
> >
> > endfunc
> >
> > +
> > +.macro bilin_h_load dst len
> > +.ifc \len,4
> > + vsetivli zero, 5, e8, mf2, ta, ma
> > +.elseif \len == 8
> > + vsetivli zero, 9, e8, m1, ta, ma
> > +.else
> > + vsetivli zero, 17, e8, m2, ta, ma
> > +.endif
>
> It might be worth defining a pseudo-instruction macro in asm.S that would
> statically compute the minimal LMUL from just the AVL and SEW. Then we don't
> have to repeat these .if blocks time and again; we can just do:
>
> vsetvlstatic \len + 1, e8
>
> or something like that.
On second thought, concealing the LMUL from the programmer is perhaps not the
smartest idea, since it heavily constrains register allocation.
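
For illustration only, here is a rough C sketch of the computation such a macro would have to do at assembly time, plus the register constraint that makes hiding it questionable. The function names, the LMUL_NONE sentinel and the parameter layout are hypothetical and not part of asm.S. The VLEN of 128 bits and ELEN of 32 used in the examples are assumptions, chosen because they reproduce the mf2/m1/m2 choices in the quoted patch.

```c
#define LMUL_NONE 99 /* hypothetical sentinel: no LMUL setting fits */

/*
 * Return log2(LMUL), from -3 (mf8) to 3 (m8), of the smallest LMUL whose
 * VLMAX = VLEN * LMUL / SEW covers the requested AVL.
 * All sizes are in bits, except avl which counts elements.
 */
static int min_lmul_log2(unsigned avl, unsigned sew, unsigned vlen,
                         unsigned elen)
{
    for (int shift = -3; shift <= 3; shift++) {
        unsigned vlmax;

        if (shift < 0) {
            unsigned div = 1u << -shift;

            /* A fractional LMUL is only guaranteed for SEW <= LMUL * ELEN. */
            if (sew * div > elen)
                continue;
            vlmax = vlen / sew / div;
        } else {
            vlmax = (vlen / sew) << shift;
        }
        if (vlmax >= avl)
            return shift;
    }
    return LMUL_NONE;
}

/*
 * The catch: with LMUL > 1, a vector operand must name a register whose
 * number is a multiple of LMUL. A macro that silently picks the LMUL also
 * silently decides which register numbers the surrounding code may use.
 */
static int vreg_group_ok(unsigned reg, int lmul_log2)
{
    if (reg >= 32)
        return 0;
    if (lmul_log2 <= 0)
        return 1;
    return reg % (1u << lmul_log2) == 0;
}
```

Under those assumptions, an AVL of 5, 9 and 17 at e8 gives mf2, m1 and m2 respectively, as in the patch. With m2, though, v3 is no longer a valid operand while v2 is, and nothing at the call site would show why.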
--
雷米‧德尼-库尔蒙
http://www.remlab.net/