[FFmpeg-devel] [PATCH 02/10] lavc/vp8dsp: R-V V put_bilin_h

Rémi Denis-Courmont remi at remlab.net
Sat May 4 21:06:32 EEST 2024


On Saturday, 4 May 2024, 21:02:25 EEST, Rémi Denis-Courmont wrote:
> On Saturday, 4 May 2024, 17:48:31 EEST, uk7b at foxmail.com wrote:
> > From: sunyuechi <sunyuechi at iscas.ac.cn>
> > 
> > C908:
> > vp8_put_bilin4_h_c: 373.5
> > vp8_put_bilin4_h_rvv_i32: 158.7
> > vp8_put_bilin8_h_c: 1437.7
> > vp8_put_bilin8_h_rvv_i32: 318.7
> > vp8_put_bilin16_h_c: 2845.7
> > vp8_put_bilin16_h_rvv_i32: 374.7
> > ---
> > 
> >  libavcodec/riscv/vp8dsp_init.c | 11 +++++++
> >  libavcodec/riscv/vp8dsp_rvv.S  | 54 ++++++++++++++++++++++++++++++++++
> >  2 files changed, 65 insertions(+)
> > 
> > diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
> > index c364de3dc9..32cb4893a4 100644
> > --- a/libavcodec/riscv/vp8dsp_init.c
> > +++ b/libavcodec/riscv/vp8dsp_init.c
> > @@ -34,6 +34,10 @@ VP8_EPEL(16, rvv);
> > 
> >  VP8_EPEL(8,  rvv);
> >  VP8_EPEL(4,  rvv);
> > 
> > +VP8_BILIN(16, rvv);
> > +VP8_BILIN(8,  rvv);
> > +VP8_BILIN(4,  rvv);
> > +
> > 
> >  av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
> >  {
> >  #if HAVE_RVV
> > 
> > @@ -47,6 +51,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
> > 
> >          c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvv;
> >          c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvv;
> >          c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvv;
> > 
> > +
> > +        c->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv;
> >      }
> > 
> >  #endif
> >  }
> > 
> > diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
> > index 063ab7110c..c8d265e516 100644
> > --- a/libavcodec/riscv/vp8dsp_rvv.S
> > +++ b/libavcodec/riscv/vp8dsp_rvv.S
> > @@ -98,3 +98,57 @@ func ff_put_vp8_pixels4_rvv, zve32x
> > 
> >          vsetivli      zero, 4, e8, mf4, ta, ma
> >          put_vp8_pixels
> >  
> >  endfunc
> > 
> > +
> > +.macro bilin_h_load dst len
> > +.ifc \len,4
> > +        vsetivli        zero, 5, e8, mf2, ta, ma
> > +.elseif \len == 8
> > +        vsetivli        zero, 9, e8, m1, ta, ma
> > +.else
> > +        vsetivli        zero, 17, e8, m2, ta, ma
> > +.endif
> 
> It might be worth defining a pseudo-instruction macro in asm.S that would
> statically compute the minimal LMUL from just the AVL and SEW. Then we don't
> have to repeat these if blocks time and again; we can just do:
> 
> vsetvlstatic \len + 1, e8
> 
> or something like that

On second thought, concealing the LMUL from the programmer is perhaps not the 
smartest idea, since it heavily constrains register allocation.
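
To make the trade-off concrete, here is a rough C model of the static
computation such a macro would have to perform. It is only a sketch, not part
of the patch: the names min_lmul8 and reg_group_size are hypothetical, and it
assumes VLEN >= 128 bits and ELEN = 32 (Zve32x). Under those assumptions it
reproduces the mf2/m1/m2 choices made by hand in bilin_h_load.

```c
#include <assert.h>

/* Assumptions for this sketch (not taken from the patch):
 * VLEN is at least 128 bits, ELEN is 32 (Zve32x), and sew <= ELEN. */
enum { MIN_VLEN = 128, ELEN = 32 };

/* Hypothetical helper: smallest LMUL, in eighths (1 = mf8, 8 = m1, 64 = m8),
 * whose VLMAX = VLEN * LMUL / SEW covers avl elements. Returns 0 if none. */
static int min_lmul8(int avl, int sew)
{
    for (int lmul8 = 1; lmul8 <= 64; lmul8 *= 2) {
        /* A fractional LMUL is only usable when SEW <= LMUL * ELEN. */
        if (sew * 8 > lmul8 * ELEN)
            continue;
        if (MIN_VLEN * lmul8 / (8 * sew) >= avl)
            return lmul8;
    }
    return 0;
}

/* Vector register numbers must be multiples of this value when LMUL > 1. */
static int reg_group_size(int lmul8)
{
    return lmul8 > 8 ? lmul8 / 8 : 1;
}

/* The three bilin_h_load cases from the patch: len + 1 bytes at e8. */
static void check_bilin_h_cases(void)
{
    assert(min_lmul8(4 + 1, 8) == 4);   /* mf2 */
    assert(min_lmul8(8 + 1, 8) == 8);   /* m1 */
    assert(min_lmul8(16 + 1, 8) == 16); /* m2 */
}
```

The computation itself is trivial. The catch is the second helper: once the
macro silently picks m2, every operand has to sit in an even-numbered register,
and that is not visible at the call site.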

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/




