[FFmpeg-devel] [PATCH v4] libavcodec/riscv:add RVV optimized idct_32x32_8 for HEVC
Rémi Denis-Courmont
remi at remlab.net
Sat May 24 19:18:46 EEST 2025
On Tuesday, 20 May 2025, at 10:58:06 Eastern European Summer Time,
daichengrong at iscas.ac.cn wrote:
> From: daichengrong <daichengrong at iscas.ac.cn>
>
> Since there are no comments for v2 and v3, we have continued to optimize
> according to the comments of v1. We spilled the slide to memory to help
> improve performance, and optimized the extraction of elements from vector
> registers.
You still seem to be flip-flopping values in X registers. You may need to go
easier on macros to get a better view of the actual generated code.
Also it seems that this uses half-vectors a lot. I am not sure if this can be
avoided, but typically that leads to very poor performance.
Also you're setting `vl` to the value it already has, which can hurt
performance depending on the implementation. If you don't need to change `vl`,
then use `zero` as the AVL operand.
Lastly, you seem to be changing vtype when it's not actually needed, e.g.:
vsetvli zero, 4, e16, mf2...
...
vsetvli zero, 4, e32, mf1...
vse32.v ...
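
To illustrate the last two points, here is a minimal sketch in C (not part of the patch), assuming the standard RVV 1.0 rules: VLMAX = VLEN * LMUL / SEW, and a load or store that encodes its own element width EEW uses EMUL = EEW / SEW * LMUL. The helper names are hypothetical. It shows that e16 with LMUL 1/2 and e32 with LMUL 1 share the same SEW/LMUL ratio, so `vl` would not change between them, and that a 32-bit store issued under e16 with LMUL 1/2 already operates on a whole register.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only. LMUL is passed as a
 * fraction (numerator/denominator) so that mf2 = 1/2 stays exact. */

/* SEW/LMUL ratio. VLMAX = VLEN / ratio, so two vtype settings with the
 * same ratio have the same VLMAX and can keep the current vl. */
static unsigned sew_lmul_ratio(unsigned sew, unsigned lmul_num,
                               unsigned lmul_den)
{
    return sew * lmul_den / lmul_num;
}

/* Effective LMUL of a memory access with element width eew, scaled by 8
 * so that fractional values stay integral (8 means EMUL = 1). */
static unsigned emul_times_8(unsigned eew, unsigned sew, unsigned lmul_num,
                             unsigned lmul_den)
{
    return 8 * eew * lmul_num / (sew * lmul_den);
}

/* Print the comparison for the two settings discussed in the review. */
static void print_comparison(void)
{
    printf("ratio e16/mf2: %u\n", sew_lmul_ratio(16, 1, 2));
    printf("ratio e32/m1:  %u\n", sew_lmul_ratio(32, 1, 1));
    printf("8*EMUL of a 32-bit store under e16/mf2: %u\n",
           emul_times_8(32, 16, 1, 2));
}
```

Under these assumptions both ratios come out equal, so a 32-bit store such as `vse32.v` could be issued without touching vtype first.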
--
德尼-库尔蒙‧雷米
Hagalund ny stad, f.d. Finska republik Nylands