[FFmpeg-devel] [PATCH v4] libavcodec/riscv:add RVV optimized idct_32x32_8 for HEVC

Rémi Denis-Courmont remi at remlab.net
Sat May 24 19:18:46 EEST 2025


On Tuesday, 20 May 2025, at 10:58:06 Eastern European Summer Time,
daichengrong at iscas.ac.cn wrote:
> From: daichengrong <daichengrong at iscas.ac.cn>
> 
> Since there are no comments for v2 and v3, we have continued to optimize
> according to the comments of v1. We spilled the slide to memory to help
> improve performance, and optimized the extraction of elements from vector
> registers.

You still seem to be flip-flopping values in X registers. You may need to go 
easier on macros to get a better view of the actual generated code.

Also it seems that this uses half-vectors a lot. I am not sure if this can be 
avoided, but typically that leads to very poor performance.

Also, you're resetting `vl` to its current value, which can hurt performance 
depending on the implementation. If you don't need to change `vl`, then use 
`zero` as the source operand as well (`vsetvli zero, zero, ...`), which 
leaves `vl` untouched.
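
A minimal sketch of the difference, with made-up registers and vtype settings 
(a0, t0 and the e16/m1 -> e32/m2 switch are placeholders, not taken from the 
patch). The `zero, zero` form is only valid when the SEW/LMUL ratio, and hence 
VLMAX, stays the same:

```asm
        // a0 = element count (hypothetical); the resulting vl is also written to t0
        vsetvli t0, a0, e16, m1, ta, ma
        // ... 16-bit processing ...

        // Pattern to avoid: feeds the current vl back in as the AVL
        // vsetvli zero, t0, e32, m2, ta, ma

        // Preferred: rd = rs1 = zero changes vtype only and keeps vl as it is.
        // Allowed here because 16/1 and 32/2 give the same SEW/LMUL ratio.
        vsetvli zero, zero, e32, m2, ta, ma
```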

Lastly, you seem to be changing vtype when it's not actually needed, e.g.:

vsetivli zero, 4, e16, mf2...
...
vsetivli zero, 4, e32, m1...
vse32.v ...
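
A sketch of how that sequence could look without the second vtype change, 
assuming the elided code only needs the store to write 4 x 32-bit elements and 
that VLEN is at least 128 bits (v8, a0 and the `ta, ma` flags are placeholders, 
not from the patch). Unit-stride stores carry their element width in the 
opcode, so under e16/mf2 the effective LMUL of `vse32.v` is 
EMUL = (EEW / SEW) * LMUL = (32 / 16) * 1/2 = 1:

```asm
        vsetivli zero, 4, e16, mf2, ta, ma
        // ... 16-bit processing on 4 elements ...

        // No vsetivli here: vse32.v uses EEW = 32 from its own encoding and
        // EMUL = 1, so it stores vl = 4 32-bit elements from v8 as-is.
        vse32.v v8, (a0)
```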




-- 
德尼-库尔蒙‧雷米
Hagalund ny stad, f.d. Finska republik Nylands




