[FFmpeg-devel] [PATCH v2] swscale/ppc: VSX-optimize yuv2rgb_full
Michael Niedermayer
michael at niedermayer.cc
Wed Mar 20 20:38:34 EET 2019
On Wed, Mar 20, 2019 at 04:06:45PM +0200, Lauri Kasanen wrote:
> ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
> -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
> -cpuflags 0 -v error -
>
> This uses 32-bit mul, so POWER8 only.
>
> The following output formats get about 4.5x speedup:
>
> rgb24
> 39980 UNITS in yuv2packed1, 32768 runs, 0 skips
> 8774 UNITS in yuv2packed1, 32768 runs, 0 skips
> bgr24
> 40069 UNITS in yuv2packed1, 32768 runs, 0 skips
> 8772 UNITS in yuv2packed1, 32766 runs, 2 skips
> rgba
> 39759 UNITS in yuv2packed1, 32768 runs, 0 skips
> 8681 UNITS in yuv2packed1, 32767 runs, 1 skips
> bgra
> 39729 UNITS in yuv2packed1, 32768 runs, 0 skips
> 8696 UNITS in yuv2packed1, 32766 runs, 2 skips
> argb
> 39766 UNITS in yuv2packed1, 32768 runs, 0 skips
> 8672 UNITS in yuv2packed1, 32766 runs, 2 skips
> abgr
> 39784 UNITS in yuv2packed1, 32768 runs, 0 skips
> 8659 UNITS in yuv2packed1, 32767 runs, 1 skips
>
> Signed-off-by: Lauri Kasanen <cand at gmx.com>
> ---
> libswscale/ppc/swscale_vsx.c | 291 ++++++++++++++++++++++++++++++++++++
> +++++++ 1 file changed, 291 insertions(+)
>
> v2: HAVE_POWER8 from ifdef to if
>
> diff --git a/libswscale/ppc/swscale_vsx.c b/libswscale/ppc/swscale_vsx.c
> index 01eb46c..062ab0d 100644
> --- a/libswscale/ppc/swscale_vsx.c
> +++ b/libswscale/ppc/swscale_vsx.c
> @@ -422,6 +422,248 @@ yuv2NBPSX(16, BE, 1, 16, int32_t)
> yuv2NBPSX(16, LE, 0, 16, int32_t)
> #endif
>
> +static av_always_inline void
> +yuv2rgb_full_1_vsx_template(SwsContext *c, const int16_t *buf0,
> + const int16_t *ubuf[2], const int16_t *vbuf[2],
> + const int16_t *abuf0, uint8_t *dest, int dstW,
> + int uvalpha, int y, enum AVPixelFormat target,
> + int hasAlpha)
> +{
> + const int16_t *ubuf0 = ubuf[0], *vbuf0 = vbuf[0];
> + const int16_t *ubuf1 = ubuf[1], *vbuf1 = vbuf[1];
> + vector int16_t vy, vu, vv, A = vec_splat_s16(0), tmp16;
> + vector int32_t vy32_l, vy32_r, vu32_l, vu32_r, vv32_l, vv32_r,
> tmp32, tmp32_2;
> + vector int32_t R_l, R_r, G_l, G_r, B_l, B_r;
error: corrupt patch at line 26
[...]
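
A note on the error above: judging only from the quoted text, one
plausible cause is mail-client line wrapping. The continuation
"tmp32, tmp32_2;" sits on its own line without a leading "+", while
every line inside a hunk is expected to start with a space, "+", "-"
or "\". The sketch below is a simplified check for that one property.
first_bad_hunk_line() is a hypothetical helper, not part of git or
FFmpeg, and it does not reproduce what "git apply" actually validates
(hunk line counts, for example, are ignored).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Scans the lines of a unified diff and returns the 1-based number of
 * the first line inside a hunk body that does not start with one of
 * ' ', '+', '-' or '\\'. Returns 0 if no such line is found.
 *
 * Assumptions: each entry of lines[] is one NUL-terminated line with
 * the newline already removed; hunks start at a line beginning with
 * "@@ " and end at the next "diff --git " line or at the end of input.
 */
size_t first_bad_hunk_line(const char *const *lines, size_t n)
{
    int in_hunk = 0;

    for (size_t i = 0; i < n; i++) {
        const char *s = lines[i];

        if (strncmp(s, "@@ ", 3) == 0) {          /* hunk header */
            in_hunk = 1;
            continue;
        }
        if (strncmp(s, "diff --git ", 11) == 0) { /* next file header */
            in_hunk = 0;
            continue;
        }
        if (!in_hunk)
            continue;

        /* An empty line is tolerated: some mailers strip the single
         * space that marks an empty context line. */
        if (s[0] == '\0')
            continue;

        if (s[0] != ' ' && s[0] != '+' && s[0] != '-' && s[0] != '\\')
            return i + 1;                          /* likely wrapped */
    }
    return 0;
}
```

Called on the lines of a saved patch, a nonzero return value is the
line to inspect first. Sending with git send-email, or attaching the
patch, usually avoids the wrapping altogether.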
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.