[FFmpeg-devel] [PATCH] swscale/aarch64/rgb2rgb: add deinterleaveBytes neon implementation

Michael Niedermayer michael at niedermayer.cc
Sun Sep 1 16:37:52 EEST 2024


On Sun, Sep 01, 2024 at 12:51:48PM +0200, Ramiro Polla wrote:
> On Sat, Aug 31, 2024 at 10:40 PM Michael Niedermayer
> <michael at niedermayer.cc> wrote:
> > On Fri, Aug 30, 2024 at 08:56:55PM +0200, Ramiro Polla wrote:
> > >                                       A55               A76
> > > deinterleave_bytes_c:             70342.0           34497.5
> > > deinterleave_bytes_neon:          21594.5 ( 3.26x)   5535.2 ( 6.23x)
> > > deinterleave_bytes_aligned_c:     71340.8           34651.2
> > > deinterleave_bytes_aligned_neon:   8616.8 ( 8.28x)   3996.2 ( 8.67x)
> > > ---
> > >  libswscale/aarch64/rgb2rgb.c      |  4 ++
> > >  libswscale/aarch64/rgb2rgb_neon.S | 59 +++++++++++++++++++++++
> > >  tests/checkasm/sw_rgb.c           | 77 +++++++++++++++++++++++++++++++
> > >  3 files changed, 140 insertions(+)
> >
> > this breaks fate on x86-64
> >
> > Test checkasm-sw_rgb failed. Look at tests/data/fate/checkasm-sw_rgb.err for details.
> 
> The sse2/avx implementations of deinterleaveBytes use LOOP_NVXX_TO_UV,
> which checks for alignment on src (and can read unaligned data) but
> expects dst to be aligned. Should the unaligned versions of these
> functions be modified to support writing to unaligned data?

well, honestly the only opinion i have is that the code shouldn't crash :)

thx
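
For illustration only, here is a minimal sketch of one way to keep an
"aligned stores" fast path from crashing on an unaligned dst: check the
destination pointers per row and fall back to a plain scalar loop when they
are not 16-byte aligned. This is not the code from the patch or from the
x86 LOOP_NVXX_TO_UV macro. All names (is_aligned16, deinterleave_row_scalar,
deinterleave_row_blocked, deinterleave_bytes_sketch) are hypothetical, the
16-byte requirement is an assumption, and the "fast" row is ordinary C that
only stands in for real SIMD code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p is 16-byte aligned (assumed requirement). */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Scalar row: works for any alignment of src, dst1 and dst2. */
static void deinterleave_row_scalar(const uint8_t *src, uint8_t *dst1,
                                    uint8_t *dst2, int width)
{
    for (int x = 0; x < width; x++) {
        dst1[x] = src[2 * x];
        dst2[x] = src[2 * x + 1];
    }
}

/* Stand-in for a SIMD row: 16 output bytes per step, scalar tail.
 * A real version would use aligned vector stores where memcpy is. */
static void deinterleave_row_blocked(const uint8_t *src, uint8_t *dst1,
                                     uint8_t *dst2, int width)
{
    int x = 0;
    for (; x + 16 <= width; x += 16) {
        uint8_t a[16], b[16];
        for (int i = 0; i < 16; i++) {
            a[i] = src[2 * (x + i)];
            b[i] = src[2 * (x + i) + 1];
        }
        memcpy(dst1 + x, a, 16);
        memcpy(dst2 + x, b, 16);
    }
    deinterleave_row_scalar(src + 2 * x, dst1 + x, dst2 + x, width - x);
}

/* Returns how many rows took the blocked path, so callers can observe
 * which path ran. Rows with an unaligned dst use the scalar loop. */
static int deinterleave_bytes_sketch(const uint8_t *src, uint8_t *dst1,
                                     uint8_t *dst2, int width, int height,
                                     int src_stride, int dst1_stride,
                                     int dst2_stride)
{
    int fast_rows = 0;
    for (int y = 0; y < height; y++) {
        if (is_aligned16(dst1) && is_aligned16(dst2)) {
            deinterleave_row_blocked(src, dst1, dst2, width);
            fast_rows++;
        } else {
            deinterleave_row_scalar(src, dst1, dst2, width);
        }
        src  += src_stride;
        dst1 += dst1_stride;
        dst2 += dst2_stride;
    }
    return fast_rows;
}
```

The check is done per row because an odd stride can change the alignment
from one row to the next. Whether the real fix belongs in the asm or in the
checkasm test is a separate question.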

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many things microsoft did are stupid, but not doing something just because
microsoft did it is even more stupid. If everything ms did were stupid they
would be bankrupt already.