[FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

Michael Niedermayer michael at niedermayer.cc
Sun Jan 19 15:10:49 EET 2020


On Sun, Jan 19, 2020 at 02:49:21AM +0000, Fu, Ting wrote:
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > Michael Niedermayer
> > Sent: Friday, January 17, 2020 05:36 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb: Change
> > inline assembly into nasm code
> > 
> > On Thu, Jan 16, 2020 at 07:27:05AM +0000, Fu, Ting wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > > > Michael Niedermayer
> > > > Sent: Wednesday, January 15, 2020 05:55 AM
> > > > To: FFmpeg development discussions and patches
> > > > <ffmpeg-devel at ffmpeg.org>
> > > > Subject: Re: [FFmpeg-devel] [PATCH V7 1/2] libswscale/x86/yuv2rgb:
> > > > Change inline assembly into nasm code
> > > >
> > > > On Fri, Jan 10, 2020 at 01:38:15AM +0800, Ting Fu wrote:
> > > > > Signed-off-by: Ting Fu <ting.fu at intel.com>
> > > > > ---
> > > > > V7:
> > > > >     Fix compile issue when user configure with --disable-mmx.
> > > > >     Fix issue when running ./ffmpeg with --cpuflags mmx/ssse3.
> > > > >     Adjust the SIMD verify logic in libswscale/x86/yuv2rgb.c
> > > > >
> > > > >  libswscale/x86/Makefile           |   1 +
> > > > >  libswscale/x86/swscale.c          |  16 +-
> > > > >  libswscale/x86/yuv2rgb.c          |  66 ++---
> > > > >  libswscale/x86/yuv2rgb_template.c | 467 ++++++------------------------
> > > > >  libswscale/x86/yuv_2_rgb.asm      | 270 +++++++++++++++++
> > > > >  5 files changed, 405 insertions(+), 415 deletions(-)  create mode
> > > > > 100644 libswscale/x86/yuv_2_rgb.asm
> > > >
> > > > The commit message seems a bit terse I think it should say if the
> > > > sequence of instructions is unchanged and if it was benchmaked. If
> > > > its the same speed, when the code is run the commit message should
> > > > say that too
> > > >
> > > > the principle of this (inline -> nasm) is fine of course.
> > > >
> > > >
> > > [...]
> > > > > -static inline int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t
> > *src[],
> > > > > -                                       int srcStride[],
> > > > > -                                       int srcSliceY, int srcSliceH,
> > > > > -                                       uint8_t *dst[], int dstStride[])
> > > > > +static int RENAME(yuv420_rgb16)(SwsContext *c, const uint8_t *src[],
> > > > > +                                               int srcStride[],
> > > > > +                                               int srcSliceY, int srcSliceH,
> > > > > +                                               uint8_t *dst[],
> > > > > +int
> > > > > +dstStride[])
> > > >
> > > > maybe the removial of inline should be a seperate patch also there
> > > > is the question why these wraper functions exist These do change
> > > > from a a "free thing in inline asm" to a call overhead with C->NASM
> > > >
> > > Hi Michael,
> > >
> > > The wrapper functions initiate some variables and contain one 'for cycle'. The
> > variable initiation needs to access to the 'c->dstW', furthermore macro
> > SWS_MAX_ FILTER_SIZE is needed. Which means extra work and much more
> > NASM code.
> > > If you still prefer to do all the things in assembly, I can change from 'C->NASM'
> > to 'call NASM function directly' in another further patch( for current patch easier
> > to review).
> > > Or in my opinion, the cost in C->NASM can be ignored, and the initiation work
> > looks clearer in C, just let it be what it is now.
> > > What do you think?
> > 
> > it probably makes no sense if its hard to convert that code
> 
> Hi Michael,
> 
> You mean I still need to convert that code, did I get you right?

i dont think its needed if its complex


> Since NASM function will get only the address of SwsConext c ( in order to be compatible with yuv2rgb_c function in parameters), not the address of c->redDither nor the c->dstW. I have no way to get the value of c->dstW by using address offset. 
> Do you have any suggestion for solving that problem? 

maybe the offset to redDither could be the 2nd field of the struct or the
whole redDither block could be moved up

but only do this if the resulting asm code is reasonable.
The goal is to eliminate some ugly wraper funcions which call the asm
if the code to avoid that is more ugly then it is not a good idea

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The day soldiers stop bringing you their problems is the day you have stopped 
leading them. They have either lost confidence that you can help or concluded 
you do not care. Either case is a failure of leadership. - Colin Powell
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20200119/bba2dae0/attachment.sig>


More information about the ffmpeg-devel mailing list