[FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb: Change inline assembly into nasm code

Fu, Ting ting.fu at intel.com
Mon Jan 6 09:32:49 EET 2020



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Michael Niedermayer
> Sent: Friday, January 3, 2020 04:36 PM
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb: Change
> inline assembly into nasm code
> 
> On Fri, Jan 03, 2020 at 06:59:28AM +0000, Fu, Ting wrote:
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > > Michael Niedermayer
> > > Sent: Friday, December 27, 2019 07:38 PM
> > > To: FFmpeg development discussions and patches
> > > <ffmpeg-devel at ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] [PATCH V4 1/2] libswscale/x86/yuv2rgb:
> > > Change inline assembly into nasm code
> > >
> > > On Thu, Dec 19, 2019 at 11:35:51AM +0800, Ting Fu wrote:
> > > > Tested using this command:
> > > > ./ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \
> > > > -vcodec rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null
> > > >
> > >
> > > > The fps increase from 151 to 389 on my local machine.
> > >
> > > That's nice, but why is there such a difference from changing the way
> > > the code is assembled?
> > > This should definitely be explained in more detail in the commit
> > > message.
> > >
> > Hi, Michael
> >
> > The fps increase compares the MMX code against the C code, not the inline
> > assembly against the NASM one. I will remove it from the commit message in
> > the next patch version.
> 
> Please test apples against apples; a benchmark of the inline vs NASM code
> certainly cannot hurt. Testing unoptimized vs new optimized code is not
> interesting. Testing old optimized vs new optimized code is interesting.
> 

Hi Michael,
In my test, the NASM code has the same performance as the inline assembly, which is 352 fps this time.
So I have simply removed the fps claim from the commit message.

> 
> >
> > >
> > > >
> > > > Signed-off-by: Ting Fu <ting.fu at intel.com>
> > > > ---
> > > >  libswscale/x86/Makefile           |   1 +
> > > >  libswscale/x86/swscale.c          |  16 +-
> > > >  libswscale/x86/yuv2rgb.c          |  81 +++---
> > > >  libswscale/x86/yuv2rgb_template.c | 441 ++++++------------------------
> > > >  libswscale/x86/yuv_2_rgb.asm      | 270 ++++++++++++++++++
> > > >  5 files changed, 395 insertions(+), 414 deletions(-)  create mode
[...]
> > >
> > > i would expect EXTERNAL_MMXEXT to imply EXTERNAL_MMX
> > >
> >
> > I was thinking of MMX-only processors. In that case, an MMX-only processor
> > would not be accelerated. Is that OK? Or does it mean I need not care much
> > about old MMX-only processors in the following patches?
> 
> no
> If MMXEXT implies MMX then MMX == (MMX || MMXEXT)
> 

Modified in new patch v5.
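
For reference, a minimal sketch of how the implication can be used when selecting a function. This is not the code from the patch: the flag names and helper functions below are hypothetical and only stand in for the real AV_CPU_FLAG_* bits, and it assumes the CPU detection reports MMX whenever it reports MMXEXT.

```c
/* Hypothetical flag bits for this sketch only; the real ones are the
 * AV_CPU_FLAG_* constants in libavutil/cpu.h. */
enum {
    SKETCH_FLAG_MMX    = 1 << 0,
    SKETCH_FLAG_MMXEXT = 1 << 1,
};

/* Assumption: MMXEXT implies MMX, so set the MMX bit whenever the
 * MMXEXT bit is present. */
static int sketch_normalize_flags(int flags)
{
    if (flags & SKETCH_FLAG_MMXEXT)
        flags |= SKETCH_FLAG_MMX;
    return flags;
}

/* Later checks override earlier ones, so the most capable version wins.
 * Because MMXEXT implies MMX, testing the MMX bit alone is equivalent to
 * testing (MMX || MMXEXT), and an MMX-only CPU still gets the MMX code. */
static const char *sketch_pick_yuv2rgb(int flags)
{
    const char *impl = "c";

    flags = sketch_normalize_flags(flags);
    if (flags & SKETCH_FLAG_MMX)
        impl = "mmx";
    if (flags & SKETCH_FLAG_MMXEXT)
        impl = "mmxext";
    return impl;
}
```

With this ordering, a plain MMX check never excludes an MMXEXT-capable CPU, and an MMX-only CPU is still accelerated.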

Thank you
Ting Fu

> [...]

