[FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding
maxime taisant
maximetaisant at hotmail.fr
Thu Aug 10 23:03:43 EEST 2017
> From: Clément Bœsch <u at pkh.me>
>
> On Tue, Aug 08, 2017 at 09:09:44AM +0000, maxime taisant wrote:
> > From: Maxime Taisant <maximetaisant at hotmail.fr>
> >
> > Hi,
> >
> > Here are some SSE optimisations for the dwt function used to decode
> > JPEG2000.
> > I tested this code by using the time command while reading a
> > JPEG2000 encoded video with ffmpeg and, on average, I observed a
> > 4.05% general improvement, and a 12.67% improvement on the dwt
> > decoding part alone.
> > In the nasm code, you can notice that the SR1DFLOAT macro appears
> > twice. One version is called in the nasm code by the HORSD macro
> > and the other is called in the C code of the dwt function; I couldn't
> > figure out a way to make only one macro.
> > I also couldn't figure out a good way to optimize the VER_SD part, so
> > I left it unchanged, with just an SSE-optimized version of
> > the SR_1D_FLOAT function.
> >
> > Regards.
> >
> > ---
> > libavcodec/jpeg2000dwt.c | 21 +-
> > libavcodec/jpeg2000dwt.h | 6 +
> > libavcodec/x86/jpeg2000dsp.asm | 794 ++++++++++++++++++++++++++++++++++++++
> > libavcodec/x86/jpeg2000dsp_init.c | 55 +++
> > 4 files changed, 863 insertions(+), 13 deletions(-)
> >
> > diff --git a/libavcodec/jpeg2000dwt.c b/libavcodec/jpeg2000dwt.c
> > index 55dd5e89b5..69c935980d 100644
> > --- a/libavcodec/jpeg2000dwt.c
> > +++ b/libavcodec/jpeg2000dwt.c
> > @@ -558,16 +558,19 @@ int ff_jpeg2000_dwt_init(DWTContext *s, int border[2][2],
> > }
> > switch (type) {
> > case FF_DWT97:
> > + dwt_decode = dwt_decode97_float;
> > s->f_linebuf = av_malloc_array((maxlen + 12), sizeof(*s->f_linebuf));
> > if (!s->f_linebuf)
> > return AVERROR(ENOMEM);
> > break;
> > case FF_DWT97_INT:
> > + dwt_decode = dwt_decode97_int;
> > s->i_linebuf = av_malloc_array((maxlen + 12), sizeof(*s->i_linebuf));
> > if (!s->i_linebuf)
> > return AVERROR(ENOMEM);
> > break;
> > case FF_DWT53:
> > + dwt_decode = dwt_decode53;
> > s->i_linebuf = av_malloc_array((maxlen + 6), sizeof(*s->i_linebuf));
> > if (!s->i_linebuf)
> > return AVERROR(ENOMEM);
>
> Using globals is not acceptable, you need to fix that.
>
Yeah, I can't even remember why I did that... I will fix it.
Thank you.
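
For reference, one way to drop the global would be to keep the selected
decode routine inside the context itself. The sketch below is only an
illustration under that assumption: DemoDWTContext, the demo_* functions
and the tag values they return are hypothetical stand-ins, not the real
DWTContext layout or the actual fix.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real FFmpeg definitions. */
enum DemoDWTType { DEMO_DWT97, DEMO_DWT97_INT, DEMO_DWT53 };

typedef struct DemoDWTContext {
    enum DemoDWTType type;
    /* Chosen once at init time and owned by this context, so two
     * contexts using different transforms cannot overwrite each other. */
    int (*decode)(struct DemoDWTContext *s, void *data);
} DemoDWTContext;

/* Placeholder decoders: each returns a tag so the dispatch is observable. */
static int demo_decode97_float(DemoDWTContext *s, void *data)
{
    (void)s; (void)data;
    return 97;
}

static int demo_decode97_int(DemoDWTContext *s, void *data)
{
    (void)s; (void)data;
    return 970;
}

static int demo_decode53(DemoDWTContext *s, void *data)
{
    (void)s; (void)data;
    return 53;
}

/* Mirrors the switch in the quoted hunk, but writes to the context
 * instead of a file-scope variable. Returns 0 on success, -1 otherwise. */
static int demo_dwt_init(DemoDWTContext *s, enum DemoDWTType type)
{
    s->type = type;
    switch (type) {
    case DEMO_DWT97:     s->decode = demo_decode97_float; break;
    case DEMO_DWT97_INT: s->decode = demo_decode97_int;   break;
    case DEMO_DWT53:     s->decode = demo_decode53;       break;
    default:             s->decode = NULL;                return -1;
    }
    return 0;
}

/* Dispatches through the per-context pointer; -1 if init never ran. */
static int demo_dwt_decode(DemoDWTContext *s, void *data)
{
    return s->decode ? s->decode(s, data) : -1;
}
```

With something like this, two contexts initialised with different
transform types each keep their own routine, which a single shared
variable cannot guarantee once several decoders are alive at once.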