[FFmpeg-devel] [PATCH] x86: vc1dsp: Convert vc1_inv_trans_*_dc to NASM format
Timothy Gu
timothygu99 at gmail.com
Mon Feb 1 00:27:30 CET 2016
On Sun, Jan 31, 2016 at 06:18:53PM -0300, James Almer wrote:
> On 1/31/2016 4:48 PM, Timothy Gu wrote:
> > ---
> > libavcodec/x86/vc1dsp.asm | 104 ++++++++++++++++++++++
> > libavcodec/x86/vc1dsp_init.c | 13 +++
> > libavcodec/x86/vc1dsp_mmx.c | 207 -------------------------------------------
> > 3 files changed, 117 insertions(+), 207 deletions(-)
> >
> > diff --git a/libavcodec/x86/vc1dsp.asm b/libavcodec/x86/vc1dsp.asm
> > index 6415a83..f922927 100644
> > --- a/libavcodec/x86/vc1dsp.asm
> > +++ b/libavcodec/x86/vc1dsp.asm
> > @@ -395,3 +395,107 @@ cglobal vc1_put_ver_16b_shift2, 4,7,0, dst, src, stride
> > jnz .loop
> > REP_RET
> > %endif ; HAVE_MMX_INLINE
> > +
> > +%macro INV_TRANS_INIT 0
> > + movsxdifnidn linesizeq, linesized
>
> Maybe change the prototype so linesize is ptrdiff_t?
I wanted to do that at first, but then I realized that to change this I'd need
to change simple_idct and a bunch of other decoders. I do want to come back to
this, but that seems like too much work for just four functions =P
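
For context, here is a rough C sketch of why the sign extension is needed while
linesize stays an int. The helper name row_ptr is made up for illustration and
is not part of the patch. With an int stride, C widens the value when it is
added to a pointer; in hand-written assembly that widening has to be done
explicitly on x86_64, which is what movsxdifnidn is there for. With a ptrdiff_t
prototype the argument would already arrive at full width.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: address of row `y` for an
 * int stride. The cast to ptrdiff_t is the C-level counterpart of
 * sign-extending linesized into linesizeq before using it in an address. */
static const uint8_t *row_ptr(const uint8_t *base, int linesize, int y)
{
    return base + (ptrdiff_t)linesize * y;
}
```

A negative stride (as used for bottom-up images) only addresses correctly if
the widening preserves the sign, e.g. row_ptr(buf + 48, -16, 3) should land
back on buf.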
[...]
> > +; ff_vc1_inv_trans_?x?_dc_mmxext(uint8_t *dest, int linesize, int16_t *block)
> > +INIT_MMX mmxext
> > +cglobal vc1_inv_trans_4x4_dc, 3,4,0, dest, linesize, block
> > + movsx r3d, WORD [blockq]
>
> Can this value be negative?
I'm not 100% certain but I believe it can be.
> Because you're using it as an argument
> for lea using native size after movsx sign extended the value to 32
> bits, which means that on x86_64 the upper bits of the register will
> be zeroed.
>
> If it can you'll have to use blockq/r3q everywhere, and if it can't
> then use movzx and shr.
Changed locally to blockq/r3. I was emulating GCC's code generation, but it
seems like there isn't much difference.
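
To spell out the concern, here is a small C model of the two loads. The
function names are invented for this illustration, and it only models the
register contents, not the actual code in the patch. It assumes the usual
x86_64 rule that writing a 32-bit register clears the upper 32 bits.

```c
#include <stdint.h>

/* Models `movsx r3d, WORD [blockq]` on x86_64: sign-extend 16 -> 32 bits.
 * Writing the 32-bit register clears bits 63..32, so the full 64-bit
 * register holds the zero-extended 32-bit pattern. */
static uint64_t load_movsx_r32(int16_t v)
{
    return (uint64_t)(uint32_t)(int32_t)v;
}

/* Models `movsx r3q, WORD [blockq]`: sign-extend 16 -> 64 bits, so the
 * full register holds the value a native-size lea expects. */
static uint64_t load_movsx_r64(int16_t v)
{
    return (uint64_t)(int64_t)v;
}
```

For a DC value of -8 the first gives 0x00000000FFFFFFF8 and the second
0xFFFFFFFFFFFFFFF8; for non-negative values the two agree, which is why the
problem only shows up if the coefficient can be negative.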
Timothy