[FFmpeg-devel] [PATCH] x86/vp9dsp: fix clobbering of xmm6 on IDCT sse2 functions
Ronald S. Bultje
rsbultje at gmail.com
Sun Feb 8 14:30:36 CET 2015
Hi,
On Sat, Feb 7, 2015 at 9:10 PM, James Almer <jamrial at gmail.com> wrote:
> On 07/02/15 11:08 PM, James Almer wrote:
> > On 07/02/15 11:05 PM, Ronald S. Bultje wrote:
> >> Hi,
> >>
> >> On Sat, Feb 7, 2015 at 8:33 PM, James Almer <jamrial at gmail.com> wrote:
> >>
> >>> Signed-off-by: James Almer <jamrial at gmail.com>
> >>> ---
> >>> libavcodec/x86/vp9itxfm.asm | 3 +++
> >>> 1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/libavcodec/x86/vp9itxfm.asm b/libavcodec/x86/vp9itxfm.asm
> >>> index 64859a0..bfe427f 100644
> >>> --- a/libavcodec/x86/vp9itxfm.asm
> >>> +++ b/libavcodec/x86/vp9itxfm.asm
> >>> @@ -407,6 +407,9 @@ IDCT_4x4_FN ssse3
> >>> %macro IADST4_FN 5
> >>> INIT_MMX %5
> >>> cglobal vp9_%1_%3_4x4_add, 3, 3, 6 + notcpuflag(ssse3), dst, stride, block, eob
> >>> +%if WIN64 && notcpuflag(ssse3)
> >>> +WIN64_SPILL_XMM 7
> >>> +%endif
> >>> movdqa xmm5, [pd_8192]
> >>> mova m0, [blockq+ 0]
> >>> mova m1, [blockq+ 8]
> >>
> >>
> >> Ehw... Well... Crap... OK I guess. (Can't think of anything better.)
> >>
> >> Ronald
> >
> > We could use INIT_XMM and invert every register alias (xmm -> m; m -> mm).
> > I just didn't go with that (admittedly cleaner and less hacky) solution
> > because it was a bigger patch.
>
> Actually, scratch that. The VP9_*_1D functions are used all over the place.
> Probably too messy to change.
>
Right, I did consider that for a brief moment and then immediately dropped
it :). I think the patch is fine; real-world code usually has small
tidbits of unprettiness in it, so I guess the code is slowly elevating to
real-world status.
Hurray?...
Ronald
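
For context, a minimal sketch in C of the Win64 rule behind this patch. The
Win64 calling convention treats xmm0-xmm5 as volatile and xmm6-xmm15 as
callee-saved. The sse2 variant declares 6 + notcpuflag(ssse3) = 7 XMM
registers, so xmm6 has to be preserved; presumably cglobal does not do that
on its own here because the function is set up with INIT_MMX, hence the
explicit WIN64_SPILL_XMM 7. The helper name win64_xmm_to_save is hypothetical
and only models the register count; it is not x86inc code.

```c
/* Win64 ABI: xmm0-xmm5 are volatile, xmm6-xmm15 must be preserved
 * by the callee. */
#define WIN64_FIRST_SAVED_XMM 6

/* Hypothetical helper (not part of x86inc): given that a function uses
 * xmm0 .. xmm(used - 1), return how many of those registers are
 * callee-saved on Win64 and therefore need to be spilled and restored. */
static int win64_xmm_to_save(int xmm_regs_used)
{
    if (xmm_regs_used <= WIN64_FIRST_SAVED_XMM)
        return 0;               /* only volatile registers touched */
    return xmm_regs_used - WIN64_FIRST_SAVED_XMM;
}
```

With 6 registers (the ssse3 case) the helper returns 0, so nothing needs
saving; with 7 (the sse2 case) it returns 1, which corresponds to xmm6.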