[FFmpeg-devel] [PATCH] x86/vp9dsp: fix clobbering of xmm6 on IDCT sse2 functions

James Almer jamrial at gmail.com
Sun Feb 8 03:10:37 CET 2015


On 07/02/15 11:08 PM, James Almer wrote:
> On 07/02/15 11:05 PM, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Sat, Feb 7, 2015 at 8:33 PM, James Almer <jamrial at gmail.com> wrote:
>>
>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>> ---
>>>  libavcodec/x86/vp9itxfm.asm | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/libavcodec/x86/vp9itxfm.asm b/libavcodec/x86/vp9itxfm.asm
>>> index 64859a0..bfe427f 100644
>>> --- a/libavcodec/x86/vp9itxfm.asm
>>> +++ b/libavcodec/x86/vp9itxfm.asm
>>> @@ -407,6 +407,9 @@ IDCT_4x4_FN ssse3
>>>  %macro IADST4_FN 5
>>>  INIT_MMX %5
>>>  cglobal vp9_%1_%3_4x4_add, 3, 3, 6 + notcpuflag(ssse3), dst, stride, block, eob
>>> +%if WIN64 && notcpuflag(ssse3)
>>> +WIN64_SPILL_XMM 7
>>> +%endif
>>>      movdqa            xmm5, [pd_8192]
>>>      mova                m0, [blockq+ 0]
>>>      mova                m1, [blockq+ 8]
>>
>>
>> Ehw... Well... Crap... OK I guess. (Can't think of anything better.)
>>
>> Ronald
> 
> We could use INIT_XMM and invert every register alias (xmm -> m; m -> mm).
> I didn't go with that solution (admittedly cleaner and less hacky) only
> because it would have been a bigger patch.

Actually, scratch that. The VP9_*_1D macros are used all over the place, so
changing their register aliases would probably be too messy.

