[FFmpeg-devel] Fix VP3 IDCT on Win64
Thu Aug 26 23:50:55 CEST 2010
On Thu, Aug 26, 2010 at 4:50 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Aug 26, 2010 at 08:44:06PM +0200, Reimar Döffinger wrote:
> > On Thu, Aug 26, 2010 at 11:05:30AM +0000, Loren Merritt wrote:
> > > On Thu, 26 Aug 2010, Reimar Döffinger wrote:
> > > >On Wed, Aug 25, 2010 at 08:43:25PM -0400, Ronald S. Bultje wrote:
> > > >>
> > > >>Those will stay inline of course. If an issue arises where we really
> > > >>need multiple (>6) XMM registers in inline functions (which I can
> > > >>honestly not imagine), then we'll think about a solution then and
> > > >>there.
> > > >
> > > >The solution is easy: only add the clobbers for compilers where they
> > > >are supported (I assume this was the issue on Win32/BSD? You never
> > > >said _what_ the problem was). This can be tested in configure.
> > > >And you'll have to specify the clobbers for inline functions even
> > > >for a single XMM register and even for Linux, it's just unreasonable
> > > >to hope that the compiler will never place some float stuff in a
> > > >bad location, particularly with global optimization enabled.
> > >
> > > Do you plan to add an emms at the end of every mmx function?
> > I think you should have no problem to come up with reasons why
> > this is not comparable.
> > But just in case
> > - "fixing" emms usage necessarily has a performance impact,
> > correct clobbers should not
> emms usage is correct and also in line with how others do it and how both
> Intel and AMD recommend it
> > - on most recent CPUs and on x86 in general, --disable-mmx
> > should "fix" the emms issue without too much of a performance
> > issue by just using SSE
> > - it is much less likely to be an issue since (most?) compilers
> > do not use MMX instructions and very rarely keep values stored
> > in the FPU (in contrast to SSE where they do both, even more so
> > on Win64).
> we don't mix float and mmx code without emms
I thought the only issue was mixing x87 and mmx
> and before you start, we don't support compilers that create float asm out of
> integer C and no one else (kernel comes to mind) will support such a compiler
XORPS can be a useful instruction
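
For illustration, a minimal sketch of the kind of clobber declaration discussed
above, assuming GCC-style inline asm and a target with SSE enabled. zero16 is a
made-up helper, not FFmpeg code; it clears 16 bytes with XORPS and tells the
compiler that xmm0 is overwritten, so a value the compiler keeps there is not
silently destroyed. The memset fallback is also an assumption, standing in for
compilers that reject the "xmm0" clobber name.

```c
#include <string.h>

/* Hypothetical helper (not from FFmpeg): zero 16 bytes at dst. */
static void zero16(void *dst)
{
#if defined(__GNUC__) && defined(__SSE__)
    __asm__ volatile(
        "xorps  %%xmm0, %%xmm0 \n\t"   /* xmm0 = 0 */
        "movups %%xmm0, (%0)   \n\t"   /* unaligned 16-byte store to dst */
        :                              /* no outputs */
        : "r"(dst)                     /* input: destination pointer */
        : "xmm0", "memory");           /* declare what the asm overwrites */
#else
    /* Fallback for compilers or targets without SSE inline asm support. */
    memset(dst, 0, 16);
#endif
}
```

Dropping "xmm0" from the clobber list would still assemble, but the compiler
would then be free to assume xmm0 survives the asm statement.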