[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs

Michael Niedermayer michaelni
Wed May 30 23:29:09 CEST 2007


Hi

On Wed, May 30, 2007 at 02:07:19PM +0200, Guillaume POIRIER wrote:
> Hi,
> 
> On 5/30/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > 2007/5/30, Guillaume POIRIER <poirierg at gmail.com>:
> > > On 5/30/07, Trent Piepho <xyzzy at speakeasy.org> wrote:
> > > > > On Wed, 30 May 2007, Guillaume POIRIER wrote:
> > > > > On 5/29/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > > > > > These warnings come from the assembler, not the compiler, about cases
> > > > > > like 16+(%esi). The FSF as treats this as equivalent to 16+0(%esi) ==
> > > > > > 16(%esi) (hence the assumed 0). If the Apple as treats it
> > > > > > differently without even a warning, then the result is catastrophic...
> > > > > >
> > > > > Linux:
> > > > >  1bd:   0f 28 02                movaps (%edx),%xmm0
> > > > >  1c0:   0f 28 19                movaps (%ecx),%xmm3
> > > > >  1c3:   0f 28 62 f0             movaps 0xfffffff0(%edx),%xmm4
> > > > >  1c7:   0f 28 79 10             movaps 0x10(%ecx),%xmm7
> > > > >
> > > > > 000001d7        movaps  (%ebx),%xmm0
> > > > > 000001da        movaps  (%edi),%xmm3
> > > > > 000001dd        movaps  0x00(%ebx),%xmm4
> > > > > 000001e1        movaps  0x00(%edi),%xmm7
> > > > >
> > > > > > As you can clearly see, that damn OSX manages to lose the offset.
> > > > > Zuxy, do you know another syntax than the one you suggested, that
> > > > > wouldn't confuse OSX's assembler?
> > > >
> > > > Doesn't my patch fix this?  That would be the alternate syntax that doesn't
> > > > confuse the assembler.
> > >
> > > Yep, your fixed patch does fix the problem (I said that earlier BTW ;-) ).
> > > Now that we know where the problem comes from, I was just wondering if
> > > there wasn't a simpler, less-invasive way. (not that your patch is
> > > unbearably longer, but based on the analysis I made of the
> > > disassembled code, it leads to more code, so I'd expect your patch to
> > > > be slower; that, of course, would have to be benchmarked).
> >
> > No it won't. Trent's patch is the correct and optimal way, giving gcc
> > more freedom in allocating general registers. I should have done this
> > in my original code but I was a bit too lazy and was concerned if too
> > many constraints would break gcc 2.95, while the fact is Trent's patch
> > compiles with gcc 2.95. So there isn't any doubt in the patch itself.
> 
> Ok, fine with me. Michael, do you think that the patch I posted
> earlier (100% based on Trent's, only fixing minor issues) should be
> applied?

well, after actually reading the code ... the loops should be written
in asm, not with for() / while(). this will make the code faster,
and it will make the n+%m code naturally disappear
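
For illustration only (this is not the patch under discussion): a minimal
sketch of a loop written entirely in asm, assuming GCC-style inline asm on
x86 with SSE. The function name vector_fmul_sketch is hypothetical, and
movups is used instead of movaps so the sketch does not assume 16-byte
alignment. The pointers are passed as register operands and the displacement
is written out as 16(%1,%0), so no "n+%m" text is ever handed to the
assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg's actual code:
 * dst[i] *= src[i] for len floats, len a positive multiple of 8. */
static void vector_fmul_sketch(float *dst, const float *src, int len)
{
    assert(len > 0 && len % 8 == 0);
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__i386__) || defined(__x86_64__))
    /* Negative byte offset that counts up to zero; both pointers are
     * pre-advanced to the end of their buffers. */
    intptr_t i = -(intptr_t)len * (intptr_t)sizeof(float);
    float *dst_end = dst + len;
    const float *src_end = src + len;

    __asm__ volatile(
        "1:                        \n\t"
        "movups   (%1,%0), %%xmm0  \n\t" /* dst[i..i+3]   */
        "movups 16(%1,%0), %%xmm1  \n\t" /* dst[i+4..i+7] */
        "movups   (%2,%0), %%xmm2  \n\t" /* src[i..i+3]   */
        "movups 16(%2,%0), %%xmm3  \n\t" /* src[i+4..i+7] */
        "mulps  %%xmm2, %%xmm0     \n\t"
        "mulps  %%xmm3, %%xmm1     \n\t"
        "movups %%xmm0,   (%1,%0)  \n\t"
        "movups %%xmm1, 16(%1,%0)  \n\t"
        "add    $32, %0            \n\t" /* 8 floats per iteration */
        "jl     1b                 \n\t" /* loop while offset < 0  */
        : "+r"(i)
        : "r"(dst_end), "r"(src_end)
        : "memory", "cc", "xmm0", "xmm1", "xmm2", "xmm3"
    );
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
#endif
}
```

Because the base and index are plain registers, the assembler only ever sees
an ordinary disp(base,index) operand. Whether this is actually faster than
the C loop would still have to be benchmarked.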

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The educated differ from the uneducated as much as the living from the
dead. -- Aristotle 