[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs
Thu May 31 13:22:45 CEST 2007
On Thu, May 31, 2007 at 01:05:04PM +0200, Michael Niedermayer wrote:
> > >> But IMHO, it's a bit pointless, because
> > >> whatever the speed figures may look like, we are comparing 1 solution
> > >> that appears to work by luck, and another that is more reliable. Speed
> > >> isn't what your patch is after.
> > >
> > > There is no luck in the old solution providing :
> > > - we tell gcc the memory we modify (may be using "memory" clobber).
> > > - we use a gas supporting the +(%reg) syntax
> > I disagree. Newer gas _do_ complain about the syntax, and Trent
> > already explained the shortcomings of current implementation, no need
> > for me to restate them here.
> put a "memory" on the clobber list and trents argument is gone
Let me elaborate on this a little more.
The "memory" clobber is needed because SSE/MMX writes to more than just the
first float/FFTSample (yeah, that's pretty much the purpose of MMX/SSE).
gcc is not aware of this, so EVERY solution which uses
"+m" / "=m" to write needs a "memory" clobber.
Now, Trent's argument is based solely on the situation where there is no
"memory" clobber ...
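
To illustrate the "memory" clobber, here is a minimal sketch. It is not the
FFmpeg code: scale4() is a hypothetical helper, and it assumes GCC-style
extended asm on x86 with SSE enabled (plain C is used otherwise). The asm
reads and writes four floats through a pointer, and the operand list alone
does not tell gcc which memory is touched; the "memory" clobber does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiply v[0..3] by k in place. */
static void scale4(float *v, float k)
{
    assert(v != NULL);
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile(
        "movss  %1, %%xmm1          \n\t" /* xmm1 = { k, 0, 0, 0 }        */
        "shufps $0, %%xmm1, %%xmm1  \n\t" /* broadcast k to all 4 lanes   */
        "movups (%0), %%xmm0        \n\t" /* load v[0..3]                 */
        "mulps  %%xmm1, %%xmm0      \n\t"
        "movups %%xmm0, (%0)        \n\t" /* store 16 bytes, not just 4   */
        : /* no outputs: gcc only sees a pointer value going in */
        : "r"(v), "m"(k)
        : "xmm0", "xmm1", "memory"        /* tell gcc that memory changed */
    );
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    for (int i = 0; i < 4; i++)
        v[i] *= k;
#endif
}
```

Without "memory" in the clobber list, gcc would be free to keep v[1..3] cached
in registers across the asm, or to reorder loads and stores around it.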
To summarize the possible solutions:
1. Don't support ancient assemblers.
2. Use the 123%4 notation (the most incorrect syntax possible; it will silently
   generate wrong code on all assemblers if you are unlucky).
3. Add more "m" operands to avoid the offsets (might be slower, and might
   fail on some gcc versions).
4. Write the whole loop in asm.
Note: ALL solutions need a "memory" clobber (or some other nasty tricks).
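
A rough sketch of what solution 3 could look like, under the same assumptions
as any GCC extended asm on x86 with SSE (add8() is a hypothetical helper, not
taken from the FFmpeg sources). Each 16-byte block gets its own "m" operand,
so no offset has to be glued onto an operand in the template, and the "memory"
clobber is still present because each operand names only one float.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dst[i] += src[i] for i in 0..7. */
static void add8(float *dst, const float *src)
{
    assert(dst != NULL && src != NULL);
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile(
        "movups %2, %%xmm0      \n\t" /* src[0..3] */
        "movups %3, %%xmm1      \n\t" /* src[4..7] */
        "movups %0, %%xmm2      \n\t" /* dst[0..3] */
        "movups %1, %%xmm3      \n\t" /* dst[4..7] */
        "addps  %%xmm2, %%xmm0  \n\t"
        "addps  %%xmm3, %%xmm1  \n\t"
        "movups %%xmm0, %0      \n\t"
        "movups %%xmm1, %1      \n\t"
        : "+m"(dst[0]), "+m"(dst[4])  /* one operand per block, no offsets */
        : "m"(src[0]), "m"(src[4])
        : "xmm0", "xmm1", "xmm2", "xmm3", "memory"
    );
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    for (int i = 0; i < 8; i++)
        dst[i] += src[i];
#endif
}
```

The extra operands cost address registers or addressing modes, which is where
the "might be slower" caveat comes from.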
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have often repented speaking, but never of holding my tongue.