[FFmpeg-devel] Patch: Inline asm fixes for Intel compiler on Windows
Reimar.Doeffinger at gmx.de
Mon Dec 30 08:44:06 CET 2013
On 30.12.2013, at 03:06, Matt Oliver <protogonoi at gmail.com> wrote:
> On 30 December 2013 02:26, Michael Niedermayer <michaelni at gmx.at> wrote:
>> On Sun, Dec 29, 2013 at 02:19:40PM +0100, Reimar Döffinger wrote:
>>> On Sun, Dec 29, 2013 at 02:48:52PM +1100, Matt Oliver wrote:
>>>> is too much to ask. The only other option I can think of is to have
>>>> different versions. i.e. for those few lines with direct symbol
>>>> references have an #ifdef that only changes the code for Intel compiler on
>>>> Windows. Windows shared libraries don't need PIC so this shouldn't be a problem.
>>>> Although the code may become a bit ugly.
>>> One thing I thought of: You could try using named asm constraints, and
>>> only change the MANGLE to convert to a reference to a named asm constraint.
>>> If you are lucky in quite a few cases you'd then only need an ifdef
>>> around the extra asm constraints.
>> that should be possible without ifdef, see XMM_CLOBBERS_ONLY() for
>> something similar
>> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>> You can kill me, but you cannot change the truth.
> Using named asm constraints is a good idea as it seems that Intel compiler
> supports those. So I will update the patch so that MANGLE remains in use
> but it will just change to a named asm constraint on Intel+Windows. Combine
> that with a macro such as XMM_CLOBBERS_ONLY in order to add the necessary
> asm interfaces and that should work fine.
> One thing I did notice though is that if the use of "m" constraints
> with PIC is an issue then there are several locations in existing code that
> don't seem to follow that. For instance in libavcodec/x86/cavsdsp.c the
> macro QPEL_CAVSV1 uses an asm interface for MUL1 (which is variable ff_pw_96
> and is assigned to %5). However MUL2 (which is variable ff_pw_42) is used
> as a direct symbol through MANGLE. Since both these variables are defined
> in the same place and are of the same type (uint64_t) then it appears there
> are some inconsistencies in existing FFmpeg code. I'll leave all existing
> instances of MANGLE but I thought I'd just point this out.
I think you missed the reason it is an issue (though the difference between MUL1 and MUL2 is strange).
If the asm block uses only a few registers, it's just a matter of balancing proper PIC code without textrels (which is good, and is what you get if you use "m") against the risk of the compiler messing up and generating slow code, plus generally slower code if PIC is enabled.
However if your code already uses 6 registers, the "compiler messing up" suddenly means "doesn't even compile", which is a much bigger issue than the performance cost of a single pointless register load (which might in theory even be marginally faster if the address is used often).
Note that I suspect the CAVS code is probably not that well maintained, so I don't know if what you noticed is intentional or bad code.
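
To make the named-constraint idea discussed above concrete, here is a minimal sketch. It is not FFmpeg's actual MANGLE; DEMO_MANGLE, DEMO_NAMED_MEM and demo_pw_42 are hypothetical names invented for illustration, and the asm path assumes a GCC-compatible compiler targeting x86-64. Whether the Intel compiler on Windows accepts this exact form is an assumption that would need testing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a constant such as ff_pw_42. */
static const uint64_t demo_pw_42 = 0x002A002A002A002AULL;

/* Only use inline asm where GCC-style syntax and x86-64 are available. */
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
#define DEMO_HAVE_INLINE_ASM 1
#else
#define DEMO_HAVE_INLINE_ASM 0
#endif

/* Hypothetical MANGLE replacement: instead of pasting the symbol name
 * into the asm string, emit a reference to a named operand. */
#define DEMO_MANGLE(sym) "%[" #sym "]"

/* Hypothetical companion macro that supplies the matching named "m"
 * constraint, so call sites need no #ifdef of their own. */
#define DEMO_NAMED_MEM(sym) [sym] "m" (sym)

/* Load the constant through the named memory operand. */
static uint64_t demo_load_pw_42(void)
{
#if DEMO_HAVE_INLINE_ASM
    uint64_t out;
    __asm__ ("movq " DEMO_MANGLE(demo_pw_42) ", %[dst]"
             : [dst] "=r" (out)
             : DEMO_NAMED_MEM(demo_pw_42));
    return out;
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    return demo_pw_42;
#endif
}

/* Small self-check: the asm path and the plain C read must agree. */
static void demo_check(void)
{
    assert(demo_load_pw_42() == demo_pw_42);
}
```

Because the operand goes through "m", the compiler chooses the addressing mode itself, which is what gives PIC-correct code; the cost is that it may need a register for the address, which is exactly the problem when the asm block already uses nearly all of them.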