[FFmpeg-devel] Patch: Inline asm fixes for Intel compiler on Windows

Michael Niedermayer michaelni at gmx.at
Sat Apr 5 13:55:33 CEST 2014


On Sat, Apr 05, 2014 at 02:17:41PM +1100, Matt Oliver wrote:
> Here's an additional patch that modifies some of the inline asm to make it
> work under icl.
> 
> One previous issue was a lea instruction that would not compile under 64b
> icl. It seems icl has some sort of conformance check that fails on a
> 32-bit lea operation on what it must assume is an address (which is why it
> only fails in 64b, due to the mismatch in forcing a 32b value).
> 
> On a second look, the lea is only being used to perform an add operation
> without modifying the flags register. However, that add merely recreates a
> value that already existed in a register (the code subtracts a value and
> then uses the lea to re-add it). So the lea can be removed and the original
> value used instead. This requires an additional register to hold the old
> value for a couple of instructions, but based on where the code is used
> that doesn't appear to be much of an issue (the removal of an add operation
> outweighs the extra register cost).
> 
> The function itself benches as being ~5.3% faster when tested over 100
> million iterations on random data. To test the effect of the changed
> register usage I tested it within vp9's decode_coeffs_b_generic and
> vp5's vp5_parse_vector_adjustment. The performance gains were obviously
> smaller due to the rest of the function overhead (0.5% and 3.9%
> respectively), but there was no performance degradation, and this new
> function compiles under 64b icl.
> 
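
For readers following along, the idea described above can be sketched in portable C. This is only an illustration of "subtract then re-add" versus "keep the original value"; it is not the inline asm from the patch. The names ToyCoder, toy_low, get_bit_readd, get_bit_keep and check_equivalence are hypothetical, and the 16-bit shift used to build low_shift is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the range coder state; not the FFmpeg struct. */
typedef struct {
    uint32_t high;
    uint32_t code_word;
} ToyCoder;

/* Split point, using the expression quoted in the patch. */
static uint32_t toy_low(const ToyCoder *c, uint8_t prob)
{
    return 1 + (((c->high - 1) * prob) >> 8);
}

/* Variant A: subtract, then add the same amount back to recover the
 * original code word (the role the lea played). */
static int get_bit_readd(ToyCoder *c, uint8_t prob)
{
    uint32_t low       = toy_low(c, prob);
    uint32_t low_shift = low << 16;               /* assumed shift width */
    int      bit       = c->code_word >= low_shift;
    uint32_t diff      = c->code_word - low_shift; /* sub */
    uint32_t restored  = diff + low_shift;         /* re-add */

    c->high      = bit ? c->high - low : low;
    c->code_word = bit ? diff : restored;
    return bit;
}

/* Variant B: keep the original in its own variable (one extra register)
 * and drop the re-add. */
static int get_bit_keep(ToyCoder *c, uint8_t prob)
{
    uint32_t low       = toy_low(c, prob);
    uint32_t low_shift = low << 16;               /* assumed shift width */
    uint32_t original  = c->code_word;
    int      bit       = original >= low_shift;

    c->high      = bit ? c->high - low : low;
    c->code_word = bit ? original - low_shift : original;
    return bit;
}

/* Both variants must leave the coder in the same state. */
static void check_equivalence(void)
{
    static const uint32_t words[] = { 0u, 0x00010000u, 0x00900000u, 0xffffffffu };
    static const uint8_t  probs[] = { 1, 128, 255 };
    unsigned i, j;

    for (i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
        for (j = 0; j < sizeof(probs) / sizeof(probs[0]); j++) {
            ToyCoder a = { 255, words[i] };
            ToyCoder b = a;
            int bit_a = get_bit_readd(&a, probs[j]);
            int bit_b = get_bit_keep(&b, probs[j]);

            assert(bit_a == bit_b);
            assert(a.high == b.high);
            assert(a.code_word == b.code_word);
        }
    }
}
```

Because unsigned arithmetic wraps, (x - y) + y always equals x, so the two variants agree on every input; the only difference is one add versus one extra live value.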

> The only remaining inline asm still in master that does not compile under
> icl is BRANCHLESS_GET_CABAC in x86/cabac.h. So if no one has any objections
> I was going to move that to external asm.

I don't think this will work out benchmark/speed-wise.



>  vp56_arith.h |   17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 7e6ff58f6fdb0924a46b7a875467b8f66163685c  Remove-leal-op-to-fix-icl-inline-asm.patch
> From 5f53a57bc8fedcd221164c95846471fc033095df Mon Sep 17 00:00:00 2001
> From: Matt Oliver <protogonoi at gmail.com>
> Date: Sat, 5 Apr 2014 14:00:13 +1100
> Subject: [PATCH] Remove leal op to fix icl inline asm.
> 
> ---
>  libavcodec/x86/vp56_arith.h | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/libavcodec/x86/vp56_arith.h b/libavcodec/x86/vp56_arith.h
> index e71dbf8..570885a 100644
> --- a/libavcodec/x86/vp56_arith.h
> +++ b/libavcodec/x86/vp56_arith.h
> @@ -28,25 +28,22 @@
>  #define vp56_rac_get_prob vp56_rac_get_prob
>  static av_always_inline int vp56_rac_get_prob(VP56RangeCoder *c, uint8_t prob)
>  {
> -    unsigned int code_word = vp56_rac_renorm(c);
> -    unsigned int high = c->high;
> -    unsigned int low = 1 + (((high - 1) * prob) >> 8);
> +    c->code_word = vp56_rac_renorm( c );
> +    unsigned int low = 1 + (((c->high - 1) * prob) >> 8);

Declaration after statement: "low" is now declared after the assignment to
c->code_word.
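
A minimal sketch of the usual way to avoid this, assuming the intent is to keep all declarations at the top of the block (compilers can flag the mixed form with -Wdeclaration-after-statement). DemoCoder, demo_renorm and demo_low are hypothetical names, and demo_renorm is only a placeholder that performs no real renormalization.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the coder state; not the FFmpeg struct. */
typedef struct {
    uint32_t high;
    uint32_t code_word;
} DemoCoder;

/* Placeholder: returns the code word unchanged. */
static uint32_t demo_renorm(DemoCoder *c)
{
    return c->code_word;
}

static uint32_t demo_low(DemoCoder *c, uint8_t prob)
{
    /* Declarations come first ... */
    uint32_t low;

    /* ... and statements follow. */
    assert(c);
    c->code_word = demo_renorm(c);
    low = 1 + (((c->high - 1) * prob) >> 8);
    return low;
}
```

Writing "unsigned int low = ...;" after the assignment instead would be the mixed form that the comment above objects to.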

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides