[FFmpeg-cvslog] r25254 - trunk/libavcodec/x86/h264dsp_mmx.c
Michael Niedermayer
michaelni
Wed Sep 29 22:05:59 CEST 2010
On Wed, Sep 29, 2010 at 02:10:23PM -0400, Ronald S. Bultje wrote:
> Hi,
>
> On Wed, Sep 29, 2010 at 1:41 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> > On Wed, Sep 29, 2010 at 1:11 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> On Wed, Sep 29, 2010 at 12:19:45PM -0400, Ronald S. Bultje wrote:
> >>> On Wed, Sep 29, 2010 at 12:03 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >>> > On Wed, Sep 29, 2010 at 04:02:32PM +0200, rbultje wrote:
> >>> >> Author: rbultje
> >>> >> Date: Wed Sep 29 16:02:32 2010
> >>> >> New Revision: 25254
> >>> >>
> >>> >> Log:
> >>> >> Remove d_idx as a variable, and instead load it as a constant in the asm.
> >>> >> This has no measurable speed effect because the surrounding code doesn't
> >>> >> take advantage of this yet.
> >>> > [...]
> >>> >> @@ -125,34 +124,41 @@ static av_always_inline void h264_loop_f
> >>> >>                          "por           %%mm1, %%mm0 \n"
> >>> >>                          "pshufw $0x4E, %%mm0, %%mm1 \n"
> >>> >>                          "pminub        %%mm1, %%mm0 \n"
> >>> >> -                        ::"r"(d_idx),
> >>> >> -                          "r"(ref[0]+b_idx),
> >>> >> -                          "r"(mv[0]+b_idx)
> >>> >> +                        ::"r"(ref[0]+b_idx),
> >>> >> +                          "r"(mv[0]+b_idx),
> >>> >> +                          "i"(d_idx),
> >>> >> +                          "i"(d_idx+40),
> >>> >> +                          "i"(d_idx*4),
> >>> >> +                          "i"(d_idx*4+8),
> >>> >> +                          "i"(d_idx*4+160),
> >>> >> +                          "i"(d_idx*4+168)
> >>> >
> >>> > It appears that some gccs have difficulty with constant propagation, so I
> >>> > suspect that this has to be changed to a macro instead of an always_inline
> >>> > function
> >>>
> >>> Grmbl, starting to think yasm is easier after all... Patch attached.
> >>
> >> I am starting to think that too in this case ...
> >> either way patch ok
> >
> > Let's leave it as-is now, it was 3 cycles faster after all... If more
> > stuff breaks or becomes strange, we can change it whenever we want...
> > I was positively surprised that clang/icc worked with this code. :-).
>
> Hmm, sunCC is still unhappy...
>
> http://fate.ffmpeg.org/x86_32-linux-suncc-5.11/20100929174548/compile
"h264dsp_mm.tmp.vpn.30352.i", [h264_loop_filter_strength_mmx2]:ube: error: Cannot allocate register for argument '%7' in GASM Inlining
cc: ube failed for /home/mik/.ccache/tmp/h264dsp_mm.tmp.vpn.30352.i
make: *** [libavcodec/x86/h264dsp_mmx.o] Error 2
There is no %7 in there, just a %a7, and that is a constant, not a register.
So my bet here is on a compiler bug, which should be reported if anyone cares.
As a workaround, the constants can be inserted by the preprocessor.
Also note that the assembler is able to do basic arithmetic;
you can take a look at simple_idct.c for examples of this.
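A minimal sketch of that workaround, assuming GCC-style inline asm on x86
(AT&T syntax). The names D_IDX, STR and load_at_offset are hypothetical and
are not taken from h264dsp_mmx.c or simple_idct.c; the plain C fallback only
exists so the sketch also builds on other targets.

```c
#include <stdint.h>

/* Two-level stringification: D_IDX is expanded first, then quoted. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Hypothetical compile-time constant standing in for d_idx. */
#define D_IDX 8

/* Loads the 32-bit value stored D_IDX*4+8 bytes past base. */
static int32_t load_at_offset(const int32_t *base)
{
    int32_t out;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        /* The preprocessor pastes "8" into the template, and the
         * assembler evaluates 8*4+8 = 40 as the displacement.
         * No "i" operand is needed for the constant. */
        "movl " STR(D_IDX) "*4+8(%1), %0 \n\t"
        : "=r"(out)
        : "r"(base)
        : "memory"
    );
#else
    /* Portable fallback (assumption): same byte offset, in elements. */
    out = base[D_IDX + 2];
#endif
    return out;
}
```

Since the offset is pasted into the template as text, the constant never
shows up as an asm operand, so the compiler has no argument to allocate
for it.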
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Observe your enemies, for they first find out your faults. -- Antisthenes