[FFmpeg-cvslog] r25254 - trunk/libavcodec/x86/h264dsp_mmx.c

Ronald S. Bultje rsbultje
Thu Sep 30 00:16:16 CEST 2010


Hi,

On Wed, Sep 29, 2010 at 4:05 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Wed, Sep 29, 2010 at 02:10:23PM -0400, Ronald S. Bultje wrote:
>> On Wed, Sep 29, 2010 at 1:41 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> > On Wed, Sep 29, 2010 at 1:11 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> On Wed, Sep 29, 2010 at 12:19:45PM -0400, Ronald S. Bultje wrote:
>> >>> On Wed, Sep 29, 2010 at 12:03 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >>> > On Wed, Sep 29, 2010 at 04:02:32PM +0200, rbultje wrote:
>> >>> >> Author: rbultje
>> >>> >> Date: Wed Sep 29 16:02:32 2010
>> >>> >> New Revision: 25254
>> >>> >>
>> >>> >> Log:
>> >>> >> Remove d_idx as a variable, and instead load it as a constant in the asm.
>> >>> >> This has no measurable speed effect because the surrounding code doesn't
>> >>> >> take advantage of this yet.
>> >>> > [...]
>> >>> >> @@ -125,34 +124,41 @@ static av_always_inline void h264_loop_f
>> >>> >>                          "por           %%mm1, %%mm0 \n"
>> >>> >>                          "pshufw $0x4E, %%mm0, %%mm1 \n"
>> >>> >>                          "pminub        %%mm1, %%mm0 \n"
>> >>> >> -                        ::"r"(d_idx),
>> >>> >> -                          "r"(ref[0]+b_idx),
>> >>> >> -                          "r"(mv[0]+b_idx)
>> >>> >> +                        ::"r"(ref[0]+b_idx),
>> >>> >> +                          "r"(mv[0]+b_idx),
>> >>> >> +                          "i"(d_idx),
>> >>> >> +                          "i"(d_idx+40),
>> >>> >> +                          "i"(d_idx*4),
>> >>> >> +                          "i"(d_idx*4+8),
>> >>> >> +                          "i"(d_idx*4+160),
>> >>> >> +                          "i"(d_idx*4+168)
>> >>> >
>> >>> > It appears that some gcc versions have difficulty with constant
>> >>> > propagation, so I suspect that this has to be changed to a macro
>> >>> > instead of an always_inline function
>> >>>
>> >>> Grmbl, starting to think yasm is easier after all... Patch attached.
>> >>
>> >> I am starting to think that too in this case ...
>> >> either way patch ok
>> >
>> > Let's leave it as-is now, it was 3 cycles faster after all... If more
>> > stuff breaks or becomes strange, we can change it whenever we want...
>> > I was positively surprised that clang/icc worked with this code. :-).
>>
>> Hmm, sunCC is still unhappy...
>>
>> http://fate.ffmpeg.org/x86_32-linux-suncc-5.11/20100929174548/compile
>
> "h264dsp_mm.tmp.vpn.30352.i", [h264_loop_filter_strength_mmx2]:ube: error: Cannot allocate register for argument '%7' in GASM Inlining
> cc: ube failed for /home/mik/.ccache/tmp/h264dsp_mm.tmp.vpn.30352.i
> make: *** [libavcodec/x86/h264dsp_mmx.o] Error 2
>
> There is no %7 in there, just a %a7, and that is a constant, not a register.
> So my bet here is on a compiler bug, which should be reported if anyone cares

Reported, caseID = 1880039 (it's not publicly viewable anywhere yet).
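
For anyone following along, the macro-versus-always_inline point from
earlier in the thread can be sketched roughly as below. This is only an
illustration, not the h264dsp_mmx.c code: LOAD32_AT, WORD_INDEX and
load_third_word are hypothetical names, and the sketch uses the "%c" operand
modifier (print a constant with no punctuation) rather than the "%a" that
appears in the compiler error quoted above. Because the offset is expanded
by the preprocessor, the "i" constraint sees a literal constant expression
and does not depend on the compiler propagating a function argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg code: load the 32-bit value stored at
 * (base + off), where off must be a compile-time constant. As a macro, the
 * offset expression reaches the "i" constraint as a literal, so it should
 * not need constant propagation through an inlined function. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define LOAD32_AT(dst, base, off)                 \
    __asm__ ("movl %c2(%1), %0"                   \
             : "=r"(dst)                          \
             : "r"(base), "i"(off)                \
             : "memory")
#else
/* Plain C fallback for non-x86 targets or compilers without GNU asm. */
#define LOAD32_AT(dst, base, off) \
    ((dst) = *(const uint32_t *)((const char *)(base) + (off)))
#endif

/* Assumed for illustration: a fixed element index, playing the role that
 * d_idx plays in the patch. */
enum { WORD_INDEX = 2 };

/* Returns table[WORD_INDEX]; the byte offset WORD_INDEX * 4 is folded into
 * the instruction as a constant displacement on x86. */
static uint32_t load_third_word(const uint32_t *table)
{
    uint32_t v;

    assert(table);
    LOAD32_AT(v, table, WORD_INDEX * 4);
    return v;
}
```

Calling load_third_word() on an array of at least three 32-bit words should
return the third one. The trade-off is the usual one for macros: no type
checking on the arguments, in exchange for not relying on the optimizer.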

Ronald


