[FFmpeg-devel] fate : clang x86

Mon Aug 30 22:01:14 CEST 2010

On Mon, Aug 30, 2010 at 3:59 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Aug 30, 2010 at 12:31 PM, Alex Converse <alex.converse at gmail.com> wrote:
>> On Mon, Aug 30, 2010 at 3:29 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Mon, Aug 30, 2010 at 11:56 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>>> Hi,
>>>>
>>>> On Mon, Aug 30, 2010 at 2:20 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>> On Mon, Aug 30, 2010 at 9:43 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Tue, Aug 24, 2010 at 11:56 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>>>>>> On Tue, Aug 24, 2010 at 11:48:45AM -0400, Jason Garrett-Glaser wrote:
>>>>>>>> On Tue, Aug 24, 2010 at 11:19 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>>>>>>> > On Mon, Aug 23, 2010 at 06:57:20PM -0700, Eli Friedman wrote:
>>>>>>>> >> 2010/8/23 Måns Rullgård <mans at mansr.com>:
>>>>>>>> >> > Eli Friedman <eli.friedman at gmail.com> writes:
>>>>>>>> >> >
>>>>>>>> >> >> 2010/8/21 Måns Rullgård <mans at mansr.com>:
>>>>>>>> >> >>> Eli Friedman <eli.friedman at gmail.com> writes:
>>>>>>>> >> >>>
>>>>>>>> >> >>>> 2010/8/21 Måns Rullgård <mans at mansr.com>:
>>>>>>>> >> >>>>> castet.matthieu at free.fr writes:
>>>>>>>> >> >>>>>
>>>>>>>> >> >>>>>> Hi,
>>>>>>>> >> >>>>>>
>>>>>>>> >> >>>>>> On FreeBSD, the "-mllvm -regalloc=fast" cflags are used to make clang/llvm accept
>>>>>>>> >> >>>>>> some inline asm.
>>>>>>>> >> >>>>>>
>>>>>>>> >> >>>>>> Maybe we should do the same on Linux?
>>>>>>>> >> >>>>>
>>>>>>>> >> >>>>> I tried and failed to figure out what that flag does.  I assume it
>>>>>>>> >> >>>>> does something with the register allocator, but I'd like to know what.
>>>>>>>> >> >>>>
>>>>>>>> >> >>>> It's a workaround of sorts for
>>>>>>>> >> >>>> http://llvm.org/bugs/show_bug.cgi?id=4668 .  LLVM essentially has two
>>>>>>>> >> >>>> register allocator implementations: one is the "fast" allocator, which
>>>>>>>> >> >>>> is a local register allocator used for -O0, and the other is the
>>>>>>>> >> >>>> "linear scan" allocator, which is the slower global register allocator
>>>>>>>> >> >>>> used for -O1+.  "-mllvm -regalloc=fast" forces the use of the "fast"
>>>>>>>> >> >>>> allocator, which leads to slower generated code, but isn't affected by
>>>>>>>> >> >>>> the bug in question.
>>>>>>>> >> >>>
>>>>>>>> >> >>> Sounds like it's not suitable for production use.  Any chance they'll
>>>>>>>> >> >>> fix the bug?
>>>>>>>> >> >>
>>>>>>>> >> >> In the near future?  Not very likely... from what I understand, it's a
>>>>>>>> >> >> relatively difficult issue to solve, and bugs rejecting valid inline
>>>>>>>> >> >> asm are generally considered low priority for the people who know the
>>>>>>>> >> >> register allocator well enough to fix this.
>>>>>>>> >> >
>>>>>>>> >> > That leaves two options:
>>>>>>>> >> >
>>>>>>>> >
>>>>>>>> >> > 1. Declare clang officially unsupported for x86_32.
>>>>>>>> >
>>>>>>>> > ok with me
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >> > 2. Fix our code.
>>>>>>>> >>
>>>>>>>> >> Attached attempts option 2.  It's essentially a straight port of the
>>>>>>>> >> inline asm to yasm, with a couple minor changes to reduce the number
>>>>>>>> >> of arguments.  This removes all the inline asm blocks from
>>>>>>>> >> mpegvideo_mmx_template.c, which allows clang to successfully build
>>>>>>>> >> ffmpeg.  Passes regression tests.
>>>>>>>> >
>>>>>>>> > and is slower due to additional call overhead
>>>>>>>>
>>>>>>>> Why would it have additional overhead?  Both are called by the same
>>>>>>>> function pointer.
>>>>>>>
>>>>>>> -            : "+a" (last_non_zero_p1)
>>>>>>> -            : "r" (block+64), "r" (qmat), "r" (bias),
>>>>>>> -              "r" (inv_zigzag_direct16+64), "r" (temp_block+64)
>>>>>>> -        );
>>>>>>> +        last_non_zero_p1 =
>>>>>>> +            RENAMEcore(ff_dct_quantize_core_h263)(block+64, qmat, bias,
>>>>>>> +                                                 inv_zigzag_direct16+64,
>>>>>>> +                                                 temp_block+64, overflow);
>>>>>>>     }else{ // FMT_H263
>>>>>>
>>>>>> Eli, are you still working on this?
>>>>>
>>>>> Do you have any suggestions for putting the patch into an acceptable form?
>>>>
>>>> Make the called function a macro and inline it manually in yasm?
>>>
>>> Writing dct_quantize in its current form completely in yasm is a
>>> non-starter; it makes function calls, and depends on offsets into
>>> structs defined in C.
>>>
>>
>> Perhaps you should look at the yasm STRUC macro
>
> I am not going to recode MpegEncContext into yasm.
>

You don't have to recode it; just write a little script to generate
the yasm STRUC from the C struct.
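
A minimal sketch of what such a script could look like, under some loudly
stated assumptions: the struct below (DemoContext) is a made-up stand-in and
not the real MpegEncContext, the type map and the regex only handle the
simple member declarations shown, and all function names are hypothetical.
It uses Python's ctypes to compute the layout, so the offsets follow the ABI
of the machine running the script, which is not necessarily the ABI of the
build target. Padding is emitted as unlabeled resb lines so that the field
labels end up at the same offsets the C compiler would use.

```python
import ctypes
import re

# Hypothetical demo struct; it is NOT the real MpegEncContext.
DEMO_SOURCE = """
typedef struct DemoContext {
    int width;
    uint8_t flag;
    int16_t block[64];
    uint8_t *data;
    int64_t pts;
} DemoContext;
"""

# Assumed type map covering only the C types used in the demo struct.
C_TYPES = {
    "int": ctypes.c_int,
    "uint8_t": ctypes.c_uint8,
    "int16_t": ctypes.c_int16,
    "int64_t": ctypes.c_int64,
}

# Matches simple members like "int16_t block[64];" or "uint8_t *data;".
MEMBER_RE = re.compile(
    r"^\s*(\w+)(?:\s+|\s*(\*)\s*)(\w+)\s*(?:\[(\d+)\])?\s*;"
)


def parse_members(c_source):
    """Return a list of (field_name, ctypes_type) for simple members."""
    fields = []
    for line in c_source.splitlines():
        m = MEMBER_RE.match(line)
        if not m:
            continue  # skip "typedef struct ... {", "}", blank lines
        type_name, star, name, count = m.groups()
        ctype = ctypes.c_void_p if star else C_TYPES[type_name]
        if count:
            ctype = ctype * int(count)  # fixed-size array member
        fields.append((name, ctype))
    return fields


def build_struct(fields):
    """Let ctypes lay the fields out using the host's native C rules."""
    class Layout(ctypes.Structure):
        _fields_ = fields
    return Layout


def emit_yasm_struc(name, struct_cls):
    """Render the layout as a yasm/NASM struc ... endstruc block."""
    lines = [f"struc {name}"]
    pos = 0
    for field_name, _ in struct_cls._fields_:
        desc = getattr(struct_cls, field_name)  # has .offset and .size
        if desc.offset > pos:
            lines.append(f"    resb {desc.offset - pos}  ; padding")
        lines.append(f"    .{field_name}: resb {desc.size}")
        pos = desc.offset + desc.size
    total = ctypes.sizeof(struct_cls)
    if total > pos:
        lines.append(f"    resb {total - pos}  ; tail padding")
    lines.append("endstruc")
    return "\n".join(lines)


if __name__ == "__main__":
    layout = build_struct(parse_members(DEMO_SOURCE))
    print(emit_yasm_struc("DemoContext", layout))
```

The generated block could then be %included from the .asm file, so the asm
can address fields as [reg + DemoContext.block] instead of hard-coding
offsets. A variant that is probably more robust is a small C program built
with the target compiler that prints offsetof() and sizeof() for each field,
since that cannot disagree with the compiler about the layout.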