[FFmpeg-devel] [PATCH] yadif port to libavfitler

Baptiste Coudurier baptiste.coudurier
Mon May 24 03:03:50 CEST 2010


On 5/23/10 4:22 AM, Michael Niedermayer wrote:
> On Sun, May 23, 2010 at 04:10:57AM -0700, Baptiste Coudurier wrote:
>> Hi Michael,
>>
>> On 5/23/10 3:53 AM, Michael Niedermayer wrote:
>>> On Thu, May 20, 2010 at 07:52:31PM -0700, Baptiste Coudurier wrote:
>>>> On 05/20/2010 01:37 AM, Baptiste Coudurier wrote:
>>>>> Hi guys,
>>>>>
>>>>> Here is my first attempt to port yadif to libavfilter.
>>>>>
>>>>> I chose the request_frame paradigm since the filter has delay.
>>>>> What happens when no prev frame is available ? Original code seems to
>>>>> read from
>>>
>>> read from?
>>> (it obviously should use spatial/directional interpolation for the very
>>>    first frame)
>>>
>>>
>>>>>
>>>>> I'd like extensive review if possible and especially instructions on how
>>>>> to retrieve cpu flags at runtime for MMX.
>>>
>>> hmm, i guess we need to move libavcodec/x86/cpuid.c to libavutil
>>>
>>>
>>>>>
>>>>> Also Michael, can you please check the calls to filter() ?
>>>
>>> what do you want me to check exactly?
>>
>> The parameters in the filter() call. I'm not sure I fully understand the
>> logic width fields and parity. Should it be 1 ^ tff if we output only
>> frames for now ?
>
> parity selects if we interpolate even or odd lines iirc so that should stay
> (that is from memory havnt rechecked the code)
>
>
>>
>>>>> I'm also not sure how the field output mode should be supported.
>>>
>>> for every 2nd request_frame() we would just call our sources
>>> request_frame()
>>> (the variant with calling 2 start/slice/end in one request_frame() might
>>> work
>>>    too but i thinks its less ideal)
>>>
>>>
>>>>
>>>> Patch updated, offseting the buffers as needed (please double check), and
>>>> memleak free :>
>>>> Questions still stands :)
>>>> The filter works without copying any input picture which is nice.
>>>
>>> does md5 match to mplayers ?
>>
>> Is there a md5 in ffmpeg available already ? Otherwise I can cook something
>> up.
>
> hmm, seems not, i am just spotting crcenc.c which uses adler32 and our
> md5 utility code

Thanks to Reimar we have one now, framemd5 should be applied asap, it's 
very useful when porting filters.

>>
>>> [...]
>>>> +static void filter_line_mmx2(AVFilterContext *ctx, uint8_t *dst,
>>>> +                             uint8_t *prev, uint8_t *cur, uint8_t *next,
>>>> +                             int w, int refs, int parity)
>>>> +{
>>>> +    YADIFContext *yadif = ctx->priv;
>>>> +    static const uint64_t pw_1 = 0x0001000100010001ULL;
>>>> +    static const uint64_t pb_1 = 0x0101010101010101ULL;
>>>> +    const int mode = yadif->mode;
>>>> +    uint64_t tmp0, tmp1, tmp2, tmp3;
>>>> +    int x;
>>>> +
>>>> +#define FILTER\
>>>> +    for(x=0; x<w; x+=4){\
>>>> +        __asm__ volatile(\
>>>> +            "pxor      %%mm7, %%mm7 \n\t"\
>>>> +            LOAD4("(%[cur],%[mrefs])", %%mm0) /* c = cur[x-refs] */\
>>>> +            LOAD4("(%[cur],%[prefs])", %%mm1) /* e = cur[x+refs] */\
>>>> +            LOAD4("(%["prev2"])", %%mm2) /* prev2[x] */\
>>>> +            LOAD4("(%["next2"])", %%mm3) /* next2[x] */\
>>>> +            "movq      %%mm3, %%mm4 \n\t"\
>>>> +            "paddw     %%mm2, %%mm3 \n\t"\
>>>> +            "psraw     $1,    %%mm3 \n\t" /* d = (prev2[x] +
>>>> next2[x])>>1 */\
>>>> +            "movq      %%mm0, %[tmp0] \n\t" /* c */\
>>>> +            "movq      %%mm3, %[tmp1] \n\t" /* d */\
>>>> +            "movq      %%mm1, %[tmp2] \n\t" /* e */\
>>>> +            "psubw     %%mm4, %%mm2 \n\t"\
>>>> +            PABS(      %%mm4, %%mm2) /* temporal_diff0 */\
>>>> +            LOAD4("(%[prev],%[mrefs])", %%mm3) /* prev[x-refs] */\
>>>> +            LOAD4("(%[prev],%[prefs])", %%mm4) /* prev[x+refs] */\
>>>> +            "psubw     %%mm0, %%mm3 \n\t"\
>>>> +            "psubw     %%mm1, %%mm4 \n\t"\
>>>> +            PABS(      %%mm5, %%mm3)\
>>>> +            PABS(      %%mm5, %%mm4)\
>>>> +            "paddw     %%mm4, %%mm3 \n\t" /* temporal_diff1 */\
>>>> +            "psrlw     $1,    %%mm2 \n\t"\
>>>> +            "psrlw     $1,    %%mm3 \n\t"\
>>>> +            "pmaxsw    %%mm3, %%mm2 \n\t"\
>>>> +            LOAD4("(%[next],%[mrefs])", %%mm3) /* next[x-refs] */\
>>>> +            LOAD4("(%[next],%[prefs])", %%mm4) /* next[x+refs] */\
>>>> +            "psubw     %%mm0, %%mm3 \n\t"\
>>>> +            "psubw     %%mm1, %%mm4 \n\t"\
>>>> +            PABS(      %%mm5, %%mm3)\
>>>> +            PABS(      %%mm5, %%mm4)\
>>>> +            "paddw     %%mm4, %%mm3 \n\t" /* temporal_diff2 */\
>>>> +            "psrlw     $1,    %%mm3 \n\t"\
>>>> +            "pmaxsw    %%mm3, %%mm2 \n\t"\
>>>> +            "movq      %%mm2, %[tmp3] \n\t" /* diff */\
>>>                                    ^^^^^^
>>> code like this does not wor with all gcc versions, (it needs a equivalent
>>> of
>>> the NAMED_ASM_ARGS from mplayer
>>
>> How should this be changed ?
>> Or can I just enable it for a specific gcc version to get it into svn
>> quickly ?
>
> depends on if you are afraid of mans
> i wouldnt mind but mans will want a proper configure check i think
>
>
>>
>>> [...]
>>>> +
>>>> +static void filter(AVFilterContext *ctx, AVFilterPicRef *dstpic,
>>>> +                   int parity, int tff)
>>>> +{
>>>> +    YADIFContext *yadif = ctx->priv;
>>>> +    int y, i;
>>>> +
>>>> +    for (i = 0; i<   3; i++) {
>>>> +        int is_chroma = !!i;
>>>> +        int w = dstpic->w>>   is_chroma;
>>>> +        int h = dstpic->h>>   is_chroma;
>>>> +        int refs = yadif->cur->linesize[i];
>>>> +
>>>> +        for (y = 0; y<   h; y++) {
>>>> +            if ((y ^ parity)&   1) {
>>>> +                uint8_t *prev =&yadif->prev->data[i][y*refs];
>>>> +                uint8_t *cur  =&yadif->cur ->data[i][y*refs];
>>>> +                uint8_t *next =&yadif->next->data[i][y*refs];
>>>> +                uint8_t *dst  =&dstpic->data[i][y*dstpic->linesize[i]];
>>>> +                yadif->filter_line(ctx, dst, prev, cur, next, w, refs,
>>>> parity ^ tff);
>>>> +            } else {
>>>> +                memcpy(&dstpic->data[i][y*dstpic->linesize[i]],
>>>> +&yadif->cur->data[i][y*refs], w);
>>>
>>> switching from fast_memcpy() (that works with SIMD) to memcpy() possibly
>>> looses speed
>>
>> Do we have a replacement ?
>
> i dont think we do.
>
>
>>
>>> [...]
>>>> +static int query_formats(AVFilterContext *ctx)
>>>> +{
>>>> +    static const enum PixelFormat pix_fmts[] = {
>>>> +        PIX_FMT_YUV420P,
>>>> +        PIX_FMT_GRAY8,
>>>> +        PIX_FMT_NONE
>>>> +    };
>>>
>>> it should be trivial to support other yuv formats
>>>
>>> other improvments that could be tried once its all in svn are to
>>> try to pass draw_slice() calls through from the source and maybe to
>>> pass get_video_buffer() through to avoid the every 2nd line memcpy.
>>
>> Well, with the delay it might be tricky.
>> In short what's required before svn inclusion ?
>
> cpuid + matching md5 to mplayer
> (id also like to see double frame output to work if possible but if thats
>   a problem we could look into it later)

I'd go for later :)

> and if fast_memcpy isnt ported we need a todo/fixme somewhere so we dont
> forget it

md5 matches, cpuid is a bit complicated to port since MM_FLAGS are 
defined in avcodec.h

Should memcpy be ported to libavutil ?

-- 
Baptiste COUDURIER
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
FFmpeg maintainer                                  http://www.ffmpeg.org



More information about the ffmpeg-devel mailing list