[FFmpeg-devel] Fw: [foms] Paper submissions to LCA

Thu Jul 16 23:45:05 CEST 2009

Alexander Strange <astrange at ithinksw.com> writes:

> On Jul 16, 2009, at 3:33 PM, Måns Rullgård wrote:
>
>> Jason Garrett-Glaser <darkshikari at gmail.com> writes:
>>
>>> The H.264 decoder isn't great because CoreAVC is a crapton faster,
>>> primarily due to better architecture, despite the fact that ffmpeg's
>>> assembly is significantly superior.
>>
>> Could we improve this?
>
> One thing is that the MPEG codecs decode an entire MB into the context
> before running idct/compensation/etc on them, instead of interleaving
> them, so it wastes some cache misses reloading them. (not to mention
> the contexts being so large in the first place)
>
> For H264 this would make the decoder bigger of course, and you might
> even have to clone more functions.

Well, is it feasible at all, or are we talking about rewriting most
of FFmpeg?
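
To make the question concrete, here is a toy sketch of the two loop
structures being compared. Everything in it is an assumption made for
illustration: parse_block() and transform_block() are hypothetical
stand-ins for entropy decoding and the IDCT, and the sizes assume six
8x8 blocks per macroblock (4:2:0). It is not how lavc is actually laid
out, only the shape of the change.

```c
#include <stdint.h>
#include <string.h>

#define BLOCKS_PER_MB 6   /* assumption: 4 luma + 2 chroma blocks (4:2:0) */
#define COEFFS        64  /* assumption: 8x8 blocks */

/* Hypothetical stand-in for entropy decoding one block. */
static void parse_block(int16_t *coef, int blk)
{
    for (int i = 0; i < COEFFS; i++)
        coef[i] = (int16_t)(blk * COEFFS + i);
}

/* Hypothetical stand-in for the inverse transform of one block. */
static void transform_block(uint8_t *dst, const int16_t *coef)
{
    for (int i = 0; i < COEFFS; i++)
        dst[i] = (uint8_t)(coef[i] & 0xff);
}

/* Two-pass: all coefficients go into a large context, then get re-read. */
static void decode_mb_two_pass(uint8_t dst[BLOCKS_PER_MB][COEFFS])
{
    static int16_t ctx[BLOCKS_PER_MB][COEFFS];

    for (int b = 0; b < BLOCKS_PER_MB; b++)
        parse_block(ctx[b], b);
    for (int b = 0; b < BLOCKS_PER_MB; b++)
        transform_block(dst[b], ctx[b]);
}

/* Interleaved: each block is consumed right after it is parsed,
 * so only one small coefficient buffer is live at a time. */
static void decode_mb_interleaved(uint8_t dst[BLOCKS_PER_MB][COEFFS])
{
    int16_t coef[COEFFS];

    for (int b = 0; b < BLOCKS_PER_MB; b++) {
        parse_block(coef, b);
        transform_block(dst[b], coef);
    }
}
```

Both variants produce the same output; the only difference is how much
coefficient data sits between the parse and the transform.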

> But there are some good easy optimizations left - making an AV_COPY64
> for 32-bit platforms (use double on PPC, SSE on x86, etc) and
> replacing the uint64_t* copies with it would help, for instance.

Do you have any idea how much this would improve things?

> I almost submitted that one already, but we can't have SSE2 out of dsp
> functions yet, since runtime cpudetection can't be turned off for lavc.

I don't see why this is an issue.  If you configure with --cpu=foo,
the compiler is allowed to use instructions specific to CPU foo
anywhere it pleases, so why shouldn't we do the same?
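
As a rough sketch of what I mean, assuming GCC or Clang (which predefine
__SSE2__ when the target flags guarantee SSE2): the helper below is
chosen purely at build time, with no runtime detection involved. The
name copy64_sketch is made up for this example and is not the proposed
AV_COPY64; the fallback simply leaves the choice to the compiler.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Target guarantees SSE2: one 64-bit load and one 64-bit store
 * through an XMM register, even on a 32-bit build. */
static inline void copy64_sketch(void *dst, const void *src)
{
    _mm_storel_epi64((__m128i *)dst,
                     _mm_loadl_epi64((const __m128i *)src));
}
#else
/* No SSE2 guaranteed by the build flags: let the compiler pick
 * the widest move available for an 8-byte copy. */
static inline void copy64_sketch(void *dst, const void *src)
{
    memcpy(dst, src, 8);
}
#endif
```

Built for a CPU without SSE2, this takes the memcpy path, so it never
emits an instruction the configured target lacks.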

-- 
Måns Rullgård
mans at mansr.com