[FFmpeg-user] slow performance with lut3d filter
elliottbalsley at gmail.com
Sun Feb 16 02:37:28 CET 2014
Wow, that's a big improvement! Thanks for working on it. How do I use your changes? Can I just pull from git master?
On Feb 13, 2014, at 5:01 AM, Clément Bœsch wrote:
> On Wed, Feb 12, 2014 at 10:29:24AM +0100, Clément Bœsch wrote:
>> On Tue, Feb 11, 2014 at 11:20:43PM +0100, Clément Bœsch wrote:
>>> On Tue, Feb 11, 2014 at 02:11:43PM -0800, Elliott Balsley wrote:
>>>> I'm getting very slow performance with the lut3d filter. Is this
>>>> normal, or is there some way to improve it? This test encode runs at
>>>> 5fps, compared to the same operation without lut3d at 40fps. CPU
>>>> usage is less than 10% during the whole encode. I'm using a 12 core
>>>> Mac Pro. Source footage is ProRes 4444 from Arri Alexa camera.
>>> Yes it's slow for several reasons:
>>> 1) it requests rgb, so there is a conversion from yuv to rgb and back
>>> 2) lut3d is cpu only for now (no GPU accel)
>>> 3) lut3d has no SIMD
>>> 4) lut3d is not threaded
>>> Threading could probably be added. SIMD might not be worth the effort, but
>>> patches welcome. GPU can be interesting, patch welcome as well I suppose.
>>> Not much to do about the first point, except add some specific optims in
>>> Feel free to open a ticket, and possibly add a bounty. Or send some
>>> patches if you're a developer.
>> I just committed a small change to make it faster; I got a few more FPS,
>> you might want to try. Still no threading, SIMD & friends, that was a
>> trivial change. We can probably make it a bit faster by removing one
>> if in the inner loop, but I'll do that later.
> Removing the if in the inner loop didn't affect the performance; the
> compiler is probably smart. OTOH, I added slice threading and went from
> 14 to 34 fps on a 4-core HT CPU with h264 1080p footage.
> You might want to give it a try.
> Clément B.
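
For readers unfamiliar with the term, slice threading means cutting each frame into horizontal bands of rows and handing each band to a separate worker. The following is only a rough sketch of that idea, not the actual lut3d code: the function names (slice_bounds, apply_lut_slice, filter_frame) are hypothetical, and a simple per-channel 1D table stands in for the real 3D LUT lookup with interpolation. The row-splitting arithmetic mirrors the pattern commonly used in libavfilter filters.

```python
from concurrent.futures import ThreadPoolExecutor


def slice_bounds(height, jobnr, nb_jobs):
    """Return the row range [start, end) for job `jobnr` out of `nb_jobs`.

    Integer division spreads any remainder rows across the jobs, and
    consecutive jobs cover the frame with no gaps and no overlap.
    """
    start = (height * jobnr) // nb_jobs
    end = (height * (jobnr + 1)) // nb_jobs
    return start, end


def apply_lut_slice(src, dst, lut, start, end):
    """Process rows start..end-1 only; each job writes disjoint rows of dst."""
    for y in range(start, end):
        # Stand-in for the real lookup: map every channel through a 1D table.
        dst[y] = [tuple(lut[c] for c in px) for px in src[y]]


def filter_frame(src, lut, nb_jobs=4):
    """Filter one frame (a list of rows of RGB tuples) using nb_jobs slices."""
    height = len(src)
    dst = [None] * height
    with ThreadPoolExecutor(max_workers=nb_jobs) as pool:
        futures = []
        for jobnr in range(nb_jobs):
            start, end = slice_bounds(height, jobnr, nb_jobs)
            futures.append(
                pool.submit(apply_lut_slice, src, dst, lut, start, end)
            )
        for f in futures:
            f.result()  # wait for the slice and re-raise any worker error
    return dst
```

Because every output pixel depends only on the matching input pixel, the slices need no locking. Note that in CPython the global interpreter lock keeps this sketch from running faster in practice; it illustrates the work split only, whereas the C implementation runs the slices on real parallel threads.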