[FFmpeg-devel] Performance of P010LE/BE pixel conversion
h.leppkes at gmail.com
Thu Sep 1 15:52:13 EEST 2016
On Thu, Sep 1, 2016 at 1:34 PM, Timo Rothenpieler <timo at rothenpieler.org> wrote:
>> On Thu, Sep 1, 2016 at 7:00 AM, Ali KIZIL <alikizil at gmail.com> wrote:
>>> Hi Oliver,
>>> I just set my DDR3 RAM speed to 2133 MHz on an i7 4960X server. It doesn't
>>> make much difference. FPS still fluctuates between 41 and 44 fps for UHD
>>> P010LE HEVC Main 10 encoding.
>>> Also, rawvideo P010LE encoding fluctuates between 39 and 42 fps. For your
>>> note: while YUV420P to P010LE runs at 39-42 fps, YUV420P to YUV420P10LE
>>> runs at about 75-76 fps:
>> I think this is expected, the p010le conversion is C (no SIMD). The
>> yuv420p10le conversion is using x86 SIMD (probably AVX).
>> To fix this, add x86 SIMD implementations of the p010le conversions in
>> swscale. Better yet, add direct conversions from yuv420p10 (which I assume
>> is the internal format of your actual source after decoding?) to p010le,
>> first C and then later x86 SIMD.
> I think 40-50 FPS is quite a nice result for UHD with the plain stupid C
> implementation.
> Also, isn't the internal representation of YUV 10bit in swscale
> essentially yuv420p10 anyway, so the conversion already is as direct as
> it gets?
The "generic" path through the internal format is still slower than
a "special" converter that converts the input directly to the
output without the generic intermediate step.
Such a converter would probably be relatively easy to write for
yuv420p10le -> p010le, and it would save some conversion time.
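
To illustrate what such a direct converter might look like, here is a
minimal plain-C sketch. It is not swscale code: the function name, the
argument list and the use of strides counted in 16-bit samples are all
made up for the example, and it assumes a little-endian host so that
uint16_t loads and stores match the "le" layouts. It relies only on the
two format definitions: yuv420p10le keeps each 10-bit sample in the low
bits of a 16-bit word in three separate planes, while p010le keeps it in
the high bits (low 6 bits zero) with U and V interleaved in one plane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT an FFmpeg/swscale API.
 * Converts planar yuv420p10 to semi-planar p010 in one pass.
 * All strides are in uint16_t samples (assumption for this sketch),
 * and a little-endian host is assumed. */
void yuv420p10_to_p010(const uint16_t *src_y, const uint16_t *src_u,
                       const uint16_t *src_v,
                       ptrdiff_t src_y_stride, ptrdiff_t src_c_stride,
                       uint16_t *dst_y, uint16_t *dst_uv,
                       ptrdiff_t dst_y_stride, ptrdiff_t dst_uv_stride,
                       int width, int height)
{
    const int chroma_w = (width  + 1) / 2;  /* 4:2:0 subsampling */
    const int chroma_h = (height + 1) / 2;

    /* Luma: move the 10-bit value into the top 10 bits. */
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst_y[x] = (uint16_t)(src_y[x] << 6);
        src_y += src_y_stride;
        dst_y += dst_y_stride;
    }

    /* Chroma: shift and interleave U and V into a single plane. */
    for (int y = 0; y < chroma_h; y++) {
        for (int x = 0; x < chroma_w; x++) {
            dst_uv[2 * x]     = (uint16_t)(src_u[x] << 6);
            dst_uv[2 * x + 1] = (uint16_t)(src_v[x] << 6);
        }
        src_u  += src_c_stride;
        src_v  += src_c_stride;
        dst_uv += dst_uv_stride;
    }
}
```

The inner loops are just a shift plus an interleave, so they should also
map fairly naturally onto SIMD shift and unpack instructions later on.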