[FFmpeg-devel] [PATCH][RFC] avcodec: disallow hwaccel with frame threads

Andy Furniss adf.lists at gmail.com
Wed Jan 20 13:14:12 CET 2016


Hendrik Leppkes wrote:
> On Wed, Jan 20, 2016 at 12:13 PM, Andy Furniss <adf.lists at gmail.com> wrote:
>> wm4 wrote:
>>>
>>> On Wed, 20 Jan 2016 09:59:12 +0000 Andy Furniss <adf.lists at gmail.com>
>>> wrote:
>>>
>>>> wm4 wrote:
>>>>>
>>>>> On Wed, 20 Jan 2016 00:42:18 +0000 Andy Furniss
>>>>> <adf.lists at gmail.com> wrote:
>>>>>
>>>>>> Hendrik Leppkes wrote:
>>>>>>
>>>>>>>>> I do not agree that it should be a warning. As outlined
>>>>>>>>> in the commit message and this thread, there are serious
>>>>>>>>> flaws with frame threading and hwaccel.
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm fine with it being an error, but since it is an API
>>>>>>>> change, it should follow the usual deprecation period to
>>>>>>>> allow downstream users time to fix it. Meanwhile it can be
>>>>>>>> a warning so that people notice the problem.
>>>>>>>
>>>>>>>
>>>>>>> It's fundamentally broken, and making it a warning would
>>>>>>> re-introduce known crashes. So no.
>>>>>>
>>>>>>
>>>>>> So are the flaws in ffmpeg or particular drivers?
>>>>>>
>>>>>> It does seem a shame perf wise, I've been testing my AMD UVD
>>>>>> decode recently and for 500 UHD frames in a really high
>>>>>> bitrate h264 file it's like -
>>>>>>
>>>>>> ffmpeg threaded = 16 sec.
>>>>>>
>>>>>> ffmpeg single thread = 20 sec.
>>>>>
>>>>>
>>>>> With or without hwaccel?
>>>>
>>>>
>>>> Both are with hwaccel. ffmpeg 2.8.4 cli
>>>>
>>>> Admittedly a very concocted benchmark with a very high bitrate
>>>> sample.
>>>>
>>>> I know on normal x264 stuff my CPU could beat GPU anyway as the
>>>> copy back seems to hurt quite a lot/UVD is for playing.
>>>>
>>>> For future GPUs that will do hevc I guess it could be more
>>>> relevant.
>>>>
>>>>>> gstreamer vaapi 14 sec.
>>>>>>
>>>>>> gstreamer omx 10 sec.
>>>>>>
>>>>>> the omx is faster as the others do nv12 -> I420 on cpu
>>>>>> (AFAICT)
>>>>>>
>>>>>> Maybe -threads 1 hurts perf by limiting the format conversion
>>>>>> as well?
>>>>>>
>>>>>> Is there a way to get whatever the h/w spits out (nv12)
>>>>>> directly? Trying to ask for nv12 seemed to get a double
>>>>>> conversion.
>>>>>
>>>>>
>>>>> Both vdpau and vaapi can retrieve image data as nv12.
>>>>
>>>>
>>>> With ffmpeg cli?
>>>
>>>
>>> ffmpeg cli doesn't support vaapi.
>>
>>
>> I know, but it does do vdpau: so to restate my question in a clearer way -
>>
>> Is it possible with ffmpeg cli -hwaccel vdpau to avoid nv12 -> I420
>> conversion?
>>
>>> With what hardware was this? What were the command lines you used?
>>
>>
>> GPU is AMD R9285 TONGA (drivers are still new/experimental) but results
>> for this test seem consistent.
>>
>> Rest of system is older Phenom II x4 3.4GHz (cpufreq forced for tests).
>> Mobo is PCIE 2.0 *Gig ram @ 1333.
>>
>> Here's a paste I made a few days ago - includes s/w test (which by
>> chance for this sample comes out the same as multithread h/w) You can
>> see from the time cpu use output I really do get h/w decode as requested.
>
> If you would run with verbose logging (-loglevel verbose), you would
> get information which format the vdpau code uses to retrieve the data,
> as well as clear confirmation that it uses hwaccel.

Thanks. I don't see the input format, but I guess the "graph 0 ..." line
is saying that something is being converted to yuv420p.

So back to one of my questions - how to avoid this?

If I specify -pix_fmt nv12 I just get a double conversion.

ffmpeg -loglevel verbose -threads 1 -hwaccel vdpau -i /mnt/ramdisk/x264-otc-2160p60-220M.mkv -f rawvideo /mnt/ramdisk/raw

ffmpeg version N-77817-gd3fe2e0 Copyright (c) 2000-2016 the FFmpeg developers
   built with gcc 5.3.0 (GCC)
   configuration: --prefix=/usr --disable-doc --enable-gpl --enable-libvpx --enable-libx265 --enable-libdcadec --enable-libmp3lame --enable-libx264
   libavutil      55. 13.100 / 55. 13.100
   libavcodec     57. 22.100 / 57. 22.100
   libavformat    57. 21.101 / 57. 21.101
   libavdevice    57.  0.100 / 57.  0.100
   libavfilter     6. 23.100 /  6. 23.100
   libswscale      4.  0.100 /  4.  0.100
   libswresample   2.  0.101 /  2.  0.101
   libpostproc    54.  0.100 / 54.  0.100
Input #0, matroska,webm, from '/mnt/ramdisk/x264-otc-2160p60-220M.mkv':
   Metadata:
     ENCODER         : Lavf57.19.100
   Duration: 00:00:08.33, start: 0.000000, bitrate: 218593 kb/s
     Stream #0:0: Video: h264 (High), 5 reference frames, yuv420p, 3840x2160 [SAR 1:1 DAR 16:9], 60 fps, 60 tbr, 1k tbn, 120 tbc (default)
     Metadata:
       ENCODER         : Lavc57.16.101 libx264
       DURATION        : 00:00:08.334000000
[graph 0 input from stream 0:0 @ 0x347b100] w:3840 h:2160 pixfmt:yuv420p tb:1/1000 fr:60/1 sar:1/1 sws_param:flags=2
Output #0, rawvideo, to '/mnt/ramdisk/raw':
   Metadata:
     encoder         : Lavf57.21.101
     Stream #0:0: Video: rawvideo, 1 reference frame (I420 / 0x30323449), yuv420p, 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 60 fps, 60 tbn, 60 tbc (default)
     Metadata:
       DURATION        : 00:00:08.334000000
       encoder         : Lavc57.22.100 rawvideo
Stream mapping:
   Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Using VDPAU -- G3DVL VDPAU Driver Shared Library version 1.0 -- on X11 display :0.0, to decode input stream #0:0.
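
For anyone following along, here is roughly what the nv12 -> I420 step I'm asking about amounts to per frame. This is only a sketch in Python, not what ffmpeg or the VDPAU driver actually runs: the function names are made up for illustration, and it assumes tightly packed planes with no stride padding, which real surfaces may not have. NV12 stores the Y plane followed by one interleaved UV plane, while I420 stores Y, then U, then V, so the conversion is a de-interleave of the chroma bytes.

```python
def frame_bytes(width: int, height: int) -> int:
    """Size in bytes of one 8-bit 4:2:0 frame (same for NV12 and I420)."""
    return width * height * 3 // 2


def nv12_to_i420(frame: bytes, width: int, height: int) -> bytes:
    """Repack one NV12 frame as I420.

    Assumes tightly packed planes (no stride padding); hypothetical helper,
    not part of any ffmpeg API.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 layouts need even width and height")
    y_size = width * height
    if len(frame) != frame_bytes(width, height):
        raise ValueError("unexpected frame size")
    y = frame[:y_size]   # luma plane is identical in both layouts
    uv = frame[y_size:]  # interleaved chroma: U0 V0 U1 V1 ...
    u = uv[0::2]         # every even byte is a U sample
    v = uv[1::2]         # every odd byte is a V sample
    return y + u + v


if __name__ == "__main__":
    # Tiny 4x2 test frame: 8 luma bytes, then U0 V0 U1 V1.
    luma = bytes(range(8))
    chroma = bytes([100, 200, 101, 201])
    out = nv12_to_i420(luma + chroma, 4, 2)
    print(list(out[8:]))             # chroma after repacking: U plane, then V plane
    print(frame_bytes(3840, 2160))   # bytes per 2160p frame
```

At 3840x2160 each frame is 12441600 bytes, so every extra conversion pass touches about 12 MB per frame, on top of the copy back from the GPU.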
