[FFmpeg-devel] [PATCH] NVENC Surface Allocation Reduction
Ben Chang
benc at nvidia.com
Wed Apr 26 01:03:34 EEST 2017
Hi Timo,
Thanks for the review. Attaching patch updated with your suggestions and answering some queries from previous email.
>Did you test if and how much it affects performance to reduce the default delay from 32 to 4?
>This was originally done because nvenc is extremely slow if you try to download the frames without some delay headroom.
I have not seen drop in perf on windows in various scenarios (encode only, cpu -> nvenc transcode, nvdec -> nvenc transcode) on several gpu arch (Kepler, Maxwell, and Pascal). In fact, in some cases, the perf increases (by 1-2 fps). I am using the fps # reported by ffmpeg in most cases. Reducing number of surfaces effectively reduces the output delay (async_depth) which I believe is why there is a decrease in encode/transcode time.
> What do you mean by "*2 for number of NVENCs"?
This is a hardcoded value for the number of NVENCs present on a GPU. Commercial gpu can have up to two (most of the time). There is no support yet to inquire number of NVENCs present on gpu with api. I have changed the comment to "multiply by 2 for number of NVENCs on gpu (hardcode to 2)" for more clear wording.
>> --- a/libavcodec/nvenc_h264.c
>> +++ b/libavcodec/nvenc_h264.c
>> @@ -79,8 +79,8 @@ static const AVOption options[] = {
>> 0, AV_OPT_TYPE_CONST, { .i64 = NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP }, 0, 0, VE, "rc" },
>> { "vbr_2pass", "Multi-pass variable bitrate mode", 0, AV_OPT_TYPE_CONST, { .i64 = NV_ENC_PARAMS_RC_2_PASS_VBR }, 0, 0, VE, "rc" },
>> { "rc-lookahead", "Number of frames to look ahead for rate-control",
>> - OFFSET(rc_lookahead), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, VE },
>> - { "surfaces", "Number of concurrent surfaces", OFFSET(nb_surfaces), AV_OPT_TYPE_INT, { .i64 = 32 }, 0, MAX_REGISTERED_FRAMES, VE },
>> + OFFSET(rc_lookahead), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, VE },
>Why the change of default here? Kinda gives up the possibility to differentiate between unset and user-set to 0.
I just thought it would make more sense to have these value be 0 or greater since they may be/are used in positive integer calculation. Currently, there are condition checks to prevent unspecified value (-1) from being used; but I feel user unspecified and user set to 0 are essentially the same thing in rc_looahead and nb_surfaces scenario.
Other things addressed as you suggested:
-remove lockCount from NvencSurface struct as it is no longer referenced
-rename temporary surface variable to tmp_surf to avoid camel casing
-rename IO_surface_queue to unused_surface_queue
-remove pointless braces
-change statement: !(ctx->nb_surfaces > 0) to ctx->nb_surfaces <= 0
-fix mixed code and declaration
Thanks!
Ben
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information. Any unauthorized review, use, disclosure or distribution
is prohibited. If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NVENC_surface_allocation_reduction_v2.patch
Type: application/octet-stream
Size: 10199 bytes
Desc: NVENC_surface_allocation_reduction_v2.patch
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170425/d58d21f0/attachment.obj>
More information about the ffmpeg-devel
mailing list