[FFmpeg-devel] [PATCH] libavcodec/videotoolbox: fix decoding of h264 streams with minor SPS changes

Sat Nov 18 01:13:59 EET 2017

On Fri, Nov 17, 2017 at 11:44 PM, Aman Gupta <ffmpeg at tmm1.net> wrote:
> On Wed, Nov 15, 2017 at 1:57 PM, Hendrik Leppkes <h.leppkes at gmail.com>
> wrote:
>
>> On Wed, Nov 15, 2017 at 10:15 PM, Aman Gupta <ffmpeg at tmm1.net> wrote:
>> > From: Aman Gupta <aman at tmm1.net>
>> >
>> > Previously the codec kept an entire copy of the SPS, and restarted the
>> > VT decoder session whenever it changed. This fixed decoding errors in
>> > [1], as described in 9519983c. On further inspection, that sample
>> > features an SPS change from High/4.0 to High/3.2 while moving from one
>> > scene to another.
>> >
>> > Yesterday I received [2], which contains minor SPS changes where the
>> > profile and level do not change. These occur frequently and are not
>> > associated with scene changes. After 9519983c, the VT decoder session
>> > is recreated unnecessarily when these are encountered, causing visual
>> > glitches.
>> >
>> > This commit simplifies the state kept in the VTContext to include just
>> > the first three bytes of the SPS, containing the profile and level
>> > details. This is populated initially when the VT decoder session is
>> > created, and used to detect changes and force a restart.
>> >
>> > This means minor SPS changes are fed directly into the existing
>> > decoder, whereas profile/level changes force the decoder session to be
>> > recreated with the new parameters.
>> >
>>
>> The profile and level are not exactly the only things that can change
>> to force a decoder to be re-created.
>> How about the frame dimensions, within the same level?
>>
>
> I compared the different SPS present in the spschange2.ts sample, and here
> is a diff:
>
>  ======= SPS =======
>   profile_idc : 100
>   constraint_set0_flag : 0
>   constraint_set1_flag : 0
>   constraint_set2_flag : 0
>   constraint_set3_flag : 0
>   constraint_set4_flag : 0
>   constraint_set5_flag : 0
>   reserved_zero_2bits : 0
>   level_idc : 40
>   seq_parameter_set_id : 0
>   chroma_format_idc : 1
>   residual_colour_transform_flag : 0
>   bit_depth_luma_minus8 : 0
>   bit_depth_chroma_minus8 : 0
>   qpprime_y_zero_transform_bypass_flag : 0
>   seq_scaling_matrix_present_flag : 1
>   log2_max_frame_num_minus4 : 0
> - pic_order_cnt_type : 1
> + pic_order_cnt_type : 0
> -   log2_max_pic_order_cnt_lsb_minus4 : 0
> +   log2_max_pic_order_cnt_lsb_minus4 : 3
>     delta_pic_order_always_zero_flag : 0
>     offset_for_non_ref_pic : 0
> -   offset_for_top_to_bottom_field : 7
> +   offset_for_top_to_bottom_field : 0
> -   num_ref_frames_in_pic_order_cnt_cycle : 7
> +   num_ref_frames_in_pic_order_cnt_cycle : 0
> - num_ref_frames : 3
> + num_ref_frames : 0
> - gaps_in_frame_num_value_allowed_flag : 0
> + gaps_in_frame_num_value_allowed_flag : 1
> - pic_width_in_mbs_minus1 : 13
> + pic_width_in_mbs_minus1 : 81
> - pic_height_in_map_units_minus1 : 2
> + pic_height_in_map_units_minus1 : 38
>   frame_mbs_only_flag : 1
>   mb_adaptive_frame_field_flag : 0
>   direct_8x8_inference_flag : 0
>   frame_cropping_flag : 0
>     frame_crop_left_offset : 0
>     frame_crop_right_offset : 0
>     frame_crop_top_offset : 0
>     frame_crop_bottom_offset : 0
> - vui_parameters_present_flag : 1
> + vui_parameters_present_flag : 0
>
> Interestingly, pic_width and pic_height do in fact already change in that
> sample. But as far as decoding with VideoToolbox is concerned, the correct
> thing to do is to keep the same decompression session instance and pass the
> new SPS NALU into the decoder along with the image slices.
>
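
To make the size question concrete, here is a minimal sketch of how the
luma frame size follows from the SPS fields shown in the dump above. The
struct and function names are hypothetical and are not FFmpeg or
VideoToolbox API. The formulas are the usual H.264 ones, and cropping is
ignored because frame_cropping_flag is 0 in both versions of the SPS.

```c
#include <assert.h>

/* Hypothetical container for the three SPS fields used below. */
typedef struct {
    unsigned pic_width_in_mbs_minus1;
    unsigned pic_height_in_map_units_minus1;
    unsigned frame_mbs_only_flag; /* 0 or 1 */
} SpsDims;

/* Luma width in pixels: macroblocks are 16 pixels wide. */
static unsigned sps_width(const SpsDims *s)
{
    return (s->pic_width_in_mbs_minus1 + 1) * 16;
}

/* Luma height in pixels before cropping. With frame_mbs_only_flag == 0
 * a map unit covers two macroblock rows, hence the (2 - flag) factor. */
static unsigned sps_height(const SpsDims *s)
{
    assert(s->frame_mbs_only_flag <= 1);
    return (2 - s->frame_mbs_only_flag)
         * (s->pic_height_in_map_units_minus1 + 1) * 16;
}
```

Fed the values from the dump (13/2 on the "-" side, 81/38 on the "+" side,
frame_mbs_only_flag 1 in both), this gives 224x48 and 1312x624, so the two
SPS versions as printed do describe different coded sizes.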

Does it actually properly output images of the new size in that case?

All I'm saying is that profile and level may not be exactly the
parameters that actually require a re-creation. Profile maybe, but
level is unlikely. And there may well be others.
Is this not documented?

- Hendrik
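
For reference, the check described in the quoted commit message (keep only
the first three SPS bytes and restart the session when they differ) could
be sketched roughly as follows. This is an illustration under stated
assumptions, not the code from the patch: SessionKey and
session_needs_restart are hypothetical names, and sps is assumed to point
at the SPS payload just after the NAL header byte, so the three bytes are
profile_idc, the constraint flags byte, and level_idc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPS_KEY_LEN 3 /* profile_idc, constraint flags, level_idc */

/* Hypothetical per-session state: the three bytes seen at creation. */
typedef struct {
    uint8_t key[SPS_KEY_LEN];
    bool    valid; /* false until a session has been created */
} SessionKey;

/* Returns true when the decoder session should be (re)created: either
 * no session exists yet, or the leading three SPS bytes changed. Any
 * other SPS change leaves the session alone. */
static bool session_needs_restart(SessionKey *state,
                                  const uint8_t *sps, size_t sps_len)
{
    assert(state != NULL);
    if (sps == NULL || sps_len < SPS_KEY_LEN)
        return false; /* too short to compare; keep the current session */
    if (state->valid && memcmp(state->key, sps, SPS_KEY_LEN) == 0)
        return false; /* same profile/flags/level: minor change at most */
    memcpy(state->key, sps, SPS_KEY_LEN);
    state->valid = true;
    return true;
}
```

Under this scheme an SPS that changes only the picture dimensions never
triggers a restart, which is exactly the case being questioned above.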