[FFmpeg-devel] [PATCH] hevc_mp4toannexb: Do not duplicate parameter sets

Tue Aug 13 09:24:00 EEST 2019

Andriy Gelman:
> Andreas,
> 
> On Sun, 21. Jul 10:47, Andreas Rheinhardt wrote:
>> Andriy Gelman:
>>> From: Andriy Gelman <andriy.gelman at gmail.com>
>>>
>>> Fixes #7799
>>>
>>> Currently, the mp4toannexb filter always inserts extradata at the start
>>> of each IRAP unit. This can lead to duplication of parameter sets if the
>>> demuxed packet from mdat atom already contains a version of the
>>> parameters.
>>>
>>> As in ticket #7799 this can also lead to decoding errors when the
>>> parameter sets of the IRAP frames are different from the ones stored in
>>> extradata.
>>>
>>> This commit avoids duplicating the parameter sets if they are already
>>> present in the demuxed packet.
>>>
>>> This commit also makes an update to the hevc-bsf-mp4toannexb fate
>>> test since the result before this patch contained duplicate vps/sps/pps
>>> nal units.
>>> ---
>>>  libavcodec/hevc_mp4toannexb_bsf.c | 9 ++++++++-
>>>  tests/fate/hevc.mak               | 2 +-
>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/libavcodec/hevc_mp4toannexb_bsf.c b/libavcodec/hevc_mp4toannexb_bsf.c
>>> index 09bce5b34c..5c27306b09 100644
>>> --- a/libavcodec/hevc_mp4toannexb_bsf.c
>>> +++ b/libavcodec/hevc_mp4toannexb_bsf.c
>>> @@ -123,6 +123,7 @@ static int hevc_mp4toannexb_filter(AVBSFContext *ctx, AVPacket *out)
>>>  
>>>      int got_irap = 0;
>>>      int i, ret = 0;
>>> +    int vps_detected, sps_detected, pps_detected = 0;
>>>  
>> You are only initiallizing pps_detected.
>>
>>>      ret = ff_bsf_get_packet(ctx, &in);
>>>      if (ret < 0)
>>> @@ -146,9 +147,15 @@ static int hevc_mp4toannexb_filter(AVBSFContext *ctx, AVPacket *out)
>>>  
>>>          nalu_type = (bytestream2_peek_byte(&gb) >> 1) & 0x3f;
>>>  
>>> +        switch (nalu_type) {
>>> +          case HEVC_NAL_VPS: vps_detected = 1; break;
>>> +          case HEVC_NAL_SPS: sps_detected = 1; break;
>>> +          case HEVC_NAL_PPS: pps_detected = 1; break;
>>> +        }
>>> +
>>>          /* prepend extradata to IRAP frames */
>>>          is_irap       = nalu_type >= 16 && nalu_type <= 23;
>>> -        add_extradata = is_irap && !got_irap;
>>> +        add_extradata = is_irap && !got_irap && !(vps_detected && sps_detected && pps_detected);
>>>          extra_size    = add_extradata * ctx->par_out->extradata_size;
>>>          got_irap     |= is_irap;
>>>  
>> There are two things that I don't like about this approach:
>> 1. Image an input file like this: VPS, SPS and PPS in extradata and
>> then after a few GOPs there is an inband PPS as part of a VPS, SPS and
>> PPS combination where the PPS differs from the PPS in the extradata.
>> After this change in parameter sets there are no further inband
>> parameter sets. With your proposal, the extradata would again be
>> inserted into access units with IRAP frames after the change in
>> extradata and it would be the old, outdated extradata, thereby
>> breaking decoding (the original mp4 file would probably not be able
>> fully seekable in this case, but that would be another story).
>> 2. If the sample in #7799 contained only the parameter set that
>> actually changed inband, your proposal would still add a whole
>> VPS+SPS+PPS combination and therefore still make the sample unplayable.
>>
> 
> Thanks for your feedback. 
> I believe I'm quite close to finishing the new patch. 
> 
> The general workflow in the updated mp4toannexb_filter function is: 
>   1. Convert input avcc packet to annexb.
>   2. Split the annexb bytestream into NAL units using ff_h2645_packet_splits function
This function is quite slow (it checks the whole NAL unit for 0x03
escapes and if it finds any, it unescapes the whole NAL unit, i.e. it
copies the NAL unit with the 0x03 escapes removed). Do you limit
unescaping to the relevant NAL units only?
>   3. If VPS/SPS/PPS nal units are present, parse them to get their id and reference to higher level parameter set.
>   4. New VPS/SPS/PPS parameter sets are compared their 'cached version'. The cached versions are stored in a struct similar to HEVCParamSets. If the new parameter set is
>   different, then update the cached version.
>   5. Update extradata if any of the cached parameter sets have changed.
If I am not mistaken, then you are not allowed to modify the extradata
after init. All you could do is send side data of
AV_PKT_DATA_NEW_EXTRADATA type. But it is probably enough to simply
add the new extradata in-band.

>   6. Insert new extradata at the first IRAP nal unit (i.e.
>   same as original code).
> 
> The only part that still remains to be done is dealing with SEI nals. What I'm unsure about is whether SEIs from past extradata should be inserted at the start of future IRAP frames. The current code seems to do this. 
> I think the consistent approach is to add SEI prefix/suffix into extradata. But, also update SEI if new version are encountered in the bytestream.
New version means same payload type, but different payload? That's at
least the only generic way of doing this I can think. But the problem
is that the persistence of the SEIs depends on many factors (involving
ids), so that I don't think that 100% correct generic way of handling
SEIs is possible. If I were you, I'd go the path of least resistance
and add the SEIs in the extradata to the output extradata and dump
these SEIs into the first access unit.

- Andreas