[FFmpeg-devel] [PATCH v6 1/3] hevc_mp4toannexb: Insert correct parameter sets before IRAP

Thu Nov 28 20:44:00 EET 2019

Andriy Gelman:
> On Thu, 28. Nov 15:28, Andreas Rheinhardt wrote:
>> Andriy Gelman:
>>> On Tue, 26. Nov 07:24, Andriy Gelman wrote:
>>>> On Tue, 26. Nov 10:52, Michael Niedermayer wrote:
>>>>> On Mon, Nov 25, 2019 at 09:35:04PM -0500, Andriy Gelman wrote:
>>>>>> On Mon, 25. Nov 01:50, Michael Niedermayer wrote:
>>>>>>> On Sun, Nov 24, 2019 at 11:29:18AM -0500, Andriy Gelman wrote:
>>>>>>>> On Sun, 24. Nov 00:02, Michael Niedermayer wrote:
>>>>>>>>> On Tue, Oct 15, 2019 at 10:50:39PM -0400, Andriy Gelman wrote:
>>>>>>>>>> From: Andriy Gelman <andriy.gelman at gmail.com>
>>>>>>>>>>
>>>>>>>>>> Fixes #7799
>>>>>>>>>>
>>>>>>>>>> Currently, the mp4toannexb filter always inserts the same extradata at
>>>>>>>>>> the start of the first IRAP unit. As in ticket #7799, this can lead to
>>>>>>>>>> decoding errors if modified parameter sets are signalled in-band.
>>>>>>>>>>
>>>>>>>>>> Decoding errors/visual artifacts are also present in the following fate-suite/hevc-conformance datasets for hevc->mp4->hevc conversion:
>>>>>>>>>>  -RAP_B_Bossen_1.bit
>>>>>>>>>>  -RPS_C_ericsson_5.bit
>>>>>>>>>>  -SLIST_A_Sony_4.bit
>>>>>>>>>>  -SLIST_B_Sony_8.bit
>>>>>>>>>>  -SLIST_C_Sony_3.bit
>>>>>>>>>>  -SLIST_D_Sony_9.bit
>>>>>>>>>>  -TSKIP_A_MS_2.bit
>>>>>>>>>>
>>>>>>>>>> This commit solves these errors by keeping track of VPS/SPS/PPS parameter sets
>>>>>>>>>> during the conversion. The correct combination is inserted at the start
>>>>>>>>>> of the first IRAP. SEIs from extradata are inserted before each IRAP.
>>>>>>>>>>
>>>>>>>>>> This commit also makes an update to the hevc-bsf-mp4toannexb fate test
>>>>>>>>>> since the result before this patch contained duplicate parameter sets
>>>>>>>>>> in-band.
>>>>>>>>>> ---
>>>>>>>>>>  libavcodec/hevc_mp4toannexb_bsf.c | 503 ++++++++++++++++++++++++++++--
>>>>>>>>>>  tests/fate/hevc.mak               |   2 +-
>>>>>>>>>>  2 files changed, 472 insertions(+), 33 deletions(-)
>>>>>>> [...]
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>> +/*
>>>>>>>>>> + * Function converts mp4 access unit into annexb
>>>>>>>>>> + * Output packet structure
>>>>>>>>>> + * VPS, SPS, PPS, [SEI(from extradata)], [SEI_PREFIX(from access unit)], IRAP, [SEI_SUFFIX]
>>>>>>>>>> + * or
>>>>>>>>>> + * [SEI_PREFIX (from access unit)], [PPS (if not already signalled)], VCL(non-irap), [SEI_SUFFIX]
>>>>>>>>>> + */
>>>>>>>>>>  static int hevc_mp4toannexb_filter(AVBSFContext *ctx, AVPacket *out)
>>>>>>>>>>  {
>>>>>>>>>>      HEVCBSFContext *s = ctx->priv_data;
>>>>>>>>>>      AVPacket *in;
>>>>>>>>>> -    GetByteContext gb;
>>>>>>>>>> -
>>>>>>>>>> -    int got_irap = 0;
>>>>>>>>>> -    int i, ret = 0;
>>>>>>>>>> +    H2645Packet pkt;
>>>>>>>>>> +    int i, j, prev_size, ret;
>>>>>>>>>> +    int irap_done;
>>>>>>>>>>  
>>>>>>>>>>      ret = ff_bsf_get_packet(ctx, &in);
>>>>>>>>>>      if (ret < 0)
>>>>>>>>>>          return ret;
>>>>>>>>>>  
>>>>>>>>>> +    /* output the annexb nalu if extradata is not parsed*/
>>>>>>>>>>      if (!s->extradata_parsed) {
>>>>>>>>>>          av_packet_move_ref(out, in);
>>>>>>>>>>          av_packet_free(&in);
>>>>>>>>>>          return 0;
>>>>>>>>>>      }
>>>>>>>>>>  
>>>>>>>>>
>>>>>>>>>> -    bytestream2_init(&gb, in->data, in->size);
>>>>>>>>>
>>>>>>>>> Is this really better without using an existing bytestream* API ?
>>>>>>>>
>>>>>>>> If I use the API, then I'd have to call bytestraem2_init for each nal unit. I
>>>>>>>> also don't use the bytestream2_get_byte function. It seemed simpler to use the WRITE_NAL macro.
>>>>>>>>
>>>>>>>> But maybe I've misunderstood your question.
>>>>>>>
>>>>>>> i had nothing really specific in mind, just the feeling that using none of
>>>>>>> the existing APIs there is a bit odd.
>>>>>>>
>>>>>>> but more specifically, what about the write side ?
>>>>>>
>>>>>> If I use the bytestream* API, then I'd have to add extra boiler plate code each time I resize the output with av_grow_packet(). I'd have to add something like: 
>>>>>>
>>>>>>     prev_size = bytestream2_tell_p(...); 
>>>>>>     bytestream2_init_writer(...);
>>>>>>     bytestream2_skip_p(..., prev_size);
>>>>>>     
>>>>>> Or maybe the API needs an extra function that reinitializes the pointers but
>>>>>> keeps the current state of the writer. 
>>>>>
>>>>> grow/realloc seems suboptimal to me for any API.
>>>>> cant you find out how much space is needed and allocate/grow just once then
>>>>> init the bytestream2 on that ?
>>>>>
>>>>
>>>> ok, I'll do it this way.
>>>>
>>>
>>> After spending so much time on this patch I've just discovered that the
>>> hevc_metadata bsf can do the same job as hevc_mp4toannexb. The former also
>>> inserts correct parameter sets and so fixes #7799... :(
>>>
>>> So instead of:
>>> ./ffmpeg -i in.mp4 -bsf:v hevc_mp4toannexb -codec:v copy out.hevc
>>>
>>> The user can just run:
>>> ./ffmpeg -i in.mp4 -bsf:v hevc_metadata -codec:v copy out.hevc
>>>
>>> One difference is that hevc_metadata currently only keeps the base layers
>>> (nuh_layer_id == 0), whereas hevc_mp4toannexb copies everything (before my
>>> patch). hevc_metadata will have a slightly higher complexity as it parses the full
>>> parameter sets.
>>>
>>> Can you give me advice on best way to proceed? 
>>>
>>> If the nuh_layer_id issue is solved, can we remove the hevc_mp4toannexb filter
>>> and automatically insert hevc_metadata filter if the former is selected by
>>> the user? 
>>>
>> Hello Andriy,
>>
>> 1. hevc_metadata inserts parameter sets? That is absolutely new to me.
>> How did you check this? (It does not insert parameter sets, but it
>> outputs annex b, so the sample from #7799 is left alone by
>> hevc_mp4toannexb after hevc_metadata, hence the updated in-band
>> extradata is not masked by non-updated header parameter sets).
> 
> You are right, it doesn't insert the referenced parameter sets,
> but just doesn't overwrite them by the inserting the old extradata. 
> 
> I suppose if the media is spliced, then my patch would be still be better
> than hevc_metadata because all the parameter are guaranteed to be in the cvs.
> But there's still overlap imo.
> 
There is overlap in that both output annex b and that cbs could be
used for your purpose if you don't care about speed, but that's about
it. E.g. if you convert a mp4 HEVC (or H.264 for that matter) without
in-band parameter sets to annex b via hevc_metadata/h264_metadata (or
filter_units) and write the output to an elementary stream, it will
not have any parameter sets and be unplayable. (I consider this a bug
in the elementary stream muxer. But low priority.)

>> 2. hevc_metadata is based upon cbs and cbs for H.264 and H.265 uses
>> the h2645_parse functions and the nuh_layer_id issue affects it, too.
> 
>> 3. cbs is (currently) slow. That is because it lets the h2645_parse
>> functions unescape the whole units and the reassembles them at the
>> end. Even though I have an idea for a patch that would greatly speed
>> this up, it will never be as fast as an approach that does not rely on
>> cbs.
> 
> I'm also using ff_h2645_packet_split in my patch, too. It gives the rbsp on all the
> NALs. I think it's possible to set some kind of a bound where only the first x bytes are needed.
> 
Yes, I have been thinking about such a thing, too. One could avoid
this by not calling ff_h2645_packet_split(), but instead calling
ff_h2645_extract_rbsp() with an artificially small length (depending
upon the type of the unit and how far away the information you want
may be).
(This means that you have to parse the sizes of the NAL units
yourself, but them being length-prefixed means that this is trivial.)

- Andreas