[FFmpeg-devel] [PATCH v6 1/3] hevc_mp4toannexb: Insert correct parameter sets before IRAP

Thu Nov 28 19:17:31 EET 2019

On Thu, 28. Nov 15:28, Andreas Rheinhardt wrote:
> Andriy Gelman:
> > On Tue, 26. Nov 07:24, Andriy Gelman wrote:
> >> On Tue, 26. Nov 10:52, Michael Niedermayer wrote:
> >>> On Mon, Nov 25, 2019 at 09:35:04PM -0500, Andriy Gelman wrote:
> >>>> On Mon, 25. Nov 01:50, Michael Niedermayer wrote:
> >>>>> On Sun, Nov 24, 2019 at 11:29:18AM -0500, Andriy Gelman wrote:
> >>>>>> On Sun, 24. Nov 00:02, Michael Niedermayer wrote:
> >>>>>>> On Tue, Oct 15, 2019 at 10:50:39PM -0400, Andriy Gelman wrote:
> >>>>>>>> From: Andriy Gelman <andriy.gelman at gmail.com>
> >>>>>>>>
> >>>>>>>> Fixes #7799
> >>>>>>>>
> >>>>>>>> Currently, the mp4toannexb filter always inserts the same extradata at
> >>>>>>>> the start of the first IRAP unit. As in ticket #7799, this can lead to
> >>>>>>>> decoding errors if modified parameter sets are signalled in-band.
> >>>>>>>>
> >>>>>>>> Decoding errors/visual artifacts are also present in the following fate-suite/hevc-conformance datasets for hevc->mp4->hevc conversion:
> >>>>>>>>  -RAP_B_Bossen_1.bit
> >>>>>>>>  -RPS_C_ericsson_5.bit
> >>>>>>>>  -SLIST_A_Sony_4.bit
> >>>>>>>>  -SLIST_B_Sony_8.bit
> >>>>>>>>  -SLIST_C_Sony_3.bit
> >>>>>>>>  -SLIST_D_Sony_9.bit
> >>>>>>>>  -TSKIP_A_MS_2.bit
> >>>>>>>>
> >>>>>>>> This commit solves these errors by keeping track of VPS/SPS/PPS parameter sets
> >>>>>>>> during the conversion. The correct combination is inserted at the start
> >>>>>>>> of the first IRAP. SEIs from extradata are inserted before each IRAP.
> >>>>>>>>
> >>>>>>>> This commit also makes an update to the hevc-bsf-mp4toannexb fate test
> >>>>>>>> since the result before this patch contained duplicate parameter sets
> >>>>>>>> in-band.
> >>>>>>>> ---
> >>>>>>>>  libavcodec/hevc_mp4toannexb_bsf.c | 503 ++++++++++++++++++++++++++++--
> >>>>>>>>  tests/fate/hevc.mak               |   2 +-
> >>>>>>>>  2 files changed, 472 insertions(+), 33 deletions(-)
> >>>>> [...]
> >>>>>>>
> >>>>>>> [...]
> >>>>>>>> +/*
> >>>>>>>> + * Function converts mp4 access unit into annexb
> >>>>>>>> + * Output packet structure
> >>>>>>>> + * VPS, SPS, PPS, [SEI(from extradata)], [SEI_PREFIX(from access unit)], IRAP, [SEI_SUFFIX]
> >>>>>>>> + * or
> >>>>>>>> + * [SEI_PREFIX (from access unit)], [PPS (if not already signalled)], VCL(non-irap), [SEI_SUFFIX]
> >>>>>>>> + */
> >>>>>>>>  static int hevc_mp4toannexb_filter(AVBSFContext *ctx, AVPacket *out)
> >>>>>>>>  {
> >>>>>>>>      HEVCBSFContext *s = ctx->priv_data;
> >>>>>>>>      AVPacket *in;
> >>>>>>>> -    GetByteContext gb;
> >>>>>>>> -
> >>>>>>>> -    int got_irap = 0;
> >>>>>>>> -    int i, ret = 0;
> >>>>>>>> +    H2645Packet pkt;
> >>>>>>>> +    int i, j, prev_size, ret;
> >>>>>>>> +    int irap_done;
> >>>>>>>>  
> >>>>>>>>      ret = ff_bsf_get_packet(ctx, &in);
> >>>>>>>>      if (ret < 0)
> >>>>>>>>          return ret;
> >>>>>>>>  
> >>>>>>>> +    /* output the annexb nalu if extradata is not parsed*/
> >>>>>>>>      if (!s->extradata_parsed) {
> >>>>>>>>          av_packet_move_ref(out, in);
> >>>>>>>>          av_packet_free(&in);
> >>>>>>>>          return 0;
> >>>>>>>>      }
> >>>>>>>>  
> >>>>>>>
> >>>>>>>> -    bytestream2_init(&gb, in->data, in->size);
> >>>>>>>
> >>>>>>> Is this really better without using an existing bytestream* API ?
> >>>>>>
> >>>>>> If I use the API, then I'd have to call bytestraem2_init for each nal unit. I
> >>>>>> also don't use the bytestream2_get_byte function. It seemed simpler to use the WRITE_NAL macro.
> >>>>>>
> >>>>>> But maybe I've misunderstood your question.
> >>>>>
> >>>>> i had nothing really specific in mind, just the feeling that using none of
> >>>>> the existing APIs there is a bit odd.
> >>>>>
> >>>>> but more specifically, what about the write side ?
> >>>>
> >>>> If I use the bytestream* API, then I'd have to add extra boiler plate code each time I resize the output with av_grow_packet(). I'd have to add something like: 
> >>>>
> >>>>     prev_size = bytestream2_tell_p(...); 
> >>>>     bytestream2_init_writer(...);
> >>>>     bytestream2_skip_p(..., prev_size);
> >>>>     
> >>>> Or maybe the API needs an extra function that reinitializes the pointers but
> >>>> keeps the current state of the writer. 
> >>>
> >>> grow/realloc seems suboptimal to me for any API.
> >>> cant you find out how much space is needed and allocate/grow just once then
> >>> init the bytestream2 on that ?
> >>>
> >>
> >> ok, I'll do it this way.
> >>
> > 
> > After spending so much time on this patch I've just discovered that the
> > hevc_metadata bsf can do the same job as hevc_mp4toannexb. The former also
> > inserts correct parameter sets and so fixes #7799... :(
> > 
> > So instead of:
> > ./ffmpeg -i in.mp4 -bsf:v hevc_mp4toannexb -codec:v copy out.hevc
> > 
> > The user can just run:
> > ./ffmpeg -i in.mp4 -bsf:v hevc_metadata -codec:v copy out.hevc
> > 
> > One difference is that hevc_metadata currently only keeps the base layers
> > (nuh_layer_id == 0), whereas hevc_mp4toannexb copies everything (before my
> > patch). hevc_metadata will have a slightly higher complexity as it parses the full
> > parameter sets.
> > 
> > Can you give me advice on best way to proceed? 
> > 
> > If the nuh_layer_id issue is solved, can we remove the hevc_mp4toannexb filter
> > and automatically insert hevc_metadata filter if the former is selected by
> > the user? 
> > 
> Hello Andriy,
> 
> 1. hevc_metadata inserts parameter sets? That is absolutely new to me.
> How did you check this? (It does not insert parameter sets, but it
> outputs annex b, so the sample from #7799 is left alone by
> hevc_mp4toannexb after hevc_metadata, hence the updated in-band
> extradata is not masked by non-updated header parameter sets).

You are right, it doesn't insert the referenced parameter sets,
but just doesn't overwrite them by the inserting the old extradata. 

I suppose if the media is spliced, then my patch would be still be better
than hevc_metadata because all the parameter are guaranteed to be in the cvs.
But there's still overlap imo.

> 2. hevc_metadata is based upon cbs and cbs for H.264 and H.265 uses
> the h2645_parse functions and the nuh_layer_id issue affects it, too.

> 3. cbs is (currently) slow. That is because it lets the h2645_parse
> functions unescape the whole units and the reassembles them at the
> end. Even though I have an idea for a patch that would greatly speed
> this up, it will never be as fast as an approach that does not rely on
> cbs.

I'm also using ff_h2645_packet_split in my patch, too. It gives the rbsp on all the
NALs. I think it's possible to set some kind of a bound where only the first x bytes are needed.

> 4. Furthermore, cbs is very picky and drops whole packets because of
> minor errors. Not what you would want in an auto-inserted bsf.

I've only glanced over cbs so can't really comment.

Thanks,
-- 
Andriy