[FFmpeg-devel] [PATCH 3/3] avcodec/h264_parser: use the keyframe heuristics flag

Sun Jul 18 06:01:47 EEST 2021

On 7/17/2021 11:19 PM, Andreas Rheinhardt wrote:
> James Almer:
>> On 7/17/2021 9:30 PM, Andreas Rheinhardt wrote:
>>> James Almer:
>>>> Tag only packets containing with IDR pictures as keyframes by
>>>> default, same as
>>>> the decoder.
>>>> Fixes MP4 and Matroska muxers marking incorrect samples as Sync
>>>> Samples in
>>>> scenarios where this AVParser is used.
>>>>
>>>
>>> 1. Could you please explain where you got the info that Matroska
>>> keyframes need to be ISOBMFF Sync Samples from? Because it's actually
>>> undefined what it exactly is; in case of AV1 (the only codec with a
>>> detailed codec mapping) said mapping allows delayed random access points
>>> to be marked as keyframes (without providing anything to actually signal
>>> that these are delayed random access points), so a key frame is simply a
>>> random access point. And that is how it is de-facto handled with all
>>> other codecs as well. IMO it is ISOBMFF which is the outlier here.
>>
>> It was an assumption considering the Matroska h264 encapsulation is
>> taken verbatim from ISOBMFF. But you're right, MKVToolnix does mark
>> those as key.
>>
> 
> It's not taken verbatim -- Matroska doesn't have an analog to the sync
> samples table; actually, it is not even guaranteed that cues always
> reference keyframes (MKVToolNix even allows to reference all blocks!),
> although (lacking an alternative) demuxers operate under this assumption.
> 
>>>
>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>> ---
>>>>    libavcodec/h264_parser.c           | 16 ++++++++++------
>>>>    tests/fate-run.sh                  |  4 ++--
>>>>    tests/fate/ffmpeg.mak              |  2 +-
>>>>    tests/fate/lavf-container.mak      | 12 ++++++------
>>>>    tests/fate/matroska.mak            |  2 +-
>>>>    tests/ref/fate/copy-trac2211-avi   |  2 +-
>>>>    tests/ref/fate/matroska-h264-remux |  4 ++--
>>>>    tests/ref/fate/segment-mp4-to-ts   | 10 +++++-----
>>>>    tests/ref/lavf-fate/h264.mp4       |  4 ++--
>>>>    9 files changed, 30 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
>>>> index d3c56cc188..532dc462b0 100644
>>>> --- a/libavcodec/h264_parser.c
>>>> +++ b/libavcodec/h264_parser.c
>>>> @@ -344,9 +344,11 @@ static inline int
>>>> parse_nal_units(AVCodecParserContext *s,
>>>>                get_ue_golomb_long(&nal.gb);  // skip first_mb_in_slice
>>>>                slice_type   = get_ue_golomb_31(&nal.gb);
>>>>                s->pict_type = ff_h264_golomb_to_pict_type[slice_type %
>>>> 5];
>>>> -            if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
>>>> -                /* key frame, since recovery_frame_cnt is set */
>>>> -                s->key_frame = 1;
>>>> +            if (s->flags & PARSER_FLAG_USE_KEYFRAME_HEURISTICS) {
>>>> +                if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
>>>> +                    /* key frame, since recovery_frame_cnt is set */
>>>> +                    s->key_frame = 1;
>>>> +                }
>>>
>>> 2. This change means that files that don't use IDR pictures, but only
>>> recovery point SEIs won't have packets marked as keyframes at all any
>>> more; this affects every open-gop video. This is way worse than an
>>> incorrect sync sample list created by the mp4 muxer.
>>
>> I worked around this for the mpegts muxer as Kieran requested, but if
>> Matroska is the same then the situation changes.
>>
>> I can add a user settable demuxer option for the autoinserted parser
>> instead of hardcoding the behavior, and leave the current behavior as
>> the default.
>>
> 
> So the mp4 muxer would create invalid files by default and in case this
> flag is set, it instead creates crippled files (where open gop random
> access points are not marked as such)? Does not sound good to me.
> An actual fix is to make the muxer aware of the difference between IDR
> and ordinary keyframes (there is already a MOV_PARTIAL_SYNC_SAMPLE for
> the latter; and there is already some parsing in case of MPEG-2).

The current code uses this flag to write stps atoms, and at least for 
h264 that's not apparently what should be done (We need roll sample 
groups instead).

> 
> (As much as I like to reuse the parsers for this, I don't really know
> what exactly the parser should additionally return. Should this be
> solved, we might need to add another field to AVPacket (given that the
> new parsing API is supposed to be packet-based, said information should
> be stored there; alternatively, one could also use more of AV_PKT_FLAG_*).

A packet flag would work to signal that the packet contains an ordinary 
key frame, but not to propagate values like recovery_frame_cnt (to write 
roll distance). That would probably require a side data struct.

> We do not even have to wait for the new parser API to get this
> information out of the parser: We can just add new fields to
> AVCodecParserContext; but we need to agree on whether this information
> should be part of AVPacket and if so, how.)

It would need to be part of AVPacket if we do it as something 
AVCodecParsers and demuxers can output and propagate. Even a bsfs 
inserted by the muxer would require whatever we get out of it to be 
attached to the packet. But we could maybe avoid this with a new bsfs 
based parser API where for example reduced scope CBS-style 
codec-specific structs could be offered, and used by muxers in a similar 
way some of them are currently using bsfs.

> 
> - Andreas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>