[FFmpeg-devel] [PATCH] H.264/AVCHD interlaced fixes
Sat Jan 31 17:10:05 CET 2009
On Sat, Jan 31, 2009, Ivan Schreter wrote:
> Hi *,
> here the patch to correct some problems when working with interlaced
> H.264/AVCHD files. Two primary problems are addressed:
> - lack of key frame support in H.264 decoder (this is also relevant
> for progressive)
> - proper handling of field pictures for interlaced mode (two fields
> coded as separate pictures instead of one picture per frame)
> Please review it and eventually apply it as-is. Note that I won't have
> much time to fix anything in near future, as my first child is going to
> be born in next couple of days. So it would be good to get through with
> this fix before that happens :-)
> How it works?
> To support key frames, SEI message recovery point is decoded and recovery
> frame count stored on the context. This is then used to set key frame flag
> in decoding process.
You are misusing the semantics of the SEI recovery point.
Section D.2.7 of ITU-T H.264 says:
"The recovery point SEI message assists a decoder in determining when the decoding process will produce acceptable
pictures for display after the decoder initiates random access or after the encoder indicates a broken link in the
sequence. When the decoding process is started with the access unit in decoding order associated with the recovery
point SEI message, all decoded pictures at or subsequent to the recovery point in output order specified in this SEI
message are indicated to be correct or approximately correct in content. Decoded pictures produced by random access at
or before the picture associated with the recovery point SEI message need not be correct in content until the indicated
recovery point, and the operation of the decoding process starting at the picture associated with the recovery point SEI
message may contain references to pictures not available in the decoded picture buffer."
So a recovery_frame_cnt >= 0 does not mean that the frame is a key frame. It
means that if you reset the decoder, start decoding at the picture carrying the
SEI, and throw away the first N decoded frames (output in presentation order),
then from that point on you have frames acceptable for display.
For example, with a simple GOP structure using standard I/P frames, you can have:
Type:               I  P  P  P  I
recovery_frame_cnt: 0  3  2  1  0
I think that the only safe case is when recovery_frame_cnt is 0 and
exact_match_flag is true.
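
A minimal sketch of that conservative rule. RecoveryPointSEI and both helper
functions are hypothetical names for illustration, not the layout or API of the
FFmpeg H.264 decoder; only recovery_frame_cnt and exact_match_flag come from the
standard.

```c
#include <stdbool.h>

/* Hypothetical container for the recovery point SEI fields discussed above;
 * this is NOT how FFmpeg's H.264 context stores them. */
typedef struct RecoveryPointSEI {
    bool present;            /* a recovery point SEI came with this access unit */
    int  recovery_frame_cnt; /* frames to get through before output is acceptable */
    bool exact_match_flag;   /* output after recovery matches a full decode exactly */
} RecoveryPointSEI;

/* Conservative key frame test: decoding may start here, nothing has to be
 * thrown away, and the output is exact. */
static bool is_safe_key_frame(const RecoveryPointSEI *sei)
{
    return sei->present &&
           sei->recovery_frame_cnt == 0 &&
           sei->exact_match_flag;
}

/* Number of frames (in presentation order) to discard after random access at
 * this picture, or -1 if the picture carries no recovery point at all. */
static int frames_to_discard(const RecoveryPointSEI *sei)
{
    return sei->present ? sei->recovery_frame_cnt : -1;
}
```

With the GOP above, only the two I pictures (recovery_frame_cnt 0) could pass
is_safe_key_frame(), and then only if exact_match_flag is set; the P pictures
are valid entry points but need 3, 2 and 1 frames discarded respectively.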
> In the parser, it is used to communicate (synthetic)
> picture type to libavformat's av_read_frame() routine to correctly set key
> flag and compute missing PTS/DTS timestamps.
Missing PTS/DTS can only be recreated correctly if the H.264 parser implements
a complete DPB (decoded picture buffer) handler.
I/P/B in H.264 only specify the coding tools available, not the frame
order (unlike in MPEG-2 and MPEG-4 Part 2).
For example, you can use B frames instead of P frames without changing the
order of decoding and presentation, with the B frames simply using past references.
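
To illustrate, here is a small sketch that decides whether a run of pictures
needs reordering by looking only at the picture order count (POC), which is what
defines output order in H.264, and never at the slice type. The Picture struct
and needs_reordering() are hypothetical names, and the sketch assumes all
pictures belong to one coded video sequence (POC restarts at an IDR picture).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified picture record in decoding order. */
typedef struct Picture {
    char type; /* 'I', 'P' or 'B': informational only, never consulted below */
    int  poc;  /* picture order count, defines the output order */
} Picture;

/* Returns true if presentation order differs from decoding order, i.e. if
 * some picture has to be output before one that was decoded earlier. */
static bool needs_reordering(const Picture *pics, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (pics[i].poc < pics[i - 1].poc)
            return true;
    }
    return false;
}
```

A stream coded as I B B B with POC 0 2 4 6 needs no reordering even though it
contains B frames, while I P B B with POC 0 6 2 4 does. Guessing timestamps from
the picture type alone gets the first case wrong.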
> To support interlaced mode, it was needed to collate two field pictures
> into one buffer to fulfill av_read_frame() contract - i.e., reading one
> whole frame per call.
This will limit you to supporting only a subset of H.264. Nothing prevents
an H.264 stream from first encoding two top fields and then the two bottom
fields. (I am not sure I have ever seen such streams.)
> There is one open point, though: Although it seems that top field pictures
> are always preceding matching bottom field pictures, this is not fixed in
> the standard. Current implementation relies on this.
This cannot work correctly; bottom-field-first video is common.
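
A rough sketch of pairing fields without assuming which parity comes first:
two consecutive fields are treated as one frame when they share frame_num and
have opposite parity. CodedField and fields_form_frame() are hypothetical names,
and the test is deliberately simplified; the full pairing rules in the standard
also involve conditions such as reference status and IDR pictures.

```c
#include <stdbool.h>

typedef enum FieldParity { FIELD_TOP, FIELD_BOTTOM } FieldParity;

/* Hypothetical, simplified description of one coded field picture. */
typedef struct CodedField {
    int         frame_num; /* frame_num from the slice header */
    FieldParity parity;    /* top or bottom field */
} CodedField;

/* Simplified check that two fields, consecutive in decoding order, belong to
 * the same frame. The order of the parities does not matter, so top-first and
 * bottom-first material are handled alike. */
static bool fields_form_frame(const CodedField *first, const CodedField *second)
{
    return first->frame_num == second->frame_num &&
           first->parity != second->parity;
}
```

A parser that collates fields this way still cannot merge a stream that codes
two top fields back to back, since those never form a pair; it would have to
pass such fields through individually.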