[FFmpeg-devel] [PATCH] AVCHD/H.264 parser: determination of frame type, question about timestamps
Sat Jan 31 23:50:09 CET 2009
On Mon, Jan 26, 2009 at 08:42:17AM +0100, Ivan Schreter wrote:
> >> [...]
> >>>> which is IMHO broken. Of course, we could communicate with it by setting
> >>>> pict_type to FF_I_TYPE for keyframes only (IDR frames and frames after
> >>>> recovery point), for other frames containing I- and P-slices to
> >>>> FF_P_TYPE and for B-frames to FF_B_TYPE. But I don't like it much. Any
> >>>> idea, how to do it correctly without the need to touch other codecs?
> >>> pict_type from the parse context should likely be split in pict_type and
> >>> keyframe
> >> Actually, we already have a flag field on AVPacket coming from the
> >> parser, but compute_pkt_fields() doesn't believe it and resets it based
> >> on pict_type from parse context instead.
> > the parser works with char* buffers not AVPackets
> Yes, sorry, I was referring to av_read_frame(), which returns AVPacket.
> However, why do we need pict_type at all? I/P/B-frames are
> MPEG-specific. Actually, I believe we should change it and return two
> flags - delayed and key frame. This would make it IMHO cleaner and more
> general than testing for pict_type.
i dont mind such a change if its tested a little and works
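
A minimal sketch of the split suggested above (a picture type plus a separate keyframe flag), assuming the parser already knows which slice types a picture contains. All names here (SketchPictureInfo, SketchParseResult, sketch_classify, the SKETCH_* constants) are hypothetical and are not part of the FFmpeg API; the keyframe rule (IDR, or first picture after a recovery point) is the one described in the quoted text.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for FF_I_TYPE / FF_P_TYPE / FF_B_TYPE. */
enum SketchPictType { SKETCH_I_TYPE = 1, SKETCH_P_TYPE, SKETCH_B_TYPE };

/* Hypothetical summary of what the parser learned about one picture. */
struct SketchPictureInfo {
    bool is_idr;               /* picture is an IDR picture */
    bool after_recovery_point; /* first picture after a recovery point */
    bool has_p_slices;         /* picture contains at least one P-slice */
    bool has_b_slices;         /* picture contains at least one B-slice */
};

/* Hypothetical parser output: type and keyframe flag kept separate. */
struct SketchParseResult {
    enum SketchPictType pict_type;
    bool key_frame;
};

static struct SketchParseResult
sketch_classify(const struct SketchPictureInfo *p)
{
    struct SketchParseResult r;

    /* Keyframe status no longer depends on the slice types. */
    r.key_frame = p->is_idr || p->after_recovery_point;

    /* The picture type is derived from the slice types alone. */
    if (p->has_b_slices)
        r.pict_type = SKETCH_B_TYPE;
    else if (p->has_p_slices)
        r.pict_type = SKETCH_P_TYPE;
    else
        r.pict_type = SKETCH_I_TYPE;

    return r;
}
```

With this split, a non-IDR picture made only of I-slices can be reported as an I picture without being flagged as a keyframe, so the I type does not have to double as the keyframe marker.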
> > [...]
> >> Of course, we still have the problem of frame doubling/tripling and
> >> having 3 fields per picture, possibly with one of them repeated
> >> (pic_struct codes 5-8).
> > no we do not have a problem with this, we do not and never will duplicate
> > anything we just export the information and the app can do with it what it
> > wants, thats also exactly what we do in mpeg2
> We don't export the information, do we? But you are right, with frame
> doubling and tripling there is no problem - the application code will
> handle it by itself anyway by displaying last frame longer. As for
> having 3 fields per picture, I'm not so sure it will currently work. At
> least timing will be wrong. Consider the following:
as ive said we support this stuff in mpeg2, i see nothing fundamentally
different in the h264 case just more obfuscated documentation of it.
> We have a stream with pictures containing (T1/B1/T2==T1), (B2/T3/B3==B2)
> fields. That's two H.264 pictures, but 3 frames. Each av_read_frame()
> should return a packet containing exactly one frame. But we have just
> 2 packets, which need to be returned in 3 calls to av_read_frame(),
> according to API. Further, the DTS must be set correctly as well for the
> three AVPackets in order to get the timing correct. How do you want to
> handle this?
i dont see where you get 3 calls of av_read_frame(),
there are 2 or 4 access units not 3 unless one is coded as 2 fields
and 1 is a frame
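
To make the timing side concrete, here is a rough sketch of how the number of field periods per access unit could be derived from pic_struct (values 0 to 8 as in the H.264 picture timing SEI) and accumulated into start times. It assumes output order equals the order given and a constant field duration; the function names are hypothetical and this is not how FFmpeg itself is written.

```c
#include <stdint.h>

/* Field periods covered by one access unit for pic_struct 0..8:
 * 0 frame, 1 top field, 2 bottom field, 3 top+bottom, 4 bottom+top,
 * 5 top+bottom+top repeated, 6 bottom+top+bottom repeated,
 * 7 frame doubling, 8 frame tripling.
 * Returns -1 for values outside that range. */
static int sketch_field_periods(int pic_struct)
{
    static const int periods[9] = { 2, 1, 1, 2, 2, 3, 3, 4, 6 };

    if (pic_struct < 0 || pic_struct > 8)
        return -1;
    return periods[pic_struct];
}

/* Hypothetical helper: write one start time per access unit into out[],
 * one access unit per packet, nothing duplicated.
 * Returns the time at which the last unit ends, or INT64_MIN on error. */
static int64_t sketch_assign_start_times(const int *pic_struct, int n,
                                         int64_t start,
                                         int64_t field_duration,
                                         int64_t *out)
{
    int64_t t = start;
    int i;

    for (i = 0; i < n; i++) {
        int periods = sketch_field_periods(pic_struct[i]);

        if (periods < 0)
            return INT64_MIN;
        out[i] = t;                   /* this unit starts here */
        t += periods * field_duration; /* and lasts this long */
    }
    return t;
}
```

For the (T1/B1/T2==T1), (B2/T3/B3==B2) case, that is pic_struct 5 followed by 6, the two access units each last 3 field periods, so two packets cover 6 field periods (3 frame durations) without a third packet being needed.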
> And as already mentioned, the case with (T1), (B1), (T2), (B2), we are
> returning 4 packets via av_read_frame() for 2 frames, which is against
> API. How to handle this? My idea was delaying the return from h264_parse
> until the second field is also parsed
well, just consider the example that timestamps are always associated with
the second field instead of the first. You couldnt associate them with the
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
He who knows, does not speak. He who speaks, does not know. -- Lao Tsu