[FFmpeg-devel] AVCHD/H.264 decoder: further development/corrections
Ivan Schreter
Sun Feb 1 01:04:34 CET 2009
Michael Niedermayer wrote:
> On Sun, Jan 25, 2009 at 08:08:06PM +0100, Ivan Schreter wrote:
> [...]
>
>> Although the decoder itself takes this into account, the interface in
>> libavformat doesn't. Thus, currently only video having full frames per
>> packet decodes really correctly (and this also only with the
>> not-yet-applied patch concerning frame types). Reason: av_read_frame()
>> doesn't return whole frames, although it is documented to do so.
>>
>
> "decoding" of fields and even field/frame mixes works perfectly, and bitexact
> you can try the reference bitstreams ...
> what doesnt work is the timestamps and these cause the user apps o drop and
> duplicate "randomly"
>
That's what I said. The decoder as such works correctly.
>
>> *Alternative solution:* Return the field packet from h264_parse()
>> immediately, but somehow tell libavformat that the packet does not
>> represent a full frame and that the second field has to be read as well.
>> Read it in libavformat, extending the existing packet. Thus,
>> av_read_frame() then returns a full frame.
>>
>
> you might want to look at
> svn di -r12162:12161
>
I will. But the problem is, even if av_read_frame() returns proper
timestamps for field pictures (interlaced video coded as a sequence of
field pictures, as opposed to interlaced frames), it doesn't fulfill its
contract of returning whole frames to the caller. I.e., the caller still
has to call av_read_frame() twice to get one interlaced frame.
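To make the problem concrete, this is roughly what a caller has to do
today to get one displayed frame out of a field-coded stream (untested
sketch using the old-style avcodec_decode_video() call; error handling
omitted, function and variable names are just illustrative):

  #include <libavformat/avformat.h>

  /* Untested sketch: keep reading packets until the decoder signals a
   * complete picture.  With field-coded H.264 this loops twice per
   * displayed frame, which is exactly what the av_read_frame()
   * documentation does not say. */
  static int read_one_frame(AVFormatContext *fmt_ctx, AVCodecContext *dec_ctx,
                            int video_stream, AVFrame *frame)
  {
      AVPacket pkt;
      int got_picture = 0;

      while (!got_picture && av_read_frame(fmt_ctx, &pkt) >= 0) {
          if (pkt.stream_index == video_stream)
              avcodec_decode_video(dec_ctx, frame, &got_picture,
                                   pkt.data, pkt.size);
          av_free_packet(&pkt);
      }
      return got_picture;
  }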
> [...]
>
>> Now the question: Which solution is the "right" one? I'd go for the
>> first one or possibly for the alternative. The first proposed solution
>> seems to be the most "compatible", since we don't need to extend
>> AVPacket to address the issue.
>>
>> Your opinions? Or eventually a different idea?
>>
>
> The avparser for h264 should take the input timestamps from the demuxer,
> decode all the relevant SEIs and headers, and return the correctly
> "interpolated" timestamps.
>
>
> [...]
>
>> Further, I'd propose keeping a small cache of (PTS, position,
>> convergence_duration) triples for frames containing an SEI recovery
>> point message, so that seeking around the "current" location would be
>> faster. Reason: video editing software, where we often need to seek one
>> frame forward/backward.
>>
>
> see AVIndexEntry
>
>
Yes, I already found it.
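For illustration, this is roughly how I imagine the recovery-point frames
ending up in the index (hedged sketch only; how and where the SEI recovery
point is detected is assumed, and pos/pts are whatever the demuxer knows
about that frame):

  #include <libavformat/avformat.h>

  /* Sketch only: whenever a frame carrying an SEI recovery point is seen,
   * remember its byte position and timestamp so that later seeks around
   * the current location can be answered from the index instead of
   * re-scanning the stream. */
  static void cache_recovery_point(AVStream *st, int64_t pos, int64_t pts)
  {
      av_add_index_entry(st, pos, pts,
                         0,                  /* size not known here     */
                         0,                  /* distance not tracked    */
                         AVINDEX_KEYFRAME);  /* usable as a seek point  */
  }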
>> So my suggestion is to report picture type I-frame for key frames
>> (which frames are key frames is discussed above) and P-frame for all
>> frames containing only P- and I-slices. Frames also containing B-slices
>> will be reported as B-frames.
>>
>
> this is technically correct, I agree, but because it takes time and the
> information is effectively useless (there is no relation between
> pict_type and timestamps) ...
> we can take a shortcut and just use the type of the first slice
>
>
That's what I implemented in the patch I sent in another mail, except
with an added workaround to report key frames properly via the SEI
recovery point.
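In rough terms the classification looks like this (simplified, untested
sketch; names are illustrative and do not match the actual patch):

  #include <libavcodec/avcodec.h>

  /* Simplified sketch: classify the picture by the type of its first
   * slice, as suggested.  The key-frame flag is derived separately from
   * the SEI recovery point while parsing, not from the slice type. */
  static int pict_type_from_first_slice(int slice_type)
  {
      switch (slice_type % 5) {  /* slice_type as defined in the H.264 spec */
      case 2: case 4: return FF_I_TYPE;   /* I / SI slice */
      case 0: case 3: return FF_P_TYPE;   /* P / SP slice */
      default:        return FF_B_TYPE;   /* B slice      */
      }
  }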
Regards,
Ivan