[Ffmpeg-devel] [RFC] Improvement for the odd timestamp generation when parser is in use.
elupus
Mon Mar 19 01:47:29 CET 2007
Hi,
I noticed some odd behavior in libavformat when AVParser is used to make
complete frames. Timestamps of output packets get very jumpy and can't
easily be attributed to the original packet they came from.
This post ended up quite a bit longer than I expected, so here is the short
version: the proposed change to avformat/parser is to always output the
timestamp of the packet a frame started in, and to provide a byte offset
from that timestamp to where the frame starts, so that a player (or avformat
itself, if that is OK to do internally) can correct the timestamp, at least
for CBR streams.
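As a rough illustration of the proposed interface (this is purely a
hypothetical sketch of my own; the struct and the field name ts_byte_offset
are not part of the libavformat API), the parser output would carry
something like:

#include <stdint.h>

/* Hypothetical sketch of the proposed parser output; the struct and the
 * field name ts_byte_offset are mine, not part of libavformat. */
typedef struct ParsedFrameInfo {
    int64_t dts;            /* timestamp of the demux packet the frame
                               started in */
    int64_t pts;            /* ditto, for pts */
    int64_t ts_byte_offset; /* bytes from the position those timestamps
                               refer to, up to the start of this frame */
} ParsedFrameInfo;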
It's a bit hard to explain what is going on, so I'll try it using an
example. Assume we have a CBR stream whose timebase is set such that pts/dts
values can actually be interpreted as the byte position in the stream (AC3
in AVI does this, for example, and is where I found the issue). Now assume
each packet coming out of the demuxer is 10 bytes (or duration 10, since our
timebase equates the two), while each full frame the parser will find is
6 bytes. The assumption comes from a similar situation with AC3 in AVI,
where each AVI packet was about 23xx bytes and each AC3 frame 17xx-something
bytes; I don't remember the exact figures.
These are the timestamps we will then get out (best viewed with a
fixed-width font):
pk1  i_size: 10  i_dts:  0  o_size: 6  o_dts:  -  o_adts:  -  actual:  0
     i_size:  4  i_dts:  -  o_size: -  o_dts:  -  o_adts:  -  actual:  -
pk2  i_size: 10  i_dts: 10  o_size: 6  o_dts:  0  o_adts:  0  actual:  6
     i_size:  8  i_dts:  -  o_size: 6  o_dts: 10  o_adts: 10  actual: 12
     i_size:  2  i_dts:  -  o_size: -  o_dts:  -  o_adts:  -  actual:  -
pk3  i_size: 10  i_dts: 20  o_size: 6  o_dts:  -  o_adts: 16  actual: 18
     i_size:  6  i_dts:  -  o_size: 6  o_dts: 20  o_adts: 20  actual: 24
pk4  i_size: 10  i_dts: 30  o_size: 6  o_dts:  -  o_adts: 26  actual: 30
     i_size:  4  i_dts:  -  o_size: -  o_dts:  -  o_adts:  -  actual:  -
pk5  i_size: 10  i_dts: 40  o_size: 6  o_dts: 30  o_adts: 30  actual: 36
     i_size:  4  i_dts:  -  o_size: -  o_dts:  -  o_adts:  -  actual:  -
pk6  i_size: 10  i_dts: 50  o_size: 6  o_dts: 40  o_adts: 40  actual: 42
i_size: data from the demuxer (or what is left after previous output)
i_dts:  timestamp on the packet coming from the demuxer (reset to
        AV_NOPTS_VALUE after anything has been consumed)
o_size: packet out from the parser
o_dts:  timestamp out from the parser
o_adts: timestamp out from libavformat (cur_dts + duration, if the parser
        didn't give anything)
So, we get timestamps -,0,10,16,20,26,30,40 out of libavformat. This gives
the following dts differences: -,10,6,4,6,4,10, where the 16 and 26
timestamps are invented, based on the previous frame's timestamp +
duration (cur_dts). This makes it rather hard to know when to trust
timestamps coming out of libavformat.
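To make the invented timestamps concrete, here is a minimal standalone
sketch of my own, using the example's assumed numbers (10-byte demux
packets, 6-byte frames, timestamps equal to byte position). It prints, per
frame, what the proposal would report: the timestamp of the packet the
frame started in plus the byte offset; the corrected value always lands
back on the "actual" column above.

#include <stdio.h>

int main(void)
{
    /* example assumptions: 10-byte demux packets, 6-byte frames,
     * and timestamps equal to the byte position in the stream */
    for (int n = 0; n < 8; n++) {
        int frame_start = 6 * n;                 /* the "actual" column  */
        int pkt_dts     = frame_start / 10 * 10; /* dts of the demux
                                                    packet the frame
                                                    started in           */
        int offset      = frame_start - pkt_dts; /* proposed byte offset */
        printf("frame %d: pkt_dts=%2d offset=%d corrected=%2d\n",
               n, pkt_dts, offset, pkt_dts + offset);
    }
    return 0;
}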
Now, if frame sizes are small, this isn't a huge deal, as the absolute
timestamp error isn't that large. However, in the case of AC3 or DTS frames
that are to be passed directly to a rendering device, there is trouble, as
you never know which timestamp to use to sync to.
The best way I can see to handle this is to make the parser always output
the timestamp of the packet the current frame started in. (Currently it
only does this if the demux packet that resulted in the previous frame had
a timestamp and it's the first time we use data from it.) This would result
in multiple packets with the same timestamp, which might not be ideal, but
it would at least be consistent.
Now, if the above is acceptable: by making the parser also present the
number of bytes between that timestamp and the start of the frame, a player
could, at least for CBR streams, correct the timestamp of the demux packet.
The correction could even be done in avformat by default, but that might
not be good in the case of stream copy.
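For CBR streams the correction is just a bitrate rescale. A minimal
standalone sketch, with numbers of my own choosing (AC3 at 448 kbit/s in a
container with a 1/90000 timebase):

#include <stdio.h>
#include <libavutil/mathematics.h>

int main(void)
{
    /* assumed numbers: AC3 at 448 kbit/s, 1/90000 container timebase,
     * frame starting 1700 bytes after the byte the timestamp refers to */
    int64_t bytes  = 1700;
    int64_t offset = av_rescale_rnd(bytes, 90000 * 8, 448000 * 1,
                                    AV_ROUND_NEAR_INF);
    /* 1700 bytes = 13600 bits -> ~30.36 ms -> ~2732 ticks */
    printf("correction: %lld ticks\n", (long long)offset);
    return 0;
}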
I implemented the above in our player, and it works very well. However, I'm
not sure how good it is to use the parser's internal variables from the
public interface. The bad use of internals could be removed if the parser
would just give the byte offset directly. (This is not code I expect to go
into avformat; it's only here as an RFC, and for possible use by somebody
who needs to improve the accuracy of libavformat timestamps.)
AVStream* s = m_pFormatContext->streams[pkt.stream_index];
if (s->parser && s->need_parsing && s->codec->bit_rate)
{
    // START PARSER PART
    AVCodecParserContext* pc = s->parser;

    // search backwards through the parser's ring buffer for the last
    // input packet that starts at or before the current frame and
    // carries a valid timestamp
    int k = pc->cur_frame_start_index;
    for (int i = 0; i < AV_PARSER_PTS_NB; i++) {
        if (pc->frame_offset >= pc->cur_frame_offset[k]
            && pc->cur_frame_dts[k] != AV_NOPTS_VALUE)
            break;
        k = (k - 1) & (AV_PARSER_PTS_NB - 1);
    }

    // how far after this timestamp are we
    int64_t bytes = pc->frame_offset - pc->cur_frame_offset[k];
    // END PARSER PART

    // convert the byte distance into stream time units via the nominal
    // bitrate (only meaningful for CBR streams)
    int64_t offset = av_rescale_rnd(bytes, s->time_base.den * 8,
                                    s->codec->bit_rate * s->time_base.num,
                                    AV_ROUND_NEAR_INF);

    // if the packet this frame started in has a timestamp, interpolate
    // from that
    if (pc->cur_frame_dts[k] != AV_NOPTS_VALUE)
        pkt.dts = pc->cur_frame_dts[k] + offset;
    else
        pkt.dts = AV_NOPTS_VALUE;

    if (pc->cur_frame_pts[k] != AV_NOPTS_VALUE)
        pkt.pts = pc->cur_frame_pts[k] + offset;
    else
        pkt.pts = AV_NOPTS_VALUE;
}
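Note that the guard on s->codec->bit_rate means streams without a known
nominal bitrate are left alone entirely, so the interpolation is only
applied where the CBR assumption at least has a chance of holding.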
/Regards
Joakim