[FFmpeg-devel] [RFC/PATCH] Pass PRIVATE_STREAM_2 MPEG-PS packets to caller
Richard
peper03 at yahoo.com
Mon Feb 25 17:23:18 CET 2013
On 25/02/13 04:36, Michael Niedermayer wrote:
> On Mon, Feb 25, 2013 at 12:47:47AM +0100, Richard wrote:
>>
>> Specifically, I started with the problem of playing audio DVDs (in
>> the sense of a normal video DVD with primarily only audio). These
>> have a single video frame followed by an arbitrary number of audio
>> frames. At the moment, the demultiplexing code in Myth ends up
>> reading and buffering too much audio data because it is normally
>> slowed down by the availability of video buffers. Now it could be
>> argued that it should be making more use of the timecodes in the
>> audio and video packets but that is also difficult when
>> discontinuities are to be expected and there's no way to ensure the
>> demultiplexer/decoder is informed about the discontinuity ahead of
>> time (ideally 'now' but certainly not afterwards).
>>
>> For my in-depth ideas with respect to single images with audio, I
>> can point you here:
>
>> http://irc.mythtv.org/ircLog/channel/4/2013-01-24:15:41
>
> IRC> peper03: Playing with VobEdit and looking at the PCI and DSI
> IRC> structures here http://dvdnav.mplayerhq.hu/dvdinfo/ shows that we
> IRC> *can* work out how long to show a still frame for.
>
> so AVPacket.duration could be set according to this information?
> also i remember something about sequence end codes and still pictures
> they could be used too if they are there
The duration would be the duration of the video frame, not of the data
packet. The contents of the data packet would allow you to calculate
the duration of the video frame and, even then, you couldn't calculate
the total duration, only the duration until the next data packet.
The sequence end code allows you to determine *which* frame is the last
frame. That can also be done using the 'vobu_se_e_ptm' field in the PCI
packet - whichever works best in any given architecture.
Does it make sense to give a pure data packet a duration? I suppose you
could use the start and end fields of the PCI packet to indicate the
time frame the data applies to. In that case, it would be necessary to
'merge' the two packets together as the DSI packet has no time
information at all (apart from SCR).
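
A rough sketch of what such a merge step might look like, written as
standalone C rather than as an actual FFmpeg parser. The names
`NavMerger` and `nav_merger_feed` are hypothetical. The PCI size of 980
and leading zero byte come from this thread; the DSI size of 1018 and
its leading byte of 0x01 are assumptions based on the commonly
documented DVD layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCI_SIZE 980   /* PCI payload length, including the leading ID byte */
#define DSI_SIZE 1018  /* assumed DSI payload length, including the ID byte */

/* Hypothetical helper state: holds one PCI payload until its DSI arrives. */
typedef struct NavMerger {
    uint8_t buf[PCI_SIZE + DSI_SIZE];
    int     have_pci;  /* 1 once a PCI payload has been stored */
} NavMerger;

/*
 * Feed one private_stream_2 payload.
 * Returns 1 when m->buf holds a complete PCI+DSI pair, 0 otherwise.
 */
static int nav_merger_feed(NavMerger *m, const uint8_t *payload, size_t len)
{
    assert(m != NULL && payload != NULL);

    if (len == PCI_SIZE && payload[0] == 0x00) {
        /* Start (or restart) a pair with this PCI payload. */
        memcpy(m->buf, payload, PCI_SIZE);
        m->have_pci = 1;
        return 0;
    }
    if (len == DSI_SIZE && payload[0] == 0x01 && m->have_pci) {
        /* Append the DSI payload directly after the stored PCI payload. */
        memcpy(m->buf + PCI_SIZE, payload, DSI_SIZE);
        m->have_pci = 0;
        return 1;
    }
    /* Anything else: drop any half-built pair and wait for the next PCI. */
    m->have_pci = 0;
    return 0;
}
```

The merged buffer could then carry the PCI time fields as the timestamp
of the combined packet, since the DSI half has none of its own.
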
>>> for example random buffering + no timestamps just feels wrong also
>>> this feels a bit like a hack, to output a private stream raw like
>>> that. we dont do that for audio or video either, there you get clean
>>> packets one for each frame with timestamps, a codec id and various
>>> other things
>>
>> The problem is that the contents of packets with startcode '0x1bf'
>> is not uniquely defined. In the context of DVDs, it is, but given
>> any MPEG program stream, it isn't. That makes it difficult to
>> implement a parser. The PCI structure indicates the start and end
>> PTS values for the following VOBU, so the start PTS could be used,
>> but the DSI structure only references the system clock reference.
>> It might be possible to write a parser to decode the contents based
>> on 'if this byte is X and that byte is Y and those bytes are all
>> zero, then we've almost certainly got a PCI structure' but I'm not
>> convinced that is that much cleaner. The contents are simply
>> undefined so there'll always be a chance of false-positives and
>> false-negatives.
>
> all the probing code that detects formats & codecs checks if a byte
> is X and another is Y and so on. Whats your point here?
Sorry, my point was that there's no 'magic number' to definitely
identify the contents. All you could do is say 'if the length of the
packet is 980 and the first byte is zero then it's *probably* a PCI
packet'. There are probably other 'sanity checks' that could be
performed to increase the reliability but it's not as clean as having
some sort of defined identifier. If that's good enough for you, I don't
have a problem with it.
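
To make the kind of check being discussed concrete, here is a minimal
sketch in standalone C. Only the "length 980, first byte zero" test
comes from this mail. The DSI test (length 1018, first byte 0x01), the
byte offsets 0x0D and 0x11 for 'vobu_s_ptm' and 'vobu_e_ptm', and the
extra "end after start" plausibility check are assumptions for
illustration, and `guess_nav_kind` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum nav_kind { NAV_UNKNOWN, NAV_PCI, NAV_DSI };

/* Read a 32-bit big-endian value. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Guess what a private_stream_2 payload contains. This is a heuristic:
 * there is no magic number, so false positives remain possible.
 */
static enum nav_kind guess_nav_kind(const uint8_t *payload, size_t len)
{
    assert(payload != NULL);

    if (len == 980 && payload[0] == 0x00) {
        /* Assumed offsets of vobu_s_ptm and vobu_e_ptm in the payload. */
        uint32_t start = rb32(payload + 0x0D);
        uint32_t end   = rb32(payload + 0x11);
        /* Extra sanity check (an assumption): the VOBU ends after it starts. */
        return end > start ? NAV_PCI : NAV_UNKNOWN;
    }
    if (len == 1018 && payload[0] == 0x01)
        return NAV_DSI;

    return NAV_UNKNOWN;
}
```

Each additional check lowers the chance of a false positive but can
never remove it, which is the trade-off described above.
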
> we also return ac3 frames instead of private stream data ...
Yes, that thought came to me as well after I'd written my last reply.
The contents of private stream 1 packets are also undefined. The only
difference is that private stream 1 packets will also contain the usual
header information. So presumably they are determined to be AC3 packets
on the basis of sanity checks, or is there some form of magic number?
So, working on the assumption that the contents can be determined to a
sufficiently reliable degree, do you prefer the following:
1) Define codec ID 'AV_CODEC_DVD_NAV' instead of 'AV_CODEC_PRIVATE_STREAM_2'
2) Implement a parser to combine the two packets (PCI and DSI) found on DVDs
3) Set 'pts' to the 'vobu_s_ptm' field in the PCI packet
4) Set 'dts' to AV_NOPTS_VALUE
5) Potentially set 'duration' to (vobu_e_ptm - vobu_s_ptm)
?
As I said, I don't know whether it makes sense to set the duration field
for a pure data packet, but I don't have a problem doing it if that's
what you'd prefer.
> IMO
> all things for which proper fields exist like pts/dts/duration/codec_id
> and so on should be set correctly before some raw packets of any
> stream are exported just to get to their correct values.
That's fine by me. From my point of view, it's just a question of what
to do with packets that are defined as 'user-defined'. If you want to
handle them 100% correctly, you can't make any assumptions about their
contents as anyone could use them for any purpose. Almost any attempt
to determine their contents without context *could* fail. All you can
do is pass them on unprocessed so that the calling application (which
has the context required) can decode them.
On the other hand, you can take the approach that 99% of these packets
ever encountered will contain data in a certain format (in this case for
DVDs). By performing a few sanity checks, you can increase your
confidence even more so that the likelihood of misinterpreting the data
is almost zero.
The first option is purer, as you are following the specifications. The
second is potentially more useful albeit with the proviso that you can't
give a 100% guarantee that no packets will be misinterpreted.
A third option is to allow the calling application to provide the
context required. I don't know whether this is already possible and I'm
not putting it forward as a suggested change, but it would be an
alternative.
Personally, if you prefer the second option, I don't have a problem
implementing it that way. I assume that there aren't many MPEG streams out
there that use these private streams so that a couple of checks should
give sufficient confidence to interpret the contents successfully.
If you are ok with my suggestions above, I'll create a new patch to
parse and merge the packets, setting the fields as required.
Richard.