[FFmpeg-devel] [PATCH 12/18] avformat/hls: parse ID3 timestamps for elementary audio streams

Michael Niedermayer michaelni at gmx.at
Tue Dec 31 05:58:28 CET 2013

On Tue, Dec 31, 2013 at 04:46:37AM +0200, Anssi Hannula wrote:
> 31.12.2013 03:32, Michael Niedermayer wrote:
> > On Tue, Dec 31, 2013 at 03:04:33AM +0200, Anssi Hannula wrote:
> >> 30.12.2013 16:12, Michael Niedermayer wrote:
> >>> On Mon, Dec 30, 2013 at 01:14:26PM +0200, Anssi Hannula wrote:
> >>>> HLS provides MPEG TS timestamps via ID3 tags in the beginning of each
> >>>> segment of elementary audio streams.
> >>>>
> >>>> Signed-off-by: Anssi Hannula <anssi.hannula at iki.fi>
> >>>> ---
> >>>>
> >>>> This is a bit hacky, but I could not see any better way.
> >>>
> >>> this seems to break some hls streams
> >>> for example:
> >>> http://qthttp.apple.com.edgesuite.net/c123pibhargjknawdconwecown/0064/prog_index.m3u8
> >>>
> >>> didn't investigate why
> >>
> >> The ID3 tag in the beginning of each segment seems to contain an
> >> attached picture (APIC) in addition to the timestamp, so the ID3 tag is
> >> too large for the id3 hack to fully intercept (since the hack currently
> >> depends on the ID3 tag fitting in the caller-provided buffer), so the
> >> caller gets 0 bytes.
> >>
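
To make the size problem above concrete: the 10-byte ID3v2 header carries the tag body length as a 28-bit "synchsafe" integer, so the full tag length is known before the body arrives. The following is only a sketch of that calculation, not code from the patch; the names id3v2_total_size() and id3v2_fits() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID3V2_HEADER_SIZE 10

/* Hypothetical helper: returns the total tag length in bytes
 * (header + body + optional footer), or 0 if buf does not start
 * with a plausible ID3v2 header. */
static size_t id3v2_total_size(const uint8_t *buf, size_t len)
{
    size_t body, total;

    if (len < ID3V2_HEADER_SIZE || memcmp(buf, "ID3", 3) != 0)
        return 0;
    /* Version bytes are never 0xFF; size bytes never have bit 7 set. */
    if (buf[3] == 0xFF || buf[4] == 0xFF)
        return 0;
    if ((buf[6] | buf[7] | buf[8] | buf[9]) & 0x80)
        return 0;

    /* Synchsafe integer: 4 bytes, 7 significant bits each, big-endian. */
    body = ((size_t)buf[6] << 21) | ((size_t)buf[7] << 14) |
           ((size_t)buf[8] << 7)  |  (size_t)buf[9];

    total = ID3V2_HEADER_SIZE + body;
    if (buf[5] & 0x10)          /* footer flag (ID3v2.4) adds 10 bytes */
        total += 10;
    return total;
}

/* Hypothetical helper: 1 if the whole tag fits into a buffer of cap
 * bytes, 0 if it does not fit or no ID3v2 header was found. */
static int id3v2_fits(const uint8_t *buf, size_t len, size_t cap)
{
    size_t total = id3v2_total_size(buf, len);
    return total != 0 && total <= cap;
}
```

A tag carrying an APIC frame can easily be tens of kilobytes, so a check like id3v2_fits() would fail for a typical read buffer; that is the case where more bytes would have to be requested instead of returning 0 bytes to the caller.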

> >> The APIC is currently parsed to a static mjpeg stream by
> >> ff_id3v2_parse_apic() in avformat_open_input(). I guess it could
> >> theoretically change mid-stream as well, though it might not currently
> >> happen in practice (no idea, though [1] suggests it does not).

dunno about the picture but from reading "Adding Timed Metadata" on:

it sounds like these things can change

> >>
> >> Before the patch it works because avformat_open_input() reads metadata
> >> in the beginning of input. It of course will not detect the mid-stream
> >> metadata changes (in the beginning of segments), if there are such streams.
> >>
> >> I guess there are several alternative ways to fix this:
> >> a) Make the hack properly request additional bytes into its own buffer
> >>    so that the entire ID3 tag can be read, and get the picture shipped
> >>    over to the attached picture stream (and maybe any changed text
> >>    metadata as well?). Not sure how that should work with changing
> >>    pictures, though, as AV_DISPOSITION_ATTACHED_PIC says "single
> >>    packet". Or alternatively just not create it as an attachment in the
> > 
> >>    first place, but as a regular video stream. Well, I guess for now
> >>    I could just ignore all data after the first ID3 tag, and deal
> >>    with the changing stuff if it appears...
> > 
> > slightly better than ignoring would possibly be to compare to check
> > if something changed and print a warning that it's unsupported if it
> > changes
> Right.
> > 
> >> or
> >> b) Assume the timestamp we want is in the beginning of the packet (so
> >>    we do not need to request any additional bytes) and then pass the
> >>    full ID3 tag along after extracting the timestamp we wanted. However,
> >>    I guess that is a pretty big assumption, and of course this way we
> >>    miss any metadata in mid-stream ID3 tags (though that is a minor
> >>    problem as such streams may not exist).
> >> or
> >> c) Some entirely different solution to this ID3 timestamp thing
> >>    altogether, preferably something that would also allow us to drop
> >>    the manual recalculation of timestamps in this patch (by leveraging
> >>    the lavf timestamp regeneration code).
> >>   I)
> >>    I already dabbled a bit with a private HLS ID3 tagged audio demuxer
> >>    but the issue was that from the perspective of such a demuxer, the
> >>    position of the ID3 tags is only known by the AVIOContext, and I
> >>    could see no easy way to communicate the position of ID3 tags to
> >>    the demuxer externally (well, I guess I could rather easily just
> >>    access the private demuxer priv_data.. but that would again be
> >>    quite hacky and probably would make no sense to have the HLS ID3
> >>    audio demuxer even registered publicly in allformats.c..)
> > 
> > I am not sure I understand, but
> > inter-demuxer communication is possible by using either
> > AVPacket fields (there's an AVPacket.pos for the file/stream position
> >     for example)
> > side data of AVPackets
> > AVOptions on the demuxer's context itself (for this one, one needs to
> > be careful if the value can change multiple times over the lifetime
> > of a demuxer because any fifo / delay on the AVPackets could make
> > its value mismatch with the latest returned packet)
> I need communication in the "opposite" direction than AVPackets go, so
> they are not a good option AFAICS. AVOptions might do it, though, I
> think I was under the impression they are not changeable after-the-fact
> so I didn't consider it before...
> Basically the custom AVIOContext (which has access to AVFormatContext in
> this case) has to inform the demuxer of either position of ID3 data, or
> the timestamp recovered from the ID3 data.
> Since both subdemuxer .read_packet() (which needs the pos) and
> AVIOContext .read_packet() (which knows the pos) have access to
> AVIOContext buffer position, I guess the ID3 data buffer position could
> be transmitted via AVOptions to the subdemuxer (the ID3 tags appear
> usually every 10 seconds or so, so multiple ID3 tags getting buffered in
> aviobuf.c should not be an issue).
> Now the only issue is how to select the hls-id3-audio subdemuxer...
> (a) Have a regular probe() that detects the timestamp in the ID3 tag
>     in the beginning of the stream (ID3 tag needs to be made available
>     in AVProbeData for that, though, presently it is just stripped).
>     Without AVOption assistance the demuxer can only detect the first
>     ID3 tag, but it would still work quite ok (the rest of the packets
>     just get generated timestamps).
>     Then we also either have to
>     (I)   Try to parse every ID3 tag in the probe() to look for the
>           timestamp, or
>     (II)  Assume ID3 tag is uncompressed and just look for
>           "com.apple.streaming.transportStreamTimestamp" in it.
>     (III) Add some lighter-weight option to our ID3 parser to just check
>           if there is a PRIV tag (and maybe the PRIV tag owner string),
>           or
>     (IV)  Have av_probe_input_format3() do ID3 parsing and make the
>           results available in AVProbeData.
> (b) Have it without probe() and have HLS demuxer detect the ID3 tag
>     and select the demuxer manually.
> Sounds to me like (a)(IV) might be the winner here, or WDYT?
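
For reference, option (a)(II) above could look roughly like the sketch below. It assumes the tag is uncompressed and not unsynchronised, and it assumes the PRIV payload layout as I understand it from the HLS specification: the owner string, a NUL, then an eight-octet big-endian number whose low 33 bits are the MPEG-2 timestamp in 90 kHz units. The function name find_hls_id3_timestamp() is hypothetical and this is not the probe code from any patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char HLS_TS_OWNER[] =
    "com.apple.streaming.transportStreamTimestamp";

/* Hypothetical helper: scans buf for the PRIV owner string followed by
 * its terminating NUL and 8 payload bytes. On success returns 0 and
 * stores the 33-bit timestamp (90 kHz units) in *ts; returns -1 if the
 * owner string plus payload is not found. It does not parse ID3 frame
 * headers at all, which is the "pretty big assumption" of this option. */
static int find_hls_id3_timestamp(const uint8_t *buf, size_t len,
                                  int64_t *ts)
{
    const size_t owner_len = sizeof(HLS_TS_OWNER); /* includes the NUL */
    size_t i;

    for (i = 0; i + owner_len + 8 <= len; i++) {
        const uint8_t *p;
        uint64_t v = 0;
        int k;

        if (memcmp(buf + i, HLS_TS_OWNER, owner_len) != 0)
            continue;

        /* Payload: big-endian 64-bit value, upper 31 bits expected zero. */
        p = buf + i + owner_len;
        for (k = 0; k < 8; k++)
            v = (v << 8) | p[k];

        *ts = (int64_t)(v & 0x1FFFFFFFFULL);   /* keep the low 33 bits */
        return 0;
    }
    return -1;
}
```

Options (I), (III) and (IV) would replace the raw byte scan with a real walk over the ID3 frames, which is what makes them robust against compressed or unsynchronised tags.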

I am happy with anything that works and is reasonably simple & clean.
I don't think I have a deep enough understanding of ID3 tags in HLS
to really say which is the best solution.

> I'll try to implement such a hls_id3_audio demuxer, I'll see if I
> encounter any huge issues.
> >>   II)
> >>    Hmm, continuing on the subdemuxer idea, maybe a HLS ID3 audio demuxer
> >>    could even handle the segment downloading itself so it would then
> >>    know where to expect ID3 tags, sharing code with the main HLS demuxer
> >>    but without any extra communication between them.
> >>    I don't see how the probing would work, exactly, though... i.e.
> >>    how would the HLS ID3 subdemuxer be selected? Just probing for the
> >>    ID3 timestamp tag in probe() would not be enough, since at that point
> >>    the main HLS demuxer is already handling segments via custom
> >>    AVIOContext, so it would have to handover somehow, which is again
> >>    in the hack category... damn, I thought I had it this time.
> >>
> >>
> >> I guess I'm going to look at implementing (a), unless you can devise a
> >> workable idea along the lines of (c)...
> >>
> >>
> >> [1]
> >> https://developer.apple.com/library/ios/documentation/networkinginternet/conceptual/streamingmediaguide/FrequentlyAskedQuestions/FrequentlyAskedQuestions.html
> >>
> >> -- 
> >> Anssi Hannula
> -- 
> Anssi Hannula

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

He who knows, does not speak. He who speaks, does not know. -- Lao Tsu