[MPlayer-dev-eng] [PATCH] AVParser for audio support

Thu Aug 27 00:56:31 CEST 2009

On Thu, Aug 27, 2009 at 12:19:44AM +0300, Uoti Urpala wrote:
> On Wed, 2009-08-26 at 22:09 +0200, Reimar Döffinger wrote:
> > On Wed, Aug 26, 2009 at 10:52:48PM +0300, Uoti Urpala wrote:
> > > How about doing the
> > > parsing in ad_ffmpeg? Is there a reason to expect a practical benefit
> > > from doing it elsewhere?
> > 
> > Well, what is the benefit of doing it in ad_ffmpeg? My incomplete attempt
> > looked at least as bad and won't even work with mencoder and -oac copy.
> 
> At least it should avoid the "if (ds == ds->demuxer->audio)" testing and
> creation of the parser in the middle of a ds_add_packet() call,

Where else should it be created? I can't find a perfect solution:
creating it together with the stream, or some time after, wastes
resources when there are many audio/video streams that might never be
played.
Creating it when switching audio would, as far as I can tell, need a
special case for the default stream.
Nor do I see a good way to avoid the "if (ds ==
ds->demuxer->audio)", though I don't know what exactly you want to
avoid there: should it work with video too, or is it something else?
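
A minimal sketch of the "create it only when the first packet shows up"
idea, using toy types instead of the real MPlayer/libavcodec ones. All
names below (toy_stream, toy_get_parser, ...) are hypothetical and only
mirror the shape of get_parser() in the attached patch:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real parser and stream header types. */
typedef struct { int codec; } toy_parser;
typedef struct {
    int needs_parsing;   /* set by the demuxer, as in the patch */
    int codec;
    toy_parser *parser;  /* NULL until the first packet arrives */
} toy_stream;

static int toy_parsers_allocated; /* counts allocations, for illustration */

/* Return the stream's parser, creating it on first use only. */
static toy_parser *toy_get_parser(toy_stream *s)
{
    assert(s != NULL);
    if (!s->needs_parsing)
        return NULL;
    if (!s->parser) {
        s->parser = calloc(1, sizeof(*s->parser));
        if (s->parser) {
            s->parser->codec = s->codec;
            toy_parsers_allocated++;
        }
    }
    return s->parser;
}

/* Release the parser; safe to call when none was ever created. */
static void toy_stream_close(toy_stream *s)
{
    free(s->parser);
    s->parser = NULL;
}
```

A stream that never receives a packet never calls toy_get_parser(), so
it never pays for a parser; the price is that the allocation happens in
the packet path.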

> allow sane(r) handling of any possible error cases. The
> av_parser_parse2() documentation also mentions some flush-at-EOF
> functionality and that cannot really be supported at the packet add
> side.

You could probably hack in a call to packet_add before concluding it is
really EOF. Doing everything while reading packets means you need an
extra buffer, either for the extra generated packets or for the
incompletely processed packet (the same as when doing it in
ad_ffmpeg).
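
To illustrate the buffering and flush-at-EOF point, here is a small
self-contained sketch. It does not use libavcodec: toy_split() is a
made-up splitter with a fixed frame size, and the "len == 0 means
flush" convention is only assumed here to resemble what the
av_parser_parse2() documentation describes.

```c
#include <assert.h>
#include <string.h>

#define TOY_FRAME_SIZE 4 /* hypothetical fixed frame size */

typedef struct {
    unsigned char pending[TOY_FRAME_SIZE]; /* incomplete frame carried over */
    int pending_len;
} toy_splitter;

/*
 * Consume bytes from in and return how many were taken. *out_len is the
 * size of the frame copied to out, or 0 if no frame is complete yet.
 * Calling with len == 0 signals EOF and flushes the pending bytes.
 */
static int toy_split(toy_splitter *s, unsigned char *out, int *out_len,
                     const unsigned char *in, int len)
{
    int take;
    assert(len >= 0);
    *out_len = 0;
    if (len == 0) {
        memcpy(out, s->pending, s->pending_len);
        *out_len = s->pending_len;
        s->pending_len = 0;
        return 0;
    }
    take = TOY_FRAME_SIZE - s->pending_len;
    if (take > len)
        take = len;
    memcpy(s->pending + s->pending_len, in, take);
    s->pending_len += take;
    if (s->pending_len == TOY_FRAME_SIZE) {
        memcpy(out, s->pending, TOY_FRAME_SIZE);
        *out_len = TOY_FRAME_SIZE;
        s->pending_len = 0;
    }
    return take;
}

/* Feed one container packet; return the number of frames produced. */
static int toy_feed_packet(toy_splitter *s, const unsigned char *buf, int len)
{
    unsigned char out[TOY_FRAME_SIZE];
    int out_len, pos = 0, frames = 0;
    while (pos < len) {
        pos += toy_split(s, out, &out_len, buf + pos, len - pos);
        if (out_len) /* skip calls that produced no output */
            frames++;
    }
    return frames;
}
```

Feeding a 6-byte packet yields one frame and leaves 2 bytes pending.
Those bytes only come out through a later packet or a final call with
len == 0, and that final call is the step with no natural place on the
packet-add side.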

> > > Even if it's needed for multiple uses either
> > > adding a separate parsing function or doing parsing in demuxer.c
> > > functions when taking packets _out of_ the queue (so the information
> > > from the demuxer is kept intact in the queue) could be preferable.
> > 
> > On the other hand the information before parsing is unlikely to be really
> > too correct (particularly pts/dts). In case of AC3 in AVI there is also the
> > annoyance that packets seem to vary greatly in size (and thus duration)
> > before parsing.
> 
> Why is that an annoyance? BTW in that case it's probably better to only
> set the pts for the first parsed packet corresponding to a container
> packet and leave it unset for the rest (instead of creating multiple
> packets with the same pts like your current patch does).

No, the correct solution is to use the correct pts/dts (I don't actually
know which) from the parser; I just haven't figured out yet how the API
for that works.
The same is probably true for pos, though that might depend on what
exactly pos is supposed to be.
And it is an "annoyance" because the varying duration has, at least in
the past, sometimes caused "timestamp jumps" in some parts of MPlayer.
It may have been in something as unimportant as the status line.
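
The two timestamp policies under discussion can be sketched like this.
The first function is the "pts on the first parsed packet only"
suggestion. The second is merely an assumption about what per-frame
timestamps could look like when every frame has the same known duration
(an AC-3 frame carries 1536 samples); it is not how AVParser derives
its pts. TOY_NOPTS and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NOPTS (-1.0) /* hypothetical "no timestamp" marker for this sketch */

/* Suggested variant: only the first frame keeps the container pts. */
static void toy_stamp_first_only(double *pts, size_t n, double container_pts)
{
    size_t i;
    for (i = 0; i < n; i++)
        pts[i] = (i == 0) ? container_pts : TOY_NOPTS;
}

/* Assumed variant: extrapolate from a known, constant frame duration. */
static void toy_stamp_extrapolated(double *pts, size_t n, double container_pts,
                                   int samples_per_frame, int sample_rate)
{
    size_t i;
    assert(sample_rate > 0);
    for (i = 0; i < n; i++)
        pts[i] = container_pts + (double)i * samples_per_frame / sample_rate;
}
```

With three AC-3 frames at 48000 Hz in a container packet stamped 1.0,
the second variant spaces the frames 0.032 seconds apart, while the
first leaves frames two and three unset for the player to fill in.
Giving all three the same pts, as the current patch does, matches
neither.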

> > > The new version you just posted fixes the pos/pts field loss that
> > > happened due to creating new packets, but doing special-case copies of
> > > those fields is still ugly.
> > 
> > There is a clone_packet function or something like that already, but it
> > seems it can't use a different size. Maybe there is a function that
> > does the right thing, if not it should probably be added.
> > The patch has not really reached the "make the code look nice" stage,
> > though I'd of course appreciate if someone else does that part.
> 
> Doing the parsing in ad_ffmpeg would avoid the need to attempt
> implementing a general-case one-to-many packet duplication function.

Well, AVParser is supposed to be that one-to-many thing already, and
deficiencies in that regard should probably be fixed in FFmpeg.
Some issues I saw with the ad_ffmpeg approach actually don't exist,
though it has 3 disadvantages:
1) it does not work with video or subtitles (mine only doesn't because
it has to differentiate between sh_audio_t/sh_video_t)
2) it does not work with mencoder
3) you have to buffer the last packet in case it was so large you
can't decode all of it

> Your current patch also creates bogus 0-size packets when the parser
> reads the end of a container packet without outputting anything.

Well, I guess it's obvious I am still quite clueless about the AVParser
API.
Anyway, I updated the patch in a way that hopefully makes it easier to
move the parsing code around.
-------------- next part --------------
Index: libmpdemux/demux_ts.c
===================================================================
--- libmpdemux/demux_ts.c	(revision 29551)
+++ libmpdemux/demux_ts.c	(working copy)
@@ -305,6 +305,7 @@
 		if(sh)
 		{
 			const char *lang = pid_lang_from_pmt(priv, es->pid);
+			sh->needs_parsing = 1;
 			sh->format = IS_AUDIO(es->type) ? es->type : es->subtype;
 			sh->ds = demuxer->audio;
 
Index: libmpdemux/demux_mpg.c
===================================================================
--- libmpdemux/demux_mpg.c	(revision 29551)
+++ libmpdemux/demux_mpg.c	(working copy)
@@ -270,6 +270,7 @@
     sh_audio_t* sh_a;
     new_sh_audio(demux,aid);
     sh_a = (sh_audio_t*)demux->a_streams[aid];
+    sh_a->needs_parsing = 1;
     switch(aid & 0xE0){  // 1110 0000 b  (high 3 bit: type  low 5: id)
       case 0x00: sh_a->format=0x50;break; // mpeg
       case 0xA0: sh_a->format=0x10001;break;  // dvd pcm
Index: libmpdemux/stheader.h
===================================================================
--- libmpdemux/stheader.h	(revision 29551)
+++ libmpdemux/stheader.h	(working copy)
@@ -32,6 +32,10 @@
   unsigned int format;
   int initialized;
   float stream_delay; // number of seconds stream should be delayed (according to dwStart or similar)
+  // things needed for parsing
+  int needs_parsing;
+  struct AVCodecContext *avctx;
+  struct AVCodecParserContext *parser;
   // output format:
   int sample_format;
   int samplerate;
Index: libmpdemux/demuxer.c
===================================================================
--- libmpdemux/demuxer.c	(revision 29551)
+++ libmpdemux/demuxer.c	(working copy)
@@ -330,6 +330,10 @@
     free(sh->wf);
     free(sh->codecdata);
     free(sh->lang);
+#ifdef CONFIG_LIBAVCODEC
+    av_parser_close(sh->parser);
+    av_freep(&sh->avctx);
+#endif
     free(sh);
 }
 
@@ -409,7 +413,7 @@
 }
 
 
-void ds_add_packet(demux_stream_t *ds, demux_packet_t *dp)
+static void ds_add_packet_internal(demux_stream_t *ds, demux_packet_t *dp)
 {
     // append packet to DS stream:
     ++ds->packs;
@@ -429,6 +433,105 @@
            ds->demuxer->video->packs);
 }
 
+#ifdef CONFIG_LIBAVCODEC
+static void allocate_parser(AVCodecContext **avctx, AVCodecParserContext **parser, unsigned format)
+{
+    enum CodecID codec_id = CODEC_ID_NONE;
+    extern int avcodec_initialized;
+    if (!avcodec_initialized) {
+        avcodec_init();
+        avcodec_register_all();
+        avcodec_initialized = 1;
+    }
+    switch (format) {
+    case 0x2000:
+    case 0x332D6361:
+    case 0x332D4341:
+    case MKTAG('d', 'n', 'e', 't'):
+    case MKTAG('s', 'a', 'c', '3'):
+        codec_id = CODEC_ID_AC3;
+        break;
+    case MKTAG('E', 'A', 'C', '3'):
+        codec_id = CODEC_ID_EAC3;
+        break;
+    case 0x2001:
+    case 0x86:
+        codec_id = CODEC_ID_DTS;
+        break;
+    case 0x55:
+    case 0x5500736d:
+    case MKTAG('.', 'm', 'p', '3'):
+    case MKTAG('M', 'P', 'E', ' '):
+    case MKTAG('L', 'A', 'M', 'E'):
+        codec_id = CODEC_ID_MP3;
+        break;
+    case 0x50:
+    case MKTAG('.', 'm', 'p', '2'):
+    case MKTAG('.', 'm', 'p', '1'):
+        codec_id = CODEC_ID_MP2;
+        break;
+    }
+    if (codec_id != CODEC_ID_NONE) {
+        *avctx = avcodec_alloc_context();
+        if (!*avctx)
+            return;
+        *parser = av_parser_init(codec_id);
+        if (!*parser)
+            av_freep(avctx);
+    }
+}
+
+static void get_parser(demux_stream_t *ds, AVCodecContext **avctx, AVCodecParserContext **parser)
+{
+    sh_audio_t *sh_a = ds->sh;
+    *avctx  = NULL;
+    *parser = NULL;
+
+    if (ds != ds->demuxer->audio)
+        return; // we only support audio for now
+    if (!sh_a->needs_parsing)
+        return;
+
+    *avctx  = sh_a->avctx;
+    *parser = sh_a->parser;
+    if (*parser)
+        return;
+
+    allocate_parser(avctx, parser, sh_a->format);
+    sh_a->avctx  = *avctx;
+    sh_a->parser = *parser;
+}
+#endif
+
+void ds_add_packet(demux_stream_t *ds, demux_packet_t *dp)
+{
+#ifdef CONFIG_LIBAVCODEC
+    AVCodecParserContext *parser;
+    AVCodecContext *avctx;
+    get_parser(ds, &avctx, &parser);
+    if (parser) {
+        int len = dp->len;
+        int pos = 0;
+        while (len > 0) {
+            uint8_t *parsed_start = dp->buffer + pos;
+            int parsed_len = len;
+            int consumed = av_parser_parse2(parser, avctx, &parsed_start, &parsed_len,
+                                            dp->buffer + pos, len, dp->pts, dp->pts, dp->pos);
+            pos += consumed;
+            len -= consumed;
+            if (parsed_len) {
+                demux_packet_t *dp2 = new_demux_packet(parsed_len);
+                dp2->pos = dp->pos;
+                dp2->pts = dp->pts; // should be parser->pts but that works badly
+                memcpy(dp2->buffer, parsed_start, parsed_len);
+                ds_add_packet_internal(ds, dp2);
+            }
+        }
+    } else
+#endif
+    ds_add_packet_internal(ds, dp);
+}
+
 void ds_read_packet(demux_stream_t *ds, stream_t *stream, int len,
                     double pts, off_t pos, int flags)
 {
@@ -527,6 +630,23 @@
             break;
         }
         if (!demux_fill_buffer(demux, ds)) {
+#ifdef CONFIG_LIBAVCODEC
+            AVCodecParserContext *parser;
+            AVCodecContext *avctx;
+            get_parser(ds, &avctx, &parser);
+            if (parser) {
+                uint8_t *parsed_start = NULL;
+                int parsed_len = 0;
+                av_parser_parse2(parser, avctx, &parsed_start, &parsed_len, NULL, 0, AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
+                if (parsed_len) {
+                    demux_packet_t *dp2 = new_demux_packet(parsed_len);
+                    dp2->pts = parser->pts;
+                    memcpy(dp2->buffer, parsed_start, parsed_len);
+                    ds_add_packet_internal(ds, dp2);
+                    continue;
+                }
+            }
+#endif
             mp_dbg(MSGT_DEMUXER, MSGL_DBG2,
                    "ds_fill_buffer()->demux_fill_buffer() failed\n");
             break; // EOF
Index: libmpdemux/demux_avi.c
===================================================================
--- libmpdemux/demux_avi.c	(revision 29551)
+++ libmpdemux/demux_avi.c	(working copy)
@@ -60,6 +60,7 @@
         sh_audio_t* sh;
 	avi_priv_t *priv=demux->priv;
         sh=demux->audio->sh=demux->a_streams[stream_id];
+        sh->needs_parsing = 1;
         mp_msg(MSGT_DEMUX,MSGL_V,"Auto-selected AVI audio ID = %d\n",demux->audio->id);
 	if(sh->wf){
 	  priv->audio_block_size=sh->wf->nBlockAlign;