[FFmpeg-devel] [PATCH 0/5] Fix mp3 gapless support (second try)

wm4 nfxjfg at googlemail.com
Wed Apr 15 15:13:59 CEST 2015


On Wed, 15 Apr 2015 13:47:48 +0200
Michael Niedermayer <michaelni at gmx.at> wrote:

> On Wed, Apr 15, 2015 at 01:32:11PM +0200, wm4 wrote:
> > On Wed, 15 Apr 2015 13:16:20 +0200
> > Michael Niedermayer <michaelni at gmx.at> wrote:
> > 
> > > On Wed, Apr 15, 2015 at 01:12:27PM +0200, wm4 wrote:
> > > > On Wed, 15 Apr 2015 12:43:36 +0200
> > > > Michael Niedermayer <michaelni at gmx.at> wrote:
> > > > 
> > > > > On Wed, Apr 15, 2015 at 11:08:02AM +0200, wm4 wrote:
> > > > > [...]
> > > > > > the start of the file. (Seeking to anywhere else likely won't work,
> > > > > > because libavformat tries to use the imperfect xing toc, instead of
> > > > > > scanning the frames.)
> > > > > 
> > > > > btw, you can disable the imperfect xing toc for seeking with
> > > > > "-usetoc 0"
> > > > 
> > > > But that breaks VBR even more, doesn't it?
> > > 
> > > libavformat should build an index by linearly scanning the file
> > > up to the point where one seeks to, so it should work with vbr
> > 
> > And how could this be achieved?
> 
> it should just work and not require anything from the user;
> if I just comment the cbr and toc code out I can still seek, so the
> fallback seems to be working

There's also the thing that this would be incompatible with the ffmpeg
philosophy of "streaming". So what happens if the user seeks in a 1GB
VBR mp3? (I seriously had such a case. It caused problems with the xing
toc, since with 100 entries in a 1GB file it can be off by a _lot_.)
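
To put a rough number on that, here is a back-of-the-envelope sketch in
C. It assumes the usual Xing layout of 100 TOC entries, each a single
byte that is read as a fraction of the file size in 1/256 steps. The
function names are made up for illustration and are not libavformat API.

```c
#include <assert.h>
#include <stdint.h>

#define TOC_ENTRIES 100   /* assumed: the Xing TOC has 100 entries */

/* Byte offset a TOC entry points at. toc_value is 0..255 and is read
 * as a fraction of the file size in 1/256 steps. */
static int64_t toc_entry_to_pos(uint8_t toc_value, int64_t filesize)
{
    assert(filesize >= 0);
    return toc_value * filesize / 256;
}

/* Smallest nonzero distance between two byte offsets the TOC can express. */
static int64_t toc_pos_granularity(int64_t filesize)
{
    return filesize / 256;
}

/* Playback time covered by one TOC entry (duration in milliseconds). */
static int64_t toc_time_step_ms(int64_t duration_ms)
{
    return duration_ms / TOC_ENTRIES;
}
```

For a 1 GiB file, toc_pos_granularity() gives 4 MiB, and for a one-hour
file each entry covers 36 seconds, so a TOC-based seek can only land on
a fairly coarse grid.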

Anyway, the main problem is how this would be implemented. The mp3 data
first must go through the parser, which utils.c handles. The parser
essentially tells us the frame duration, from which the PTS of a packet
can be derived. It also tells us at which file position the packet
starts.
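
As a minimal sketch of that bookkeeping (not the actual utils.c code):
given the size and duration the parser reports for each frame, position
and PTS are just running sums. FrameInfo and derive_frame_info() are
hypothetical names, and the PTS is counted in samples for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one parsed frame; not a libavformat type. */
typedef struct FrameInfo {
    int64_t pos;  /* byte offset of the frame in the file */
    int64_t pts;  /* start time, in samples since the first frame */
} FrameInfo;

/*
 * Walk the frames in order. sizes[i] is the byte length of frame i and
 * samples[i] its duration in samples, i.e. what a parser would report.
 * The position sum starts at first_pos, the PTS sum at 0.
 */
static void derive_frame_info(const int *sizes, const int *samples, size_t n,
                              int64_t first_pos, FrameInfo *out)
{
    int64_t pos = first_pos;
    int64_t pts = 0;

    assert(sizes && samples && out);
    for (size_t i = 0; i < n; i++) {
        out[i].pos = pos;
        out[i].pts = pts;
        pos += sizes[i];    /* next frame starts right after this one */
        pts += samples[i];  /* and begins when this one ends */
    }
}
```

With VBR the frame sizes differ, so the byte offset of a given PTS is
only known after every earlier frame has been looked at.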

So if a seek happens, then utils.c (?) must read packets until the
target position is reached, and add them to the index. And then we
must be sure that the seek function in the format (i.e. mp3_seek())
actually uses this function. AND we must be sure that it doesn't use
the broken xing toc seek entries. This sounds like a pretty messy extra
interaction between utils.c and mp3dec.c.
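
Roughly what I have in mind, as a self-contained sketch rather than real
libavformat code: the index is extended by scanning forward only as far
as the seek target, and later seeks reuse what was already scanned.
FakeStream, ScanIndex and scan_seek() are hypothetical, and the input is
faked as arrays of frame sizes and durations instead of going through
the parser.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INDEX 1024  /* arbitrary cap for this sketch */

/* Hypothetical stand-in for the demuxer input: frame sizes and durations. */
typedef struct FakeStream {
    const int *sizes;     /* byte length of each frame */
    const int *samples;   /* duration of each frame, in samples */
    int        nb_frames;
} FakeStream;

/* Hypothetical index: entry i describes frame i. Zero-initialize it. */
typedef struct ScanIndex {
    int64_t pos[MAX_INDEX];
    int64_t pts[MAX_INDEX];
    int     nb;           /* number of frames indexed so far */
    int64_t next_pos;     /* byte offset of the first unindexed frame */
    int64_t next_pts;     /* start time of the first unindexed frame */
} ScanIndex;

/*
 * Return the byte offset of the frame containing target_pts, indexing
 * further frames only as far as needed. Returns -1 if the target lies
 * past the part of the stream that could be scanned.
 */
static int64_t scan_seek(const FakeStream *s, ScanIndex *idx, int64_t target_pts)
{
    assert(s && idx && target_pts >= 0);

    /* Extend the index until it covers the target or input runs out. */
    while (idx->next_pts <= target_pts &&
           idx->nb < s->nb_frames && idx->nb < MAX_INDEX) {
        idx->pos[idx->nb] = idx->next_pos;
        idx->pts[idx->nb] = idx->next_pts;
        idx->next_pos += s->sizes[idx->nb];
        idx->next_pts += s->samples[idx->nb];
        idx->nb++;
    }

    if (target_pts >= idx->next_pts)
        return -1;

    /* Last indexed frame that starts at or before the target. */
    for (int i = idx->nb - 1; i >= 0; i--)
        if (idx->pts[i] <= target_pts)
            return idx->pos[i];
    return -1;
}
```

Note that the first seek far into the file still has to read everything
before the target, which is exactly the streaming concern from above.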

This would possibly be useful for other formats, like other raw
audio formats (TTA?), OGG, or unseekable formats like mpeg-ts. But then
it becomes even more complicated. Also, are there codecs involved in
this whose parser doesn't simply determine at which position packets
are split, but also rewrites them by adding or dropping bytes?

(Maybe I'm overthinking it.)

> > 
> > I'm also not very fond of utils.c messing with what the demuxer does.
> > It really made developing and debugging this harder. It should be the
> > other way around, with the demuxer directly being called by the user,
> > and the demuxer invoking generic helpers if it needs to.
> 
> I understand. I am not sure, though, whether doing it the other way
> around wouldn't lead to other annoyances: if the fallback case
> needs to be changed, then all demuxers to which it applies would need
> to be changed
> 
> The part that IMO we are really missing is clear documentation about
> the seeking function interactions

These interactions are just terrible. I seriously can't tell what
happens if libavformat is told to seek. There are too many things that
could happen which depend on too many flags and fallback cases.

Documenting these interactions will probably not help much - what is
needed is reducing this complexity. Yes, it's probably too late for
this; my point is just that we should try not to make it even more
complex.

