[FFmpeg-devel] [PATCH] fix parsing of broken mp3 streams

Michael Niedermayer michaelni
Wed Apr 22 15:57:03 CEST 2009


On Wed, Apr 22, 2009 at 10:50:10AM +0200, Zdenek Kabelac wrote:
> 2009/4/22 Michael Niedermayer <michaelni at gmx.at>:
> > On Tue, Apr 21, 2009 at 02:31:59PM +0200, Zdenek Kabelac wrote:
> >> 2009/4/21 Michael Niedermayer <michaelni at gmx.at>:
> >> > On Tue, Apr 21, 2009 at 11:14:16AM +0200, Zdenek Kabelac wrote:
> >> >> 2009/4/21 Michael Niedermayer <michaelni at gmx.at>:
> >> >> > On Tue, Apr 21, 2009 at 01:01:04AM +0200, Zdenek Kabelac wrote:
> >> >> >> 2009/4/20 Michael Niedermayer <michaelni at gmx.at>:
> >> >> >> > On Mon, Apr 20, 2009 at 09:37:25PM +0200, Zdenek Kabelac wrote:
> >> >> >> >> 2009/4/19 Michael Niedermayer <michaelni at gmx.at>:
> >> >> >> >> > On Sun, Apr 19, 2009 at 11:18:06PM +0200, Zdenek Kabelac wrote:
> >> >> >> >> >> Hi
> >> >> >> >> >>
> >> >> >> >> >> Here is a small patch that fixes of running out-of-buffer in parsing
> >> >> >> >> >> broken mp3 data stream.
> >> >> >> >> >> This solution is rather a hotfix - better solution would be to check
> >> >> >> >> >> at least one or two next mp3
> >> >> >> >> >> frames in sequence whether they are part of the same audio stream or
> >> >> >> >> >> some random junk
> >> >> >> >> >> which has 0xfffx header inside. With this patch ugly noise could be
> >> >> >> >> >> sometimes noticed.
> >> >> >> >> >>
> >> >> >> >> >> Also questionable is whether it should return -1 if no header is found
> >> >> >> >> >> or rather return skipped
> >> >> >> >> >> bytes and out_size = 0 - as then usually such packet is rescaned
> >> >> >> >> >> multiple times with
> >> >> >> >> >> one-byte step forward...
> >> >> >> >> >>
> >> >> >> >> >> Zdenek
> >> >> >> >> >>
> >> >> >> >> >> - Fix buffer overrun
> >> >> >> >> >> - Properly return parsed bytes together with skipped bytes
> >> >> >> >> >
> >> >> >> >> > please provide a sample so we can confirm the bugfix, the patch
> >> >> >> >> > looks mostly correct though
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> I've upload just one mp3 dumped stream upload.ffmpeg.org as
> >> >> >> >> junk_at_mp3stream ?directory - together with short text and two patch
> >> >> >> >
> >> >> >> >> - I'm attaching patch for api-example.c ?to easily compare results.
> >> >> >> >
> >> >> >> > i dont care what a modified tool does
> >> >> >> > is there a problem that is reproduceable with ffmpeg or ffplay that
> >> >> >> > your patch fixes?
> >> >> >>
> >> >> >> Patch is fixing mp3 decoder to skip only broken junk inside passed
> >> >> >> data ?while decoding as much mp3 frames as possible and avoid buffer
> >> >> >> over reading - don't ask me which tools are muxing avi streams with
> >> >> >> junk in packets - obviously it some kind of re-synchronization from
> >> >> >> splinting huge avi streams into small chunks....
> >> >> >>
> >> >> >> You could check for your self is to compare the result of extracted
> >> >> >> wav size via api-example and then do
> >> >> >> the same with ffmpeg -i junk.mp3 ?o.wav - you might observe small
> >> >> >> difference 4027436 != 4018220
> >> >> >> To do my homework and complete the list: mplayer -ao pcm:file=wav
> >> >> >> junk.mp3 - creates 4022830 - but IMHO it decodes some broken packets
> >> >> >> at the begining)
> >> >> >>
> >> >> >> (btw the patch for api-example should be probably commited into svn as well...)
> >> >> >> Usually such badly muxed sample streams are way to small to notice
> >> >> >> significant desynchronization.
> >> >> >
> >> >> > your original patch looked fine but after that you just talk nonsense
> >> >> > apiexample is a example for codecs, not containers, mp3 must be passed
> >> >> > through a demuxer and parser.
> >> >>
> >> >> I knew it would be hard - anyway I'll try once again - please check my
> >> >> original patch
> >> >> and see the mpegaudiodec.c code then please answer me following question:
> >> >>
> >> >> - What will stop parser from checking given buffer for mp3 header tag
> >> >> after the buffer size
> >> >> ?i.e. pass there zero memory area ?- I think decoder shouldn't run
> >> >> behind the given buffer
> >> >> even in the case it contains obviously wrong data - i.e. non-mp3 in this case.
> >> >> (user would have to put false mp3 header after the passed buffer to
> >> >> stop the parser)
> >> >>
> >> >> - If the mp3 packet is found within some offset from the beginning why
> >> >> it should return
> >> >> the size of parsed packed without the skipped bytes from the start of buffer.
> >> >> (so next parsing will again start in the middle of previous mp3 packet)
> >> >>
> >> >> - Explain how the libavformat/mp3.c:mp3_read_packet() solve the problem?
> >> >> (speaking of MP3_PACKET_SIZE - theoretical mythical max size of mp3
> >> >> chunk is 1440)
> >> >
> >> > Iam not disputing that the original patch possibly fixes a issue, i
> >> > am asking if you have a test case so we can test it.
> >> >
> >> > either
> >> > A. the patch has no effect at all on ffmpeg & ffplay
> >>
> >> I think I've already shown that we could get a different amount of WAV
> >> samples from particular mp3 audio stream - we might have a discussion
> >> which number is correct - but IMHO ffmpeg tool ?should always try to
> >> get as much as possible original samples from data stream - but I
> >> could be alone...
> >
> 
> At the beginning I want to state that I always admire your work - and
> I'm really sorry it takes me so much time to explain the problem here,
> but as I think I'm right I'd prefer to not give up until we will
> properly understand each other and of course if I'm wrong, I want to
> understand why...
> 
> > so does the stream you posted decode differently with unmodified ffmpeg
> > vs. with the original patch?
> > IIRC you only spoke of what the hacked apiexample does
> 
> I think I've expressed few times already that ffplay skips full audio
> chunk (ffplay.c line ~1593) when it sees broken chunk i.e. mp3 chunk
> is crossing frame boundary. In my api-example.c change is modification
> which shows how I'm seeing the proper way of decoding of the byte mp3
> stream stored in .avi chunk - when it finds error it rolls forward in
> this stream and it could find a mp3 frame  which is currently lost by
> plain full chunk skip. That is when the 'small' difference comes from.
> 
> api-example.c should be showing the API usage to the user of FFmpeg
> library - and if avcodec_decode_audio3() is supposed to return number
> of consumed bytes from buffer - it should work this way in the
> api-example code - Somehow I do not fully understand what makes you so
> crazy about this as IMHO my small patch just do it in the way the API
> is supposed to work ?

> 
> The current logic mp3 decoder is actually doing also parsers' work to
> match the beginning of mp3 chunk - and there is no mp3 parser for avi
> chunks in use I think. (as the container is  .avi stream - not .mp3
> stream)

avidec.c:
 case CODEC_TYPE_AUDIO:
[...]
                    /* Force parsing as several audio frames can be in
                     * one packet and timestamps refer to packet start. */
                    st->need_parsing = AVSTREAM_PARSE_TIMESTAMPS;


> 
> So what I'm saying is - that mp3 is muxed into .avi as byte stream -
> thus we should not easily give up full chunk if error is found - i.e.
> there could be a person who will put all mp3 chunks into one .avi
> frame - which is possible and perfectly valid

no, its disallowed by the avi spec from MS but 99% of the
muxers ignore it and then start talking about weird bugs in microsofts
implementation of avi in relation to vbr mp3 (for CBR the specs allow
choping up frames randomly into chunks)


> and actually saves some
> space in the .avi tables, but for some players the stream is hardly
> seekable then - and the first error detected in mp3 stream will cause
> lost of full audio track if frame_size gets -1.

the mp3 decoder is supposed to be feeded with individual mp3 frames
It will even tell you that by:
    if(s->frame_size<=0 || s->frame_size > buf_size){
        av_log(avctx, AV_LOG_ERROR, "incomplete frame\n");
        return -1;
    }else if(s->frame_size < buf_size){
        av_log(avctx, AV_LOG_ERROR, "incorrect frame size\n");
        buf_size= s->frame_size;
    }

now it might be that its able to handle the case of several frames in one
piece but thats not to say this is how its supposed to be used


> 
> As I do not want to extended my theories further - I wait until we
> agree here on some conclusion.
> 
> There is number of solutions for this problem - I just wanted to go in
> the way a smallest changes and I also said the fix in ffplay & ffmpeg
> is not one-line change - but it's not so complicated either.
> 
> And one note to VBR I've mentioned previously - you probably
> misunderstand that I mean some problems with reading Xing headers from
> music audio mp3 files - I've only meant playing VBR audio mixed in AVI
> streams (which I probably didn't emphasize enough) - which is
> currently also a bit buggy - check i.e. 13fantavsync.avi in mplayer
> samples site. There are other issues but let's fix them one-by-one.
> 
> >
> >
> >>
> >> > B. there is a file for which behavior changes
> >>
> >> The fact that it's not running out of memory bounds when the mp3
> >> header could not be found in the given buffer is probably because
> >> usually lots of other mp3 frames are lying nearby in memory so it will
> >> effective stop - and there are not too many heavily broken stream.
> >
> > maybe, maybe not
> >
> >
> >>
> >> So to answer your questions
> >> A - currently my patch does not influence those tools as they discard
> >> whole data chunk if the error is found.
> >> B - artificial file could be probably created which will show problem
> >> from scanning data past the buffer - and generate coredump - though
> >> it's not probably so simple to ensure memory layout that no mp3 header
> >> will not be found past the allocated header.
> >
> > so you claim 2 mutually exlusive theorems are true?
> > sorry
> > either ffmpeg does or does not change with some input
> 
> 
> FFmpeg with my proposed patch doesn't change in the way, it produces
> same amount of audio samples.
> The only change is - it will not crash due to scanning past the input buffer
> 
> I think I cannot say it more clearly ?

so you have no sample that crashed ffmpeg nor that makes ffmpeg produce
diferent output


[...]
> >
> > If OTOH the patch author talks confused crap and submits other patches that
> > are broken then id tend to want to take a deeper look at the patches that
> > look correct on first sight from him. And that takes more time or a little
> > help from the patch author (that is a reproduceable testcase)
> 
> I really hope we will at some point understand each others view on this issue.

yes, iam sure we will

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I have often repented speaking, but never of holding my tongue.
-- Xenocrates
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090422/d81685ef/attachment.pgp>



More information about the ffmpeg-devel mailing list