[FFmpeg-devel] [PATCH v3 3/3] aadec: fix seeking in mp3 content

Karsten Otto ottoka at posteo.de
Tue Jul 3 23:25:36 EEST 2018


TL;DR: I will drop patch 3/3, may rather spend some time investigating why
"ff ee 47 9d“ passes the mp3 header parser. Also, the aa file "index" cannot
be used for frame or chapter detection, unfortunately.

More details inline below.
 
> Am 03.07.2018 um 02:32 schrieb Michael Niedermayer <michael at niedermayer.cc>:
> 
> Signierter PGP-Teil
> On Mon, Jul 02, 2018 at 07:21:43PM +0200, Karsten Otto wrote:
>> 
>>> Am 02.07.2018 um 10:59 schrieb Michael Niedermayer <michael at niedermayer.cc>:
>>> 
>>> Signierter PGP-Teil
>>> On Thu, Jun 21, 2018 at 06:58:26PM +0200, Karsten Otto wrote:
>>>> MP3 frames may not be aligned to aa chunk boundaries. After seeking,
>>>> scan for the next valid frame header. Then truncate the packet, and
>>>> also adjust timestamp information accordingly.
>>>> ---
>>>> libavformat/aadec.c | 33 ++++++++++++++++++++++++++++-----
>>>> 1 file changed, 28 insertions(+), 5 deletions(-)
>>> 
>>> Please see AVSTREAM_PARSE_TIMESTAMPS
>>> 
>>> This codec specific code in demuxers should not be needed
>>> 
>> I tried that before, and you are right that it takes care of timestamp adjustments.
>> 
>> However, after a seek the parsed packet still contains a partial frame before the
>> next full one. I had expected libavformat/mpegaudio_parser.c to detect this
>> situation and discard the fragment, but unfortunately it does not. Instead it passes
>> it unchanged to the codec, which plays it as a pop or even a very ugly BLEEEP -
>> painful while wearing headphones!
> 
> I think you mis-diagnose this at least somewhat
> your code searches for a specific mp3 header, the parser and decoder would
> accept a wider range of mp3 variants.
> But both can choose points that are not mp3 frame starts. (if that is the
> problem you are seeing, iam not completely sure it is)
> 
It took a closer look at what happens when I hear a BLEEP: The packet begins
with a partial frame, starting with the byte sequence "ff ee 47 9d“. Unfortunately,
the mp3 parser indeed accepts this as a valid mp3 header, causing the noise.
By looking for the more restricted header, my patch finds the real next frame at
offset 78.

BTW: Should this sequence actually pass? AFAIK 01 is not a valid MPEG audio
version ID?

> Also is the more restricted header you search for always used or could
> this be failing with some files ?
> 
Good question. So far, all mp3 aa files I tested with matched the format (MPEG 2
Layer III at 32 kbps and 22kHz). I doubt there are other variants, but can’t be sure.

> Either way, looking at the demuxer a bit deeper, theres a TOC list in the
> main header which points to chunks. The one file i found has 12 such chunks
> the first represents the whole file i suspect, the next largest the audio
> data, another one the metadata.
> I guess the remaining 2 large ones could be a cover image and an index.
Correct, seems like all aa files have the TOC, but its entries can be in a different
order in each file. I guess thats why the original aadec.c implementation just
looks for the largest chunk to play.

> I didnt really look at it, but theres a table in there with pairs of 32bit
> values. the first in the file i have goes from 0 to 3 the second starts
> multiple times from 0 and seems monotonly increasing and staying within
> the filesize.
> The sample i have does not store mp3 but it looks like this is a index
> maybe offsets for packets in each of the 3 chapters.
> 
> Please look at the data, if it can be used. It would be much better than
> scaning the file linearly and searching for some byte sequence to find
> packet starts.
> 
Short answer: Sorry, it is not possible to derive frame offsets nor chapter
offsets from the index.

Long answer:
All offsets in the index are the same, and matching the "codec_second_size"
= crypto chunk size, roughly one second of audio:
- 1045 for format 2 (SIPR 8kbps)
- 2000 for format 3 (SIPR 16kbps)
- 3982 for format 4 (MP3 32kbps)
This is different from the respective frame size, which is 19, 20, and 104/105
respectively. For MP3 the frame size and chunk size do not even align
(I assume this was a late retrofit to the format), causing the problems mentioned
above. So, not usable to derive codec frame offsets.

Furthermore, each index starts at offset 0, so they are relative to the start of
their chapter. But they do not indicate the chunk size, though the last chunk
of a chapter is usually shorter then the full size. So, not usable to derive
chapter offsets.

I assume this index could be used to shuffle the file contents as a further
DRM/obfuscation measure. But I have never seen this so far.

> 
> thanks
> 
Cheers, Karsten

> [...]
> -- 
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> Old school: Use the lowest level language in which you can solve the problem
>            conveniently.
> New school: Use the highest level language in which the latest supercomputer
>            can solve the problem without the user falling asleep waiting.
> 
> 



More information about the ffmpeg-devel mailing list