[FFmpeg-user] ffprobe audio duration mismatch

Zak ffmpeg-user-email at m.allo.ws
Wed Mar 21 02:56:54 EET 2018


Hello Ned,

I know a fair amount about certain audio codecs and audio file container 
formats, especially MP3 (the codec and the container and ID3 metadata) 
and AAC (codec) and M4A/MP4 (the container and ill-defined metadata 
standard).

As I have discussed in at least one other email to this list recently 
[1], FFmpeg probably reads metadata from the container files and does 
not calculate it by actually analyzing the compressed audio. There are 
also tons of reasons for this metadata to be incorrect in the container 
file. The AAC/M4A ecosystem is much worse about standards compliance 
than the MP3/ID3 ecosystem, because there is no real standard for 
AAC/M4A metadata and AAC compression is far more flexible and 
configurable than MP3 compression. The de facto standard is basically 
"whatever works with iTunes": iTunes runs on several operating systems, 
and Apple is a major source of AAC audio files, since AAC is the format 
the iTunes Store favors.

What you ideally want is a software library that can decode the audio 
data itself and deduce the duration. I can suggest Python libraries that 
do this for MP3 audio streams, but sadly I am less familiar with AAC 
audio streams. MediaInfo (a software package) might do it, since it 
supports a lot of formats, but I honestly don't know.
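
As a rough sketch of a cheaper approximation (not a full decode), one 
can ask ffprobe to read every audio packet and then multiply the packet 
count by the number of samples per packet. The function names below are 
hypothetical, and the 1024 samples-per-frame figure is an assumption 
that holds for plain AAC-LC only; HE-AAC (SBR) and encoder priming 
samples will shift the result.

```python
import json
import subprocess

# Assumption: plain AAC-LC, which carries 1024 PCM samples per frame.
AAC_LC_SAMPLES_PER_FRAME = 1024


def duration_from_packets(packet_count, sample_rate,
                          samples_per_frame=AAC_LC_SAMPLES_PER_FRAME):
    """Duration in seconds implied by the number of audio frames."""
    return packet_count * samples_per_frame / sample_rate


def count_audio_packets(path):
    """Hypothetical helper: ask ffprobe to read and count every packet
    of the first audio stream instead of trusting the header duration."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-count_packets",
        "-show_entries", "stream=nb_read_packets,sample_rate",
        "-of", "json",
        path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True,
                         check=True).stdout
    stream = json.loads(out)["streams"][0]
    # ffprobe's JSON writer emits these values as strings.
    return int(stream["nb_read_packets"]), int(stream["sample_rate"])
```

Usage would look like `duration_from_packets(*count_audio_packets(path))`, 
where `path` is your own file. Comparing that number with the duration 
reported in the container is one way to spot bad metadata.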

If knowing the duration in seconds is very important, in the context of 
music, be very careful just reading the metadata in the file. Sometimes 
the metadata was created by looking at the file size and dividing by the 
bitrate. Sometimes this was done by a batch metadata editor after an 
album cover was embedded in the file, so the "duration" gets an extra 4 
seconds because of the JPEG of the album cover, for example. In other 
cases, the target bitrate for a variable bitrate (VBR) AAC stream will 
be used in the division operation, which means that the deduced duration 
is off by however much the audio deviated from the target bitrate. If 
the file is pure silence, the duration will be way off, because silence 
compresses to very little data.
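
To make the arithmetic concrete, here is a small sketch of the "file 
size divided by bitrate" estimate. All of the numbers are made up for 
illustration (a 240-second track, a 128 kbit/s bitrate, a 64,000-byte 
cover image), and `estimated_duration` is a hypothetical name, not 
something from FFmpeg.

```python
def estimated_duration(file_size_bytes, bitrate_bps):
    """Naive duration guess: total bits in the file / bits per second."""
    return file_size_bytes * 8 / bitrate_bps


BITRATE = 128_000          # assumed nominal bitrate, bits per second
AUDIO_BYTES = 3_840_000    # 240 s of audio at exactly 128 kbit/s
COVER_BYTES = 64_000       # assumed size of an embedded JPEG cover

# Audio only: the estimate is correct.
clean = estimated_duration(AUDIO_BYTES, BITRATE)

# Same audio plus embedded cover art: the JPEG is counted as "audio".
with_cover = estimated_duration(AUDIO_BYTES + COVER_BYTES, BITRATE)

# VBR case: target was 128 kbit/s but the encoder averaged 96 kbit/s,
# so 240 s of audio only occupies 2,880,000 bytes.
vbr_bytes = 240 * 96_000 // 8
vbr_guess = estimated_duration(vbr_bytes, BITRATE)

print(clean, with_cover, vbr_guess)
```

With these assumed numbers the cover art adds 4 seconds to the estimate, 
and the VBR file is reported a full minute short of its real 240 
seconds.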

[1] Here is the email where I talk about AAC stream metadata, 
specifically bitrate:

https://ffmpeg.org/pipermail/ffmpeg-user/2018-March/039037.html

Subject line was:
Re: [FFmpeg-user] Why _COPYing_ a variable encoded audio (AAC) channel 
will produce a variably encoded one?

Date: Thu Mar 1 02:20:15 EET 2018.

Zak

