[FFmpeg-devel] update data_offset field in format context
Thu Nov 6 18:42:33 CET 2008
On Thu, Nov 06, 2008 at 05:50:46PM +0200, Yoav Steinberg wrote:
> Michael Niedermayer wrote:
> > On Wed, Nov 05, 2008 at 05:03:28PM +0200, Yoav Steinberg wrote:
> >> Hi,
> >> I've come across some instances where the data_offset field of
> >> AVFormatContext isn't updated after opening a file for input
> >> (av_open_input_file). From the comment in the header it seems that the
> >> data_offset field should represent the position in the input where the
> >> header ends and the data begins. In some cases the header parsing done
> >> during file input seems to run to the end of the input and isn't restored
> >> to the position where the data begins, yielding an invalid data_offset
> >> value equal to the file size (specifically, this reproduces when calling
> >> av_open_input_file on a mov file).
> >> I've added some code which attempts to provide a more accurate data_offset
> >> value for such files based on the index_entries table (if one is
> >> available). This seems to work for me. It'll be cool if this is added to
> >> the trunk or if someone can explain why not to add this.
> >> (My code is attached).
> > This is not a proper solution to the problem; it also adds an obscure and,
> > more importantly, completely undocumented behavior to index entries.
> > For a proper solution (aka anything that might be accepted into svn)
> > the first step is a full explanation of what is wrong; basically, if
> > it cannot be reproduced exactly, it's not a full explanation.
> > Second would be the question of whether it's easier to fix the affected
> > demuxers or to change the core to guess the offset. Either way, all
> > demuxers must be looked at: in the first case to find and fix them, in the
> > second to ensure the core change works with all.
> > [...]
> In my specific application (using libavformat) I'm interested in using
> the data_offset field to figure out how much of the file is used for
> data and how much is used for "headers". This is for some general file
> "rating" system which isn't relevant to our discussion. I found the
> data_offset field useful since it's documented as the "offset of the
> first packet". The problem was that some demuxers leave pb at the end of
> the input after calling read_header. Since I wasn't sure if changing
> this behavior in each rogue demuxer is a good idea I found another
> solution which should work (and actually does work for my tested cases)
> independently of whether the demuxer seeks back or not after read_header.
> Just as a note this solution was required for "mov" demuxer since its
> read_header reads the file to the end (if possible).
> Question is whether the data_offset is something I should theoretically
> be able to count on, or whether it's just a helper utility for any
> demuxer that wants some place to save the data offset (without adding a
> private field).
> Currently the:
>     if (pb && !ic->data_offset)
>         ic->data_offset = url_ftell(ic->pb);
> in the core attempts to use the current position if it wasn't set by the
> demuxer, indicating a "best guess" policy. I was attempting in the patch
> to improve the guessing by employing the index entries table when available.
Well, I didn't write these 2 lines IIRC, so I can't say for sure, but not
every piece of common code is "best guess" code.
One could very well see it the other way around: that it's code factored
out of the demuxers and only executed when it's exactly correct.
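
To make the two readings concrete, here is a minimal sketch of the fallback
being discussed. It is not the attached patch (which is not reproduced here)
and it does not use the real libavformat types: SketchIndexEntry and
sketch_guess_data_offset are hypothetical names, and taking the smallest
packet position in the index is only one plausible way to "guess" from
index entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a demuxer index entry; NOT the libavformat type. */
typedef struct SketchIndexEntry {
    int64_t pos;  /* byte offset of the packet in the file */
    int     size; /* packet size in bytes */
} SketchIndexEntry;

/*
 * Guess the data offset (assumed policy, for illustration only):
 *  1. keep a non-zero value the demuxer already set;
 *  2. otherwise use the smallest packet position found in the index;
 *  3. with no index, fall back to the current read position, which is
 *     what the two quoted core lines do.
 */
static int64_t sketch_guess_data_offset(int64_t demuxer_offset,
                                        const SketchIndexEntry *entries,
                                        size_t nb_entries,
                                        int64_t current_pos)
{
    if (demuxer_offset)
        return demuxer_offset;
    if (!entries || nb_entries == 0)
        return current_pos;

    int64_t min_pos = entries[0].pos;
    for (size_t i = 1; i < nb_entries; i++)
        if (entries[i].pos < min_pos)
            min_pos = entries[i].pos;
    return min_pos;
}
```

Note that step 3 is only exact when the demuxer stops reading precisely at
the first packet; step 2 silently changes what the index is used for, which
is the undocumented behavior objected to earlier in the thread.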
> I'd be willing to add a data_offset setting in the "mov" demuxer if lack
> of valid data_offset after reading the mov header is considered a bug.
> But I guess that if a valid data_offset is required only if the packet
> reading depends on it then having crap in the data_offset after reading
> the header isn't a bug. And in that case I can't complain...
> What do you think?
I think we should let Baptiste, who is the mov maintainer, comment, but
AFAICS data_offset does not have much meaning for mov. Headers can be at
least at the beginning or the end, and possibly even in the middle.
Also, a file with data chunks randomly shuffled, with the first packet at
the end and the last one at the beginning, should be valid ...
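
For the "how much of the file is data versus headers" use case, a single
offset cannot describe such a layout. As a rough sketch only, one
offset-independent alternative is to sum the packet sizes. The names below
(SketchPacket, sketch_payload_bytes, sketch_overhead_bytes) are hypothetical
and not libavformat API, and the sketch assumes packets do not overlap and
that every packet is listed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet record; NOT a libavformat type. */
typedef struct SketchPacket {
    int64_t pos;  /* byte offset in the file; unused by the sums below */
    int64_t size; /* payload size in bytes */
} SketchPacket;

/* Total payload bytes, independent of where the packets sit in the file. */
static int64_t sketch_payload_bytes(const SketchPacket *pkts, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += pkts[i].size;
    return total;
}

/* Bytes not covered by any packet: headers, index tables, padding. */
static int64_t sketch_overhead_bytes(int64_t file_size,
                                     const SketchPacket *pkts, size_t n)
{
    return file_size - sketch_payload_bytes(pkts, n);
}
```

With shuffled packets at offsets 900, 100 and 500 (sizes 100, 300 and 200)
in a 1000-byte file, this gives 600 payload bytes and 400 overhead bytes,
regardless of packet order.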
Michael
GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I count him braver who overcomes his desires than him who conquers his
enemies for the hardest victory is over self. -- Aristotle