[FFmpeg-devel] [PATCH 00/21] New Version

Steve Lhomme robux4 at ycbcr.xyz
Mon Apr 8 10:56:53 EEST 2019


> On April 7, 2019 at 8:56 PM Andreas Rheinhardt via ffmpeg-devel <ffmpeg-devel at ffmpeg.org> wrote:
> 
> 
> Hello,
> 
> thanks for taking the time to review this. Much appreciated.
> 
> Steve Lhomme:
> > Hi,
> > 
> > On 3/27/2019 12:18 PM, Andreas Rheinhardt wrote:
> >> This also changed the handling of unknown-sized elements: They are now
> >> ended whenever an element not known to be allowed in them is
> >> encountered. If we are already on level 1 and encounter an element not
> >> known to be allowed in an unknown-sized segment, this is treated as an
> >> indication that an error might have occured. I hope this is fine.
> > 
> > I haven't looked at the code yet, but an unknown element doesn't mean
> > it's an upper level element. I think it should be either skipped or
> > considered as bad data. If it's a known element but complitely
> > misplaced it should not be going upper in the hierarchy. Only when a
> > valid upper element it should go up in the hierarchy.
> > 
> 1. Actually the current approach is equivalent to considering unknown
> elements in an unknown-sized element as bad data: If a new ID isn't in
> the current syntax list, then we treat this as an indication that the
> current level ended and test with the next higher level and so on
> until we arrive at level 1. If it then isn't a valid level 1 element,
> then it is being treated as an error.
> 
> 2. But as you have said (in your last mail), this approach won't work
> with extensions like RAWCooked. So a check for whether an element is
> from a higher level needs to be implemented and unknown elements with
> finite size should be skipped.
> 
> 3. But there are practical complications:
> a) If an error occurred, then it is possible that we encounter garbage
> that looks like a very big unknown element. I'd image that trying to
> skip it might very well lead to problems, in particular in case of
> livestreaming. So there should be a sanity check for this, i.e. very
> big unknown elements should be considered as errors.

In libebml when we read an element we read the ID and length at once and verify that the length is valid for that ID (based on its type) and its parent (doesn't go over the boundary if there is one). If an element is unknown (known at a different level is considered unknown) with a valid size then we either keep it as such or consider it's an error (that's a flag when reading telling which behavior we want). That's basically like choosing between a strict respect of the specs or allow extensions/customization. By default the more relaxed approach should be used in a reader, so keeping the elements as unknown.

If the direct parent (Cluster) doesn't have a known size and you encounter some unknown IDs, you have no way to know if the data are correct or not. Given unknown length/live streaming already imposes restrictions on what elements are allowed, it would be OK to assume that anything outside of known elements is an error. But considering it's bad data and wait for known data would work as well. In any case you're not going to make use of the data (unlike libebml which may be used to track/report such cases) it doesn't really matter which way you decide to use.

> b) And it is also possible that we encounter quite a few possibly
> legal unknown elements (that are skipped) in a row. I think it's
> reasonable to view this as an indication that an error occurred, too.
> How long should these chains of unknown elements be for it to trigger
> this error? Three? Four? Five?
> c) In both cases, resyncing should begin at the last known good
> position, not at the current position.

You can't resync during live streaming which is what the unknown length feature is for. But then I don't know if the current demuxer code is ready for that. In VLC I had to use a stream filter disabling seeking to make sure we never try to seek during live streaming.

> Do you agree with me regarding 2. and 3.?

Yes, with my additional remarks.

> 4. Would it actually be legal for an extension like RAWCooked to
> implement new potentially unknown-sized elements? I haven't found
> anything on this topic that says it is illegal, so I presume it is
> legal. But it shouldn't be IMO if it wants to be an extension that can
> be played by normal players (that don't know these extra elements), as
> unknown-sized elements simply can't be skipped. I wouldn't know how to
> handle them other than resyncing to the next known level 1 element, so
> it would be nice not to have to worry about unknown master elements
> with unknown size.

That's probably something to discuss on the CELLAR mailing list. I don't have a strong opinion on this. As said above it shouldn't impact much playback since in both cases you'd drop the data and find the following proper data.

> >>
> >> Dale's sample "bear-320x240-live.webm" btw has cues at the end that use
> >> unknown-sized elements (wastefully coded on eight bytes) for
> >> CuePoints and
> >> CueTrackPositions which is spec-incompliant. They are not referenced by
> >> a SeekHead and so can't be used for seeking, but if they were, neither
> >> current FFmpeg nor FFmpeg with my patches applied would be able to use
> >> them. Are such files common (this is the first such file I ever saw)?
> >> If so, I could probably make it work.
> > 
> > If CuePoints are not referenced by the SeekHead at the front of the
> > file (or the indirect SeekHead) they are useless anyway.
> > 
> I know, but should I try to support unknown-sized elements other than
> segments and cluster at all?

It should not break reading the file, yes. Until the core Matroska specification is settled the main one is still the one on matroska.org (and the WebM website) and there are surely files that made use of this until now.

> >>
> >> I have cced Steve for this (I didn't the first time, because I
> >> thought that he (as a maintainer) would also be a subscriber to this
> >> list).
> > 
> > I am subscribed but not the maintainer. In fact I know little about
> > this code.
> > 
> I didn't want to say that you are maintainer of the Matroska demuxer;
> but you are listed as maintainer for certain hw video accelerations stuff.
> 
> >> Oh, and I did not check with Valgrind that the new lacing code doesn't
> >> read uninitialized data. I don't even know how to use Valgrind. I just
> >> read the code. If someone more knowledgeable than I could please test
> >> it...
> > 
> > You might also use LLVM with ASAN (address sanitizer) it's helpfull
> > for this (or so I have been told).
> 
> Windows 7 not supported.

Then it should work with the MingW toolchain than Martin has created:
https://github.com/mstorsjo/llvm-mingw/

I have a guide on how to use it to build VLC with it (including ffmpeg in the contribs) if you need any guidance on how to use it
https://github.com/robUx4/vlc-msys/tree/llvm-mingw64


> - Andreas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".


More information about the ffmpeg-devel mailing list