[MPlayer-dev-eng] ASS/SSA discussions

Fri Sep 26 03:42:38 CEST 2008

On Fri, 2008-09-26 at 01:17 +0200, Michael Niedermayer wrote:
> On Wed, Sep 24, 2008 at 02:03:38AM +0300, Uoti Urpala wrote:
> > On Tue, 2008-09-23 at 23:16 +0200, Michael Niedermayer wrote:

> > > besides #2, the ass demuxer in libavcodec will not parse, nor set the
> > > display_duration. It's the ass AVParser that might, it would be nasty
> > > code duplication if demuxers would do it ...
> > 
> > Do you intend to actually implement a .ass demuxer?
> 
> As you ask so nicely, I'll see what I can do. I am starting to think that
> it might take less time to just implement everything related to ASS than
> continue this discussion.

Maybe if you do a shoddy enough implementation. At your current level of
understanding that seems likely.

> > Would the demuxer set _anything_ at all in the packets then?
> > display_duration is at the same level as start time information. Would
> > you omit both or for some reason parse one but not the other?
> 
> I can just repeat that there will be no random code duplication in FFmpeg,
> and having avi, nut, asf, ... parse the duration would be code duplication.

If those will use the same format (in case they get a standard format at
all) they can use the same function to parse it. Talking about the "code
duplication" of calling the function is silly.

Why are you bringing up those demuxers when talking about .ass? .ass is
clearly a different format and you do have to parse the line to get any
information at all.

> Not to mention that I will try to minimize codec specific code in demuxers.

That's hardly an argument when talking about a .ass demuxer.

> > > It seems you missed my past comments ...
> > > What ffmpeg is heading toward is
> > > * the demuxers return subtitle packets like any other packet
> > > * the subtitle decoder decodes these packets to a common subtitle structure
> > >   (AVSubtitle) containing utf-8 text, timestamps/durations, positions,
> > >   effects, bitmaps, font references, ...
> > > * A common subtitle renderer renders these so they can be displayed or
> > >   a subtitle encoder encodes them to a possibly different format again.
> > 
> > What do you mean by "heading toward"? Is someone going to actually
> > implement this? Who? I've seen no indication of such work being done.
> 
> It will be implemented as subtitle decoders are implemented. You surely
> can see that the existing decoders already use AVSubtitle and AVSubtitle
> is a vector based container not a single bitmap.

I see no work toward anything applicable to SSA. 

> > > Now this is not so much different from video and audio
> > > the decoder converts a codec specific bitstream into a common and simple
> > > representation (a bitmap or a bunch of PCM samples).
> > > 
> > > Within this framework, subtitles are trivially editable, not only the
> > 
> > They won't be trivially editable at least if you want to store the
> > result in an existing format.
> > 
> > There is no "simple representation" for all SSA/ASS effects other than
> > naming the specific effect. Audio codecs can be decoded to PCM in some
> > sample format and most video codecs can be decoded to bitmaps, but
> > subtitles are more like vector graphics. There is no simple format that
> > could accurately represent every input.
> 
> I am not interested in what you would prefer cannot be done or did not exist.

What I stated were facts, not opinions or preferences. Do you claim that
some of those facts were false, or are you saying that you are not
interested in what the facts are? (Your recent behavior in this thread
does give that impression.)

> > > This is certainly not true, as it is not done currently by any (de)muxer 
> > > and doing it would add very significant and complex code to every
> > > demuxer. And yes I am speaking about the general case here, not just ass, if you
> > > mean just ass, then I honestly do not understand why it should be a special
> > > case.
> > 
> > The reasons why SSA/ASS subtitles are different from a standard video
> > codec have already been explained in the thread (a couple of times). But
> > I'll try once again:
> 
> I'll not try to repeat the same answer again though, you can look it up in the
> thread.

You have not given any answers that would show the issues do not apply.

1) The reason why SSA/ASS differ from usual video packets (two timed
events per packet).
You have not given any "answer" that would contradict this. Your only
attempts have been stupid excuses interpreting interlaced frame decoding
as "timed events".

2) Your proposed format is unsuitable for muxing because it uses
absolute timestamps.
Earlier in the thread you did seem to understand that this is a serious
flaw. However your only "answer" was to propose reinterpreting the
meaning of the fields so that duration would be stored by setting the
start and stop to some values duration apart, instead of using the
normal semantics of those fields. This would be incompatible with
anything using the normal semantics and so strictly inferior to storing
just a duration field (which would equally differ from lines in .ass
files, but would make sense as a way of storing duration). Then you said
the "start" field would always have to exactly match the container pts
to maintain some level of compatibility with .ass line semantics. This
in turn has obvious problems with duplication of data, requiring extra
work to rewrite packets after any changes (and you said you wanted to
avoid extra parsing?), consistency of player behavior when the values do
not match (handling this robustly would require more parsing), and it's
also completely inconsistent with the way video codecs are treated.

3) Your proposed format cannot be used to represent tracks from Matroska
without losing information.
You haven't given anything that could be called an answer. The only
related things you've said have been vague comments about how the
ReadOrder information wouldn't be that useful anyway, clearly without
much clue about whether or how people use it. Are you explicitly saying
that in your opinion ReadOrder information should always be treated as
completely worthless, should not be provided to programs that try to use
the libavformat Matroska demuxer, and should be destroyed when remuxing
Matroska files with FFmpeg? Or not?
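
To make points 1 to 3 concrete, here is a rough sketch in Python. It is
only an illustration, not code from FFmpeg, MPlayer or any Matroska
library: the Packet class and all function names are hypothetical, and
the payload layout (ReadOrder, Layer, then the remaining Dialogue fields
with Start and End removed) is my understanding of how Matroska stores
ASS events. It shows that one packet yields two timed events, that a
payload without absolute timestamps survives a time shift untouched, and
that ReadOrder has no place in a plain Dialogue line.

```python
from dataclasses import dataclass


def parse_ass_time(text: str) -> int:
    """Convert an ASS timestamp (H:MM:SS.cc) to centiseconds."""
    hours, minutes, seconds = text.split(":")
    whole, centis = seconds.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 100 + int(centis)


def format_ass_time(cs: int) -> str:
    """Convert centiseconds back to an ASS timestamp (H:MM:SS.cc)."""
    seconds, centis = divmod(cs, 100)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{centis:02d}"


@dataclass
class Packet:
    """Hypothetical container packet: timing lives outside the payload."""
    pts: int        # start time in centiseconds
    duration: int   # display duration in centiseconds
    payload: str    # event data without any timestamps


def dialogue_to_packet(line: str, read_order: int) -> Packet:
    """Split a 'Dialogue:' line into container timing and a timestamp-free payload."""
    # At most 9 splits, because the final Text field may itself contain commas.
    fields = line[len("Dialogue: "):].split(",", 9)
    layer, start, end, rest = fields[0], fields[1], fields[2], fields[3:]
    start_cs, end_cs = parse_ass_time(start), parse_ass_time(end)
    payload = ",".join([str(read_order), layer] + rest)
    return Packet(pts=start_cs, duration=end_cs - start_cs, payload=payload)


def packet_to_dialogue(packet: Packet) -> str:
    """Rebuild a 'Dialogue:' line; ReadOrder has no field there and is dropped."""
    _read_order, layer, rest = packet.payload.split(",", 2)
    start = format_ass_time(packet.pts)
    end = format_ass_time(packet.pts + packet.duration)
    return f"Dialogue: {layer},{start},{end},{rest}"


def timed_events(packet: Packet) -> list:
    """Point 1: a single subtitle packet carries two timed events."""
    return [(packet.pts, "show"), (packet.pts + packet.duration, "hide")]


if __name__ == "__main__":
    line = "Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0000,0000,0000,,Hello, world"
    packet = dialogue_to_packet(line, read_order=7)
    print(timed_events(packet))
    # Shifting by 10 seconds only touches pts; the payload stays byte-identical.
    shifted = Packet(packet.pts + 1000, packet.duration, packet.payload)
    print(packet_to_dialogue(shifted))
```

With absolute Start and End values kept inside the payload, the same
10-second shift would require rewriting every packet, and a player
would have to decide what to do when the embedded start disagrees with
the container pts.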

> > Some video codecs have their own timestamps. But those are NOT used for
> > timing when muxed in a normal container.
> 
> Are you the maintainer of MPlayer's av sync code? If so I would have
> suspected that you know how mpeg2 in mpeg-ps works. But this comment
> makes it clear that you do not.
> The durations in the mpeg2 stream are certainly used when muxed in mpeg-ps.

See above, and mpeg-ps is also hardly a generic container, nor a
container whose timing properties would be worth emulating.