[MPlayer-dev-eng] [PATCH] libass: fix parsing of tracks extracted from containers

Sun Sep 21 00:06:00 CEST 2008

On 9/18/08, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Mon, Sep 15, 2008 at 11:32:59PM +0300, Uoti Urpala wrote:
>> On Mon, 2008-09-15 at 16:52 +0300, Ivan Kalvachev wrote:
>> > So the stuff I don't understand.
>> > 
>> > 1. Is "Block's timecode" the only PTS that this packet has?
>> 
>> Yes.
>
> No, it is not and you know that as you admitted already that there are
> other time related fields that are not removed by matroska-ass
> i'll quote the spec for ivan and others:
> --------------------
[...]
> --------------------
>
> I really honestly think, uoti you are going a little too far with this
> misinformation campaign.
> ASS is defined in the ass spec, it has many timestamps, and i myself
> have seen the karaoke highlight effect so it isnt rare either.
>
> Matroska picks 2 of the timestamp fields and removes them from the ass
> lines, the other timestamps are not changed or removed, this still results
> in a codec & container timestamp mix, that surely will not become any
> less problematic if your idiotic suggestion of changing the "packet"
> timestamp without fixing the bitstream is applied. That is no matter
> which of the suggested systems is used, in all cases do the ass packets
> need to be edited if timestamps change.
>
>
>> > 2. Why do we want the demuxer to mess with the string (aka, data, aka
>> > payload) instead of just passing it. In other words, why do we dump
>> > useful data?
>> 
>> I'm not 100% sure what you mean with this question. Do you mean why
>> Aurelien wants to convert the format from the one normally used by
>> Matroska? I think he wants to "standardize" on his own version of the
>> format and hopes to use a single internal format which doesn't need
>> conversions to/from containers that use his version (though currently no
>> container does and there's little evidence they would in the future).
>
> aurel wants to move the 2 timestamps that matroska removes from standard
> .ass back where the .ass spec says they should be.
> The format aurel wants is described in the specification of .ass files,
> it is not his format. Also exactly this format is used in nut.

So a subtitle packet in nut has a pts in the packet info and another "start subtitle"
field in the data payload. This is kind of ambiguous: which one is right if they disagree?

> conversion between .ass and .mkv will need conversion of the format
> wherever it is done, doing it in the mkv demuxer means nut and other
> formats like avi could use the packets directly.

Hmm, see below.

>> > 3. If "Block's timecode" is stored in AVPacket.pts and "Block's
>> > duration" is stored in AVPacket.convergence_duration, doesn't that mean
>> > we have everything we need for muxing?
>> 
>> Aurelien's use of convergence_duration was mistaken and he should have
>> used another field (probably waited for a patch adding display_duration
>> to be applied); but yes, all the information should be available in the
>> packet fields, and having it in the bitstream is completely redundant
>> for an internal format. 
>
> So you propose that a .ass demuxer removes these fields?
> The fact that .ass and .mkv use a different format leaves no choice
> either one converts or the internal format is ambiguous.
> Now if one must convert why should that be the .ass and not mkv demuxer?

I see no problem if the .ass demuxer removes the fields that it has parsed and stored
in the AVPacket structure (pts, duration, dts=readorder). I would say that this should
be mandatory.
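
To make this concrete, here is a minimal sketch of such a demuxer step. It is only
an illustration: SubPacket, parse_ass_time() and split_dialogue() are made-up names
standing in for the real AVPacket handling, and keeping Layer at the front of the
stripped payload is my assumption, not something any spec mandates.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the AVPacket fields discussed here. */
typedef struct SubPacket {
    long long pts_ms;       /* from the Start field */
    long long duration_ms;  /* End - Start */
    long long read_order;   /* position of the line in the file */
    char payload[512];      /* remaining fields, timing removed */
} SubPacket;

/* Parse an ASS time "H:MM:SS.cc" into milliseconds; -1 if malformed. */
static long long parse_ass_time(const char *s)
{
    int h, m, sec, cs;
    if (sscanf(s, "%d:%d:%d.%d", &h, &m, &sec, &cs) != 4)
        return -1;
    return ((h * 60LL + m) * 60 + sec) * 1000 + cs * 10LL;
}

/* Split "Dialogue: Layer,Start,End,rest..." into packet fields plus a
 * payload of "Layer,rest...". Returns 0 on success, -1 on error. */
static int split_dialogue(const char *line, long long read_order,
                          SubPacket *out)
{
    static const char prefix[] = "Dialogue: ";
    char layer[16], start[32], end[32];
    long long t0, t1;
    int consumed = 0, n;

    assert(line && out);
    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    line += sizeof(prefix) - 1;

    /* %n records where the fourth field begins; it stays 0 if the
     * third comma is missing. */
    if (sscanf(line, "%15[^,],%31[^,],%31[^,],%n",
               layer, start, end, &consumed) != 3 || consumed == 0)
        return -1;

    t0 = parse_ass_time(start);
    t1 = parse_ass_time(end);
    if (t0 < 0 || t1 < t0)
        return -1;

    n = snprintf(out->payload, sizeof(out->payload), "%s,%s",
                 layer, line + consumed);
    if (n < 0 || (size_t)n >= sizeof(out->payload))
        return -1;

    out->pts_ms      = t0;
    out->duration_ms = t1 - t0;
    out->read_order  = read_order;
    return 0;
}
```

After this step the start and end times exist only in the packet fields, so there
is nothing in the payload that could disagree with them.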

It would make processing of subtitles a lot more generic and flexible.

For example, there would be no need to manually mess with the strings if we need to
move or reorder subtitles (e.g. when copying the second half of a file). The code that
processes the subtitles could work with mkv.ass and mkv.srt without needing to
distinguish them or know their internal representation. It could also work with more
obscure subtitle types. The (de)muxers would take care of the way they are stored
and/or their external representation.
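
As a rough illustration of the "second half of a file" case, the sketch below cuts
a list of subtitle packets at a given time while treating the payload as opaque.
SubPacket and cut_from() are hypothetical names, not existing libavformat API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet: timing lives in fields, payload is never parsed. */
typedef struct SubPacket {
    long long pts_ms;
    long long duration_ms;
    const char *payload;  /* could be ASS, SRT or anything else */
} SubPacket;

/* Keep the events still visible after cut_ms and rebase their times so
 * that cut_ms becomes 0. Compacts the array in place and returns the
 * new number of packets. */
static size_t cut_from(SubPacket *pkts, size_t n, long long cut_ms)
{
    size_t kept = 0;

    assert(pkts || n == 0);
    for (size_t i = 0; i < n; i++) {
        SubPacket p = pkts[i];
        long long end = p.pts_ms + p.duration_ms;

        if (end <= cut_ms)
            continue;               /* already over before the cut */
        if (p.pts_ms < cut_ms) {
            /* Straddles the cut: clip it. This changes the duration,
             * which is exactly the case where timing embedded in the
             * payload (karaoke etc.) would need extra care. */
            p.duration_ms = end - cut_ms;
            p.pts_ms = 0;
        } else {
            p.pts_ms -= cut_ms;
        }
        pkts[kept++] = p;
    }
    return kept;
}
```

The same function works whether the payload came from mkv.ass or mkv.srt, because
it never looks inside the string.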

Take this as an example. A muxer gets an AVPacket containing a subtitle in the full
{order, start_time, duration, string} text format. What is the muxer supposed to do if
the corresponding AVPacket fields are entirely different? Should it parse the text
format, (optionally) check for differences and then recreate the subtitle packet if
necessary?
It would be simpler to avoid parsing text that has already been parsed once ;)
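
If the timing is carried only in the packet fields, a muxer for plain .ass output
just has to put it back, with nothing to parse or compare. A minimal sketch,
assuming a hypothetical SubPacket whose payload is "Layer,Style,...,Text" (the
names and the payload layout are illustrative, not the real API):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical packet with the timing already stripped from the text. */
typedef struct SubPacket {
    long long pts_ms;
    long long duration_ms;
    const char *payload;  /* "Layer,Style,Name,...,Text" */
} SubPacket;

/* Format milliseconds as an ASS time "H:MM:SS.cc". */
static void format_ass_time(long long ms, char *buf, size_t size)
{
    long long cs = ms / 10;  /* ASS times have centisecond precision */
    snprintf(buf, size, "%lld:%02lld:%02lld.%02lld",
             cs / 360000, cs / 6000 % 60, cs / 100 % 60, cs % 100);
}

/* Rebuild "Dialogue: Layer,Start,End,rest..." from the packet fields.
 * Returns 0 on success, -1 on a malformed payload or a short buffer. */
static int write_dialogue(const SubPacket *p, char *out, size_t out_size)
{
    char start[32], end[32];
    const char *comma;
    int n;

    assert(p && p->payload && out);
    comma = strchr(p->payload, ',');  /* end of the Layer field */
    if (!comma)
        return -1;

    format_ass_time(p->pts_ms, start, sizeof(start));
    format_ass_time(p->pts_ms + p->duration_ms, end, sizeof(end));

    n = snprintf(out, out_size, "Dialogue: %.*s,%s,%s,%s",
                 (int)(comma - p->payload), p->payload,
                 start, end, comma + 1);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Whatever changed pts or duration earlier in the pipeline is reflected in the
output automatically, since the packet fields are the only source of timing.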

Indeed, the next big problem arises when the duration is changed and the subtitle
has karaoke/script/animation in it that contains its own timing information.

This could be solved in two ways. The first is special handling by the main program
that changes the duration. The second is to store the original duration with the
subtitle, and make the demuxer or libass correct it to the packet duration.

Neither of these is simple, but in the second case we would need to create a
non-standard mkv.ass format. I don't think this is acceptable.
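
For reference, the kind of correction the second approach implies could look like
the sketch below. It assumes that "correcting" means scaling the embedded karaoke
durations (ASS \k tags, in centiseconds) by the ratio of new to original duration;
that interpretation and the function name rescale_karaoke() are mine. It handles
only plain \k followed by digits and copies everything else (including \kf, \ko
and \K) unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Copy `in` to `out`, scaling each \k<centiseconds> value by
 * new_ms / orig_ms (rounded to nearest). Returns 0 on success,
 * -1 if `out` is too small. */
static int rescale_karaoke(const char *in, long long orig_ms,
                           long long new_ms, char *out, size_t out_size)
{
    size_t o = 0;

    assert(in && out && orig_ms > 0 && new_ms >= 0);
    while (*in) {
        if (in[0] == '\\' && in[1] == 'k' &&
            isdigit((unsigned char)in[2])) {
            char *end;
            long cs = strtol(in + 2, &end, 10);
            long long scaled = (cs * new_ms + orig_ms / 2) / orig_ms;
            int n = snprintf(out + o, out_size - o, "\\k%lld", scaled);

            if (n < 0 || (size_t)n >= out_size - o)
                return -1;
            o += (size_t)n;
            in = end;               /* skip past the digits */
        } else {
            if (o + 1 >= out_size)
                return -1;
            out[o++] = *in++;       /* everything else is copied as-is */
        }
    }
    if (o >= out_size)
        return -1;
    out[o] = '\0';
    return 0;
}
```

Note that this only works if the original duration is still known at the point
where the correction happens, which is exactly the extra field the standard
mkv.ass format does not have.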

Summary:
The demuxer must remove the fields that it has parsed and stored in the AVPacket
structure. Anything else would lead to more code duplication and special handling.

P.S.
Please keep personal insults out of the discussion.
I don't mind moving the thread to ffmpeg-dev under a new name,
if that is the way to get a purely technical discussion.