[FFmpeg-devel] [RFC] AST subtitles
Nicolas George
nicolas.george at normalesup.org
Wed Nov 28 14:40:30 CET 2012
On Quintidi 5 Frimaire, year CCXXI, Clément Bœsch wrote:
> Yes that's exactly what I was trying to do, but I hit two problems
> already, which I don't know how to solve:
>
> - we need to have an ASS encoder context to do that (and you can't just
> call the encode function directly because you need the global styles
> information), and this gets kind of ugly quickly.
>
> - the subtitles encoding API is limited: the encoder takes a buffer +
> buffer size instead of an AVPacket, which makes it kind of awkward to
> use within the decoder. The API needs a lift as you said at the end of
> your mail.
It can be done the other way around: move the necessary code from the ASS
encoder to a private API ff_styled_text_to_ass_line(), and then call this
private API both from avcodec_decode_subtitles2() and from the ASS encoder.
> > >     union {
> > >         char *s;      ///< must be an av_malloc'ed string if string type
> > >         double d;
> > >         int i;
> > >         int64_t i64;
> > >         uint32_t u32;
> > >         void *p;      /**< pointer to allocated data of an arbitrary
> > >                            size (chunk type dependent) */
> > >     };
> > >     int p_nb;         /**< number of entries in p, can be used for
> > >                            variable sized data */
> > > } AVSubtitleASTChunk;
> >
> > The "p_nb" field name is inconsistent.
> >
>
> nb_p?
nb_something. I just noticed that the union field in AVSubtitleASTChunk has
no name: I did not think that would work, but it does with gcc. Is it standard?
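For reference: unnamed union members were a GNU extension under C89/C99 and
became standard as "anonymous unions" in C11. A minimal sketch of the
construct, using a hypothetical trimmed-down Chunk type (not the actual
AVSubtitleASTChunk definition) and a hypothetical accessor:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down chunk type; only the unnamed union matters. */
typedef struct Chunk {
    int type;
    union {             /* no member name: a C11 anonymous union */
        char    *s;
        double   d;
        int      i;
        int64_t  i64;
        uint32_t u32;
        void    *p;
    };
    int nb_p;
} Chunk;

/* Members of the anonymous union are accessed as if they were direct
 * members of the enclosing struct. */
static int64_t chunk_get_i64(const Chunk *c)
{
    assert(c);
    return c->i64;
}
```

With gcc or clang, building with -std=c99 -pedantic should warn about the
unnamed union, while -std=c11 accepts it silently.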
> I'm already doing a doubling reallocation. Sorry, I should have pasted the
> code:
No problem. That looks fine. I always forget the trick of testing if the
size is a power of 2.
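For readers who have not seen the trick, a minimal sketch of how I understand
it; this is not the code from the patch, and IntList and int_list_add() are
hypothetical names. The capacity is never stored: the buffer is assumed to be
full exactly when the element count is 0 or a power of 2, and it is then
doubled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array; the capacity is implied by nb. */
typedef struct IntList {
    int   *items;
    size_t nb;
} IntList;

/* Append v to the list. Returns 0 on success, -1 on allocation failure. */
static int int_list_add(IntList *l, int v)
{
    assert(l);
    /* nb & (nb - 1) is 0 exactly when nb is 0 or a power of 2,
     * which is when the current buffer is full. */
    if ((l->nb & (l->nb - 1)) == 0) {
        size_t new_cap = l->nb ? l->nb * 2 : 1;
        int *tmp = realloc(l->items, new_cap * sizeof(*l->items));
        if (!tmp)
            return -1;
        l->items = tmp;
    }
    l->items[l->nb++] = v;
    return 0;
}
```

This only works if elements are added one at a time and never removed;
otherwise the implied capacity no longer matches the real allocation.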
> Yes, I don't know yet, we'll indeed likely need to allocate the
> AVSubtitleASTSettings into the AVCodecContext in the decoder init
> callback.
Yes, that is exactly my understanding of the problem.
> I'd say the decoder will have to make its own list of profiles depending
> on the set of styles it expects. I don't really want to make the users and
> encoders deal with complex trees of styles and inheritance processes.
> That information won't be restored properly in most of the output
> subtitles, so since we will likely have to "flatten" this stuff before
> encoding, I'd say it's up to the decoder to make the stupid markup it is
> parsing accessible & simple for any encoder.
Maybe. I am not completely sure that subtitles "rectangles" should not be
able to point to several global styles. That would probably solve most of
the non-trivial cases.
> The more I do this, the more I realize there are a lot of things to solve
> before that…
It is a good start.
> The decoding function is already taking an AVPacket and outputting an
> AVSubtitle. The encoding function on the other hand is pretty much a relic
> of the past, with one buffer + a buffer size as stated at the beginning of
> this mail (look at how ugly it's done in ffmpeg…). This is IMO what we
> should acknowledge first, but it's not simple.
>
> There are actually all sorts of factors to deal with, so let me summarize:
>
> - as you just said, making AVSubtitle heap allocated is one step ahead,
> but it's actually not blocking for what I'm doing right now: the
> SUBTITLE_AST means to add a field in the rectangle structure (which is
> allocated internally), not AVSubtitle, so it shouldn't really matter.
> Though, it might be required later, so feel free to do it.
True.
> - currently, the text subtitle decoders fill the rects[x]->ass
> fields not only with the ASS "payload" but as if they were /lines/ of an
> ASS file. The transformation can be summarized as follows:
>
> o "Dialogue: " is added at the beginning
> o start time and end time are added in the payload
> o field order is dropped
> o \r\n is added at the end
>
> This is problematic because this packet data cannot be sent to
> libass in a sane manner: instead of using libass/ass_process_chunk()
> like you would do with a simple packet, you need to call
> libass/ass_process_data() (note that this was added to libass for that
> specific reason), to re-parse the whole line. This is by the way why
> MPlayer is doing a memcmp("Dialogue:"...) on the data packet…
>
> Now that AVPackets contain the pts and duration, I think it's wise to
> make the ASS and Matroska demuxers output these packets in a proper
> format. We will need to change a few things such as making the ASS
> muxer add the timing, and remove the hack from the Matroska muxer.
>
> I don't mind doing this, but since it will change the layout of the
> packets, it will break applications that expect ASS lines rather than raw
> ASS packets. Any opinion?
Already answered in another mail.
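To make the transformation listed above concrete, here is a rough sketch. It
assumes the raw payload uses the Matroska-style layout
"ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text", that the
dropped field is ReadOrder, and that times are given in centiseconds. The
helper names ass_time() and payload_to_ass_line() are hypothetical, not
existing lavc functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a time in centiseconds as an ASS "H:MM:SS.cc" timestamp. */
static void ass_time(char *buf, size_t size, int64_t cs)
{
    snprintf(buf, size, "%d:%02d:%02d.%02d",
             (int)(cs / 360000), (int)(cs / 6000 % 60),
             (int)(cs / 100 % 60), (int)(cs % 100));
}

/* Hypothetical helper: turn a raw payload "ReadOrder,Layer,rest" into
 * "Dialogue: Layer,Start,End,rest\r\n".
 * Returns 0 on success, -1 on malformed input or truncated output. */
static int payload_to_ass_line(char *out, size_t size, const char *payload,
                               int64_t start_cs, int64_t duration_cs)
{
    const char *layer, *rest;
    char start[32], end[32];
    int n;

    assert(out && payload);
    layer = strchr(payload, ',');   /* end of ReadOrder, which is dropped */
    if (!layer)
        return -1;
    layer++;
    rest = strchr(layer, ',');      /* end of Layer */
    if (!rest)
        return -1;
    rest++;

    ass_time(start, sizeof(start), start_cs);
    ass_time(end,   sizeof(end),   start_cs + duration_cs);

    /* Prefix, Layer, both timestamps, remaining fields, then CRLF. */
    n = snprintf(out, size, "Dialogue: %.*s,%s,%s,%s\r\n",
                 (int)(rest - 1 - layer), layer, start, end, rest);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A consumer that receives the raw payload plus pts and duration can skip this
step entirely, which is the point of the proposed change.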
> I think we will need to consider the encoding/charset as well at some
> point too, but it should be doable in a nice way with the
> AVStyledSubtitles, hopefully.
Yes, clearly; and the LF/CRLF thing too. I suppose you specified that the
text in AVStyledSubtitles is in UTF-8 and does not contain newlines?
Regards,
--
Nicolas George