[FFmpeg-devel] [RFC] Subtitles: seek, rectime, residual, new API

Tue Jul 31 21:10:40 CEST 2012

Le quartidi 14 thermidor, an CCXX, Clément Bœsch a écrit :
> Hi,
> 
> Here I am again messing with subtitles, this time with several specific
> questions.
> 
> I've noticed that extracting a sample of a video (using -ss and -t) with
> soft subtitles (even with standalone subtitles) always leads to broken
> outputs, for various reasons. First, it seems the check recording time is
> basically broken in case of subtitles, but this first quick fix seems to
> do the trick:
> 
> diff --git a/ffmpeg.c b/ffmpeg.c
> index c2ea5bd..c6d1829 100644
> --- a/ffmpeg.c
> +++ b/ffmpeg.c
> @@ -1684,13 +1684,14 @@ static void do_subtitle_out(AVFormatContext *s,
>          nb = 1;
>  
>      for (i = 0; i < nb; i++) {
> -        ost->sync_opts = av_rescale_q(pts, ist->st->time_base, enc->time_base);
> +        ost->sync_opts = pts;
>          if (!check_recording_time(ost))
>              return;

It does not look right: check_recording_time() assumes ost->sync_opts is
expressed in ost->st->codec->time_base. Something else must be wrong
somewhere.

> Then, one particularly annoying issue is that the timestamps are not
> updated: the events are well extracted but the original timestamps are
> kept verbatim (so it's completely out of sync with the output), for two
> reasons:
> 
>  1) the starting time is not used to alter the pts (it is done for audio &
>     video when dealing with filters, but that's not the case for
>     subtitles), but this can be easily fixed with this:
> 
>          sub->pts = av_rescale_q(pts, ist->st->time_base, AV_TIME_BASE_Q);
>          // start_display_time is required to be 0
>          sub->pts               += av_rescale_q(sub->start_display_time, (AVRational){ 1, 1000 }, AV_T
> +        sub->pts -= output_files[ost->file_index]->start_time;
>          sub->end_display_time  -= sub->start_display_time;
>          sub->start_display_time = 0;
>          subtitle_out_size = avcodec_encode_subtitle(enc, subtitle_out,

Looks right. Maybe add a comment: for audio and video, it is done just when
the frame is extracted from buffersink; it would not do to subtract it twice
when lavfi becomes able to handle subtitles.

> 
>      It might need some rescale, not sure.

Some day, some one will add doxy on all timestamps fields in all structures
in ffmpeg and that will be a great help for everyone.

> 
>   2) some subtitles encoder are *not* using that PTS: for example the
>      SubRip encoder is parsing the ASS encoded rect timestamps. This can
>      be changed easily doing something like this:
> 
> diff --git a/libavcodec/srtenc.c b/libavcodec/srtenc.c
> index 56d3397..40c254d 100644
> --- a/libavcodec/srtenc.c
> +++ b/libavcodec/srtenc.c
> @@ -252,8 +252,8 @@ static int srt_encode_frame(AVCodecContext *avctx,
>  
>          dialog = ff_ass_split_dialog(s->ass_ctx, sub->rects[i]->ass, 0, &num);
>          for (; dialog && num--; dialog++) {
> -            int sh, sm, ss, sc = 10 * dialog->start;
> -            int eh, em, es, ec = 10 * dialog->end;
> +            int sh, sm, ss, sc = sub->pts / 1000;
> +            int eh, em, es, ec = sc + (dialog->end - dialog->start) * 10;
>              sh = sc/3600000;  sc -= 3600000*sh;
>              sm = sc/  60000;  sc -=   60000*sm;
>              ss = sc/   1000;  sc -=    1000*ss;
> 
>     It might need some rescale as well, but it seems to do the trick.
> 
> Does anyone has any opinion on making the encoders use the AVSubtitles PTS
> instead just like these patches?

Yes, yes, yes!

Note that there is a problem with Matroska, which merges the ASS packets
that start at the same time, and thus make it impossible to have a proper
duration on the packet.

> Now something else, still related to the seek. Would it make sense to make
> check_output_constraints() actually check for pts+duration instead (at
> least for subtitles)? This way, if we have a subtitle at pts=10 with
> duration=18 and we are seeking at pts=15, it will be remuxed/re-encoded
> with pts=15 and duration=18-15+10=13 and so we wouldn't lose it.

It would be nice, but I do not think it is very important or urgent. And it
will not work for people who use "-ss time -i input -ss 0 output" for fast
and accurate seeking if there is a seek point between the subtitle packet
and the target time.

> Another thing: if we are willing to make a new subtitles (using AVPacket
> instead of buffers for encode), would it make sense to move subtitles to
> AVFrame, or using AVSubtitles is fine?

I am not sure. I gave it some thought a few days ago: AVFrame is very not
suited for subtitles, but the additional information that AVFrame carries
around is very much useful. Also, AVSubtitle is currently user-allocated, it
is a nightmare for compatibility.

> Last question: is there any reason the mapping extension->format->codec
> works with audio and video but not subtitles? (example: ffmpeg -i in.ass
> out.srt doesn't work and require -c:s srt).

At first glance, because ff_srt_muxer.subtitle_codec = CODEC_ID_TEXT and
there is no encoder for CODEC_ID_TEXT.

Regards,

-- 
  Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20120731/0c50596c/attachment.asc>