[FFmpeg-devel] libavfilter and subtitles (was: [RFC] Talk about subtitles)

Nicolas George nicolas.george at normalesup.org
Thu Dec 22 14:32:09 CET 2011

On septidi 27 Frimaire, year CCXX, Clément Bœsch wrote:
> First, writing the smovie source filter means adding the subtitles in the
> filterchain.

There are big issues that will arise with subtitles filtering if the
architecture is kept the way it is.

The problem I see is that subtitles, unlike audio and video, are sparse.
Thus, with formats that mux the subtitles together with the video, accessing
the next subtitle frame, as the result of request_frame, could trigger
several megabytes' worth of reads on the source file. For transcoding
applications, this is a small drawback; for live playback this is a big no.
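
To make the cost concrete, here is a rough sketch in Python of a pull-style
read over an interleaved packet list. The Packet type, the read_until helper
and the packet sizes are all made up for illustration; none of this is
libavformat or libavfilter API.

```python
from collections import namedtuple

# Hypothetical packet model: stream kind and size in bytes.
Packet = namedtuple("Packet", ["kind", "size"])

def read_until(packets, start, kind):
    """Pull packets from index `start` until one of `kind` is found.

    Returns (index of the match or None, number of bytes read on the way).
    """
    read = 0
    for i in range(start, len(packets)):
        read += packets[i].size
        if packets[i].kind == kind:
            return i, read
    return None, read

# Assumed layout: 500 video packets of 20 kB between two small subtitles.
stream = ([Packet("sub", 100)]
          + [Packet("video", 20_000)] * 500
          + [Packet("sub", 100)])

_, video_cost = read_until(stream, 1, "video")  # one packet is enough
_, sub_cost = read_until(stream, 1, "sub")      # reads every video packet
```

With these assumed sizes, asking for the next video frame reads 20 kB, while
asking for the next subtitle reads about 10 MB.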

Solving this issue in a satisfactory manner seems like a lot of work, but I
believe in the long term it would be worth it.

Filters with both continuous and sparse outputs should ignore request_frame
and send on the sparse output when data is available. Filters with sparse
inputs should not treat a request_frame that triggers nothing as the end of
the stream.
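
A minimal sketch of that distinction, in Python for brevity. The SparseInput
class, its push/close/poll methods and the Status values are hypothetical
names for this example, not existing libavfilter structures. The only point
is that "nothing right now" and "end of stream" are reported separately.

```python
from enum import Enum

class Status(Enum):
    FRAME = 1    # a frame was delivered
    NOTHING = 2  # nothing available right now, stream still open
    EOF = 3      # the stream is finished

class SparseInput:
    """Hypothetical sparse input pad: the source pushes frames when it has them."""

    def __init__(self):
        self.queue = []
        self.closed = False

    def push(self, frame):
        # Called by the upstream filter whenever sparse data shows up.
        self.queue.append(frame)

    def close(self):
        # Called by the upstream filter at the real end of the stream.
        self.closed = True

    def poll(self):
        # Queued frames are drained first, even after close().
        if self.queue:
            return Status.FRAME, self.queue.pop(0)
        if self.closed:
            return Status.EOF, None
        return Status.NOTHING, None

pad = SparseInput()
first = pad.poll()       # no subtitle yet, but not the end of the stream
pad.push("subtitle 1")
second = pad.poll()      # the pushed subtitle
pad.close()
third = pad.poll()       # only now is it the end of the stream
```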

But for that to work, it is necessary that video and subtitles are demuxed
together. If the filterchain is something like this:

	movie=input.mkv [v]; smovie=input.mkv [s]; [v] [s] hardsub

then smovie has only the sparse output, and it cannot keep that output in
sync with the video stream.

We could have the movie source detect that it is opening the same file twice
and unify them, but I find it an ugly hack and believe it should be avoided.

IMHO, the real, clean solution would be to devise a way for the movie
source to be able to output both the video and the subtitles if necessary.
And, while we are at it, the audio too.

Here is what I suggest:

- In the structure describing a filter, an input or output pad can be
  flagged as an "array".

- If an input is an array, it must be connected with a comma-separated list
  of references inside a single pair of brackets, such as [a,b].

- If an output is an array, when referencing it to connect it to an input,
  an index must be supplied: [ref.5]; the indices can be sparse (ref.2 can
  be used even if ref.1 is not), but not too much so (a small array is
  suitable to store the references).

- Input and output pads arrays are created with exactly the pads present in
  the graph description. Thus '[a,b] [c] some_filter' will have a as
  input[0], b as input[1] and c as input[2]. A new field in the filter
  structure allows the init function to find the exact mapping.
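
The syntax rules above can be sketched as a small parser, here in Python.
This is only an illustration of the proposal, not the existing graph parser;
the function names parse_input_labels and parse_output_ref are invented for
the example. The per-bracket groups stand in for the "new field" that lets
init recover the mapping.

```python
import re

def parse_input_labels(desc):
    """Parse something like '[a,b] [c] some_filter'.

    Returns (flat list of input labels, labels grouped per bracket,
    filter name). The position in the flat list is the input pad index.
    """
    groups = [[name.strip() for name in g.split(",")]
              for g in re.findall(r"\[([^\]]*)\]", desc)]
    flat = [name for g in groups for name in g]
    filter_name = re.sub(r"\[[^\]]*\]", "", desc).strip()
    return flat, groups, filter_name

def parse_output_ref(label):
    """Split 'ref.5' into ('ref', 5); a plain 'ref' gives ('ref', None)."""
    name, dot, index = label.partition(".")
    return name, (int(index) if dot else None)

flat, groups, name = parse_input_labels("[a,b] [c] some_filter")
# flat   -> ['a', 'b', 'c'], so a is input[0], b is input[1], c is input[2]
# groups -> [['a', 'b'], ['c']], the mapping the init function would see
```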

This way, we solve several problems at once:

- The movie source has three outputs: video, audio, subtitles, all of them
  arrays; the index in the array is the index of the stream. If the user
  wants one of the streams, they connect something to it.

- The amerge filter I am working on can have one array input instead of two
  normal ones.
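
For the movie source, the selection could look roughly like the Python
sketch below. It assumes that the array index is the stream index in the
file, and that referencing a stream of the wrong type is an error; the second
point is my assumption, and select_streams is a made-up name.

```python
def select_streams(stream_kinds, refs):
    """Return the sorted indices of the streams the graph actually uses.

    stream_kinds: list such as ["video", "audio", "subtitle"], where the
                  position is the stream index in the file.
    refs:         (output name, index) pairs found in the graph description.
    """
    enabled = set()
    for kind, index in refs:
        # Assumption: out-of-range or mismatched references are rejected.
        if not 0 <= index < len(stream_kinds) or stream_kinds[index] != kind:
            raise ValueError(f"stream {index} is not a {kind} stream")
        enabled.add(index)
    return sorted(enabled)

kinds = ["video", "audio", "audio", "subtitle"]
# The graph references the video and the subtitles; both audio streams are
# left unconnected, so the source does not need to output them.
enabled = select_streams(kinds, [("video", 0), ("subtitle", 3)])
```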

What do you think about it?


  Nicolas George