[FFmpeg-devel] dvd/vob "decoder" part 1: the system itself

Michael Niedermayer michaelni
Sun Feb 14 19:08:39 CET 2010

Hi, i hope you are still around and havnt given up yet on this ...

On Sun, Jul 12, 2009 at 02:00:33AM +0200, Erik Van Grunderbeeck wrote:
> Hello;
> Included are the files, and patches needed for the dvd decoder.

> I threw away my original system on trying to insert packets into the main
> stream (as the "attachment/data" type stream). The reason for that is that
> the dvd data stream, that contains commands for the client can change/insert
> packets "in the past". Example; a user seeks back. The DVD ifo system will
> insert packets containing "change the subtitle color" so it works ok for the
> displayed cell/chapter. LibAV doesn't expect new packets inserted
> "backwards" in the stream, and the buffer system gets all messed up because
> packet sizes are off.

please elaborate on what the problem with this is. It should have worked
from how i understand it, at least if all packets have correct timestamps

> Changing to a new system means my "hack" on decoding
> the packet header isn't needed any more, and except for stream flagging the
> mpegps  demuxer should be unchanged. My attachment decoder is not needed
> anymore (for those who looked at it, only dvd_protocol.c/h are still
> needed).

> The new system uses a callback, that the user registers. This is, after much
> experimenting, the best way I could come up with. The callback though is
> registered per protocol, and thus changes are needed in "gasp" URLContext.
> Reason for keeping it per context? Well, a user can be reading from several
> dvd devices at the same time if he/she uses the avlibs. Or some end devices
> (like the ps3) send several request to open a file at the same time. If one
> callback is registered for the whole library, this messes up the system.
> Whats send in the callback? Basically AVDVDParseContext packets. They
> contain a type (the enum effdvd_Type ) that tells the end-user app what
> should be done, and a pts timestamp in dvd time (90000/tick). The packets
> should be handled at that time.
> There are some packets which are obvious (like: new subtitle palette) and
> some that are not. The most important one is effdvd_Reset_IFO.
> effdvd_Reset_IFO asks the end-user app: clear your streams, flag all streams
> you are using as invalid (more later on that), and be prepared to re-init
> all your codec decoders. 
> Why does one need to do that? Because a vob file 1 might contain audio data
> in ac3 with 2 channels, while a vob 2 might have it in 6 channels. Or the
> resolution of the video stream might change (from dvd pal to dvd ntsc
> widescreen for example).
> First; DVD's may contain up to 64 streams. One of the patches changes the
> MAX_STREAMS constant to match that.
> Second; Stream re-init. I implemented this by adding a flag to AVStream (at
> the end, discard_flags). Set this to one in end-user app, and this will tell
> the demuxer that this stream has been discarded by user (user closed all
> codecs, etc). 
> It works by telling the demuxer that once it dedects a certain packet type,
> and normally does a av_new_stream; don't re-add the stream to the end of the
> new list (pushing more and more streams on the list). Example; suppose the
> discard_flag is set to 1 for the video stream (and we only have one). When a
> new packet is received by the demuxer, its loop will not find a valid video
> stream anymore. It thus will create a new one, but will fill the slot for
> the "old" video stream with the new video stream. Don't know if this text
> makes sense, bit a quick look at the code should be helpful.
> Why not do this in the protocol reader? Basically 1) because the protocol
> doesn't and shouldn't know about the streams and 2) because the end-user
> needs to close and free his/her decoders. Once that's done, the demuxer
> doesn't even know.
> Next; chapter seeking. I added a new flag for that: 
> #define AVSEEK_FLAG_DIRECT     8 ///< seek directly, skipping any stamps
> and translation
> #define AVSEEK_FLAG_CHAPTER     16 ///< seek to chapter, if supported
> Using that will basically use av_seek_frame() to override a search like it
> does for AVSEEK_FLAG_BYTE. The arguments timestamp will thus position
> chapters, instead of timestamps.
> Summary; whats the total change;
> 1) The whole setup adds a protocol handler called ffdvd. User inits his open
> file with av_open_input_file ("dvdread:e:/") or
> av_open_input_file("dvdread:/cdrom1/")
> 2) User registers a callback by using url_fset_protocol_callback(), a new
> function in avio.c. I tried to make it generic.
> 3) User polls streams as before for audio/video
> 4) User can get callback events from the protocol, with timestamps.
> 5) Events get examined for the timestamps on the protocol packet, and
> handled upon.
> 6) When needed, these events are executed, and for some functions a notify
> is send back (example, when user selects a button
> dvd_protocol_select_button() is called).
> 7) That's it.
> How well does this all work? I build a example app separate from ffmpeg, on
> which rather big changes are going to be needed. Note that these changes
> have almost nothing to do with the dvd protocol handler. They are mostly
> implementing the "we should look at this/this is wrong/etc" comments
> (example; discarding new streams). I have ffplay ok, but will need to work
> with someone on ffmpeg. Source is. Offcourse available.
> I am attaching the main files here, and the most important patches. Since I
> am not sure on how the continue on this, please tell me how this needs to be
> handled (ege do patches first, what order,. Etc. ) If the system is
> accepted, at least one patch needs a major version bump. I should probably
> split up some patches too.
> Note 2; I need to talk to the people for libdvdread, since I fixed a few
> problems in their code (and added one function to read attributes that its
> not available now dvdnav_get_video_attributes()). To have this working, we
> would offcourse also need to depend on libdvdread (if the user enabled it by
> ./configure, changes for makesfiles and that needed).
> Note 3: we will probably need some commandline extensions to specify seeking
> to a chapter, etc. for ffmpeg

> I have worked on and of on this for the better part of 2 months figuring out
> the best and least intrusive way. If this all gets rejected, I am fine with
> that. If it doesn't, also. Just let me know. 

I definitly want this in ffmpeg, so no reject

nitpick: we use spaces not tabs also try tools/patcheck
these of course are not truly important and it would probably be
unwise to spend much time on cleaning this up for some parts of the

Some comments
1. First the av_seek_direct() code looks good, if you could send this as
   a seperate patch (also please name it av_seek_protocol() and
   AVSEEK_FLAG_PROTOCOL) i suspcct we should be able to commit this quickly

2. AVSEEK_FLAG_CHAPTER should be ok as well if you could send this as a
   seperate patch. But note the parametr should be an absolute chapter
   number not a relative to current position or you will have to deal
   with packet fifos between demux and decode causing a delay and +-1
   errors close to chapter transitions

3. iam against stream reusal, stream numbers must be unique, we cant just
   by seeking around have chapter 5 use stream 1 as video and when we seek
   back video is stream 0 with audio stream 1. (this can happen if streams
   are marked "discard" and reused.

4. about the callback, i think this will need to be changed somewhat
   as is i see 2 issues
   A. thread sync
   B. every application would need a buffer to make sure the data from the
      callback is delayed and processed at the correct time (that is the
      time matchng what went through demux & decode with all its fifos)
    I thus think some way to pull data out like av_read_frame() seems a
    bit easier to handle.

5. If we had a data stream instead of a callback, stream copy to arbitrary
   containers might work somewhat. With a callback all callback provided
   information would be lost, i dont know if this would lead to any problem
   or not with actual DVDs being stream copied to mkv/nut/avi whatever

more comments below

> typedef struct AVAttachmentButton {
>     uint16_t x;
>     uint16_t y;
>     uint16_t w;
>     uint16_t h;
> } AVAttachmentButton;

64k should be enough for everyone ;)
i do prefer int for this and most other variables

> typedef struct AVDVDParseContext {
>     //    pts of this context
>     int64_t                pts;
>     //    type
>     uint16_t            Type;
>     //    when waiting
>     uint16_t            Wait;
>     //    when buttons
>     uint16_t            HighLightIndex;
>     uint16_t            ButtonCount;
>     AVAttachmentButton    Buttons[MAX_ATTACH_BUTTONS];
>     //    when color-palette
>     uint32_t            rgba_palette[16];

seems like job for a union not a struct

>     //    expected size of the video pictures (allows for skip of scan-stream)
>     uint16_t            video_width;
>     uint16_t            video_height;
>     //    aspect ratio (0 = 4:3 , 3 = 16:9)
>     uint8_t                aspect_ratio : 4;
>     //    video format (0 = ntsc, 1 = pal)
>     uint8_t                video_format : 4;
>     //    current vts
>     uint8_t                current_vts;
>     //    count of languages
>     uint8_t                AudioLanguage_Count;
>     uint8_t                SubTitleLanguage_Count;
>     //    languages
>     uint16_t            Audio_Language[MAX_ATTACH_AUDIO_LANGUAGE];
>     uint16_t            Audio_Flags[MAX_ATTACH_AUDIO_LANGUAGE];
>     uint8_t                Audio_Channels[MAX_ATTACH_AUDIO_LANGUAGE];
>     uint8_t                Audio_Mode[MAX_ATTACH_AUDIO_LANGUAGE];    
>     uint16_t            SubTitle_Language[MAX_ATTACH_SUB_LANGUAGE];
>     uint16_t            SubTitle_Flags[MAX_ATTACH_SUB_LANGUAGE];

alot of this looks misplaced, width, height, aspect and language have their
fields in AVStream & AVCodecContext already. So please explain why
they are here duplicated.

>     //    when angle change
>     uint8_t                current_angle;
>     uint8_t                max_angle;

>     //    size of current title in pts ticks. divide by 90000 to get time in seconds
>     uint64_t            titletime;

needs a AVRational time_base instead of 90khz and i guess duration is a
better term

>     //    flags that describe actions allowed for current chapter
>     uint32_t            flags;

> } AVDVDParseContext;

Also this name hurt my eyes, this could be usefull for MMS/RTP and who knows
what else. In that sense it also should be kept reasonably generic

> /*  *************** protos *************** */
> /*    register */
> int        dvd_protocol_register();
> /* select a button */
> void    dvd_protocol_select_button(AVFormatContext *ctx, uint32_t nIndex);
> /* signal queue empty */
> void    dvd_protocol_signal_wait(AVFormatContext *ctx, uint32_t iSkip);
> /* signal queue empty */
> void    dvd_protocol_signal_queue(AVFormatContext *ctx);
> /* reset has been processed */
> void    dvd_protocol_signal_reset(AVFormatContext *ctx);

A more generic message passing to AVProtocols seems better than one function specific
to one protocol and function each

> @@ -2493,10 +2544,10 @@
>                  av_log(s, AV_LOG_ERROR, "dimensions not set\n");
>                  return -1;
>              }
> -            if(av_cmp_q(st->sample_aspect_ratio, st->codec->sample_aspect_ratio)){
> -                av_log(s, AV_LOG_ERROR, "Aspect ratio mismatch between encoder and muxer layer\n");
> -                return -1;
> -            }
> +//            if(av_cmp_q(st->sample_aspect_ratio, st->codec->sample_aspect_ratio)){
> +//                av_log(s, AV_LOG_ERROR, "Aspect ratio mismatch between encoder and muxer layer\n");
> +//                return -1;
> +  //          }
>              break;
>          }
> @@ -2932,6 +2983,7 @@
>  }
>  #endif
> +/*    @@@ BUG BUG will fail on encoding at t 23.59.59 started + */
>  int64_t av_gettime(void)
>  {
>      struct timeval tv;

ehm ...

> Index: avformat.h
> ===================================================================
> --- avformat.h    (revision 19309)
> +++ avformat.h    (working copy)
> @@ -185,6 +185,7 @@
>  #define AVFMT_GENERIC_INDEX 0x0100 /**< Use generic index building code. */
>  #define AVFMT_TS_DISCONT    0x0200 /**< Format allows timestamp discontinuities. */
>  #define AVFMT_VARIABLE_FPS  0x0400 /**< Format allows variable fps. */
> +#define AVFMT_NOHEADER        0x0800 /**< do not try to read headers. */

what is this good for?

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us
and endeavor to excel them, by avoiding what is faulty, and imitating what
is excellent in them. -- Plutarch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100214/ba679072/attachment.pgp>

More information about the ffmpeg-devel mailing list