[Ffmpeg-devel] [PATCH] apple caff demuxer

Wed Apr 4 08:28:56 CEST 2007

Baptiste Coudurier wrote:
> Hi
> 
> Justin Ruggles wrote:
> 
>>Hi,
>>
>>Here is a demuxer for Apple CAFF files.  It was written using
>>documentation from:
>>http://developer.apple.com/reference/MusicAudio/idxCoreAudio-date.html
>>
>>I basically wrote it because I needed to learn more about
>>libavformat...and I think CAFF has a fairly good design.
>>
>>The demuxer reuses a function from mov.c, which I moved to isom.c/h.
>>
>>Currently I only have samples for pcm and aac, so those are the only
>>codecs which are enabled.  I need to have samples because the
>>documentation only covers the 'kuki' chunk layout for AAC.  The layout
>>of that chunk is needed for each codec in order to get the proper
>>extradata.  I imagine many of the other codecs have a similar format
>>(mp4 esds atom), but I don't want to assume anything at this point.
>>
>>I can upload samples to mplayerhq if/when I get permission from the
>>person who sent them to me.  It would also be great if anyone else has
>>samples or can generate samples containing other codecs.
>>
>>I'll eventually write a muxer as well, hence the splitting of the
>>demuxer into caff.c and caffdec.c.
>>
>>+/*
>>+known tags which are either not supported in FFmpeg or we do not have
>>+samples in order to know what the 'kuki' chunk format is in relation to
>>+the required extradata format in the decoder.
> 
> 
> Why not add them to caff_tags and comment them out?

fixed.  I also enabled more codecs.  Ones that don't need extradata
should work just fine.  For ones that do, for now I'm just having it
copy all the kuki chunk data straight to extradata.  If we run across
samples which don't work, it can always be changed later on a
codec-by-codec basis.  But I've disabled dv audio until I figure out
exactly how it should be done...and until there is a sample to test with.
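
A minimal sketch of the "copy the whole kuki chunk to extradata" approach described above. Everything here is a stand-in for illustration: SketchCodec, sketch_copy_cookie and SKETCH_PADDING_SIZE are hypothetical names, the padding value is assumed, and the chunk is read from memory rather than through the real I/O layer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the decoder input padding; the value is assumed. */
#define SKETCH_PADDING_SIZE 8

/* Hypothetical holder for the two codec fields the demuxer fills in. */
typedef struct {
    uint8_t *extradata;
    int      extradata_size;
} SketchCodec;

/* Copy an entire cookie chunk into zero-padded extradata.
   Returns 0 on success, -1 on a bad size or allocation failure. */
static int sketch_copy_cookie(SketchCodec *codec, const uint8_t *chunk,
                              int64_t size)
{
    uint8_t *buf;

    /* reject empty chunks and sizes that would not fit in an int */
    if (size <= 0 || size > INT32_MAX - SKETCH_PADDING_SIZE)
        return -1;

    /* calloc zeroes the buffer, so the padding bytes are already 0 */
    buf = calloc(1, (size_t)size + SKETCH_PADDING_SIZE);
    if (!buf)
        return -1;

    memcpy(buf, chunk, (size_t)size);
    free(codec->extradata);          /* drop any earlier cookie */
    codec->extradata      = buf;
    codec->extradata_size = (int)size;
    return 0;
}
```

A codec that needs only part of the cookie would replace the memcpy with its own parsing, which is the codec-by-codec change mentioned above.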

> 
>>+
>>+void ff_caff_get_codec_id(CaffContext *ctx)
>>+{
>>+    ctx->codec_id = CODEC_ID_NONE;
>>+
>>+    /* codec selection for lpcm is chosen using format description and flags */
>>+    if(ctx->format_id == MKBETAG('l','p','c','m')) {
>>+        /* unpacked 24-bit is not currently supported */
>>+        if((ctx->bits_per_channel == 24) &&
>>+           (ctx->bytes_per_packet == (ctx->channels_per_frame * 4))) {
>>+            return;
>>+        }
>>+        /* floating-point lpcm is not currently supported */
>>+        if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_FLOAT) {
>>+            return;
>>+        }
>>+        if(ctx->bits_per_channel == 8) {
>>+            ctx->codec_id = CODEC_ID_PCM_S8;
>>+        } else if(ctx->bits_per_channel == 16) {
>>+            if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_LITTLEENDIAN) {
>>+                ctx->codec_id = CODEC_ID_PCM_S16LE;
>>+            } else {
>>+                ctx->codec_id = CODEC_ID_PCM_S16BE;
>>+            }
>>+        } else if(ctx->bits_per_channel == 24) {
>>+            if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_LITTLEENDIAN) {
>>+                ctx->codec_id = CODEC_ID_PCM_S24LE;
>>+            } else {
>>+                ctx->codec_id = CODEC_ID_PCM_S24BE;
>>+            }
>>+        } else if(ctx->bits_per_channel == 32) {
>>+            if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_LITTLEENDIAN) {
>>+                ctx->codec_id = CODEC_ID_PCM_S32LE;
>>+            } else {
>>+                ctx->codec_id = CODEC_ID_PCM_S32BE;
>>+            }
> 
> 
> That mess is starting to be duplicated in every demuxer. Maybe
> CODEC_ID_RAWAUDIO should be added, but how would the BE/LE case be handled?

Maybe CODEC_ID_PCM_BE / CODEC_ID_PCM_LE, and actually use the
SampleFormat in AVCodecContext, which would make it easy to implement
floating-point PCM decoding.
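
A rough sketch of how that split could look: the codec id carries only the byte order and a separate sample format carries the width and the integer/float distinction. All names below are hypothetical and exist only for this example; none of them are real libavcodec identifiers, and treating float as 32-bit only is an assumption.

```c
/* Hypothetical ids: one per byte order instead of one per width/order pair. */
enum SketchCodecId { SKETCH_NONE, SKETCH_PCM_BE, SKETCH_PCM_LE };

/* Hypothetical sample formats carrying the width and int/float choice. */
enum SketchSampleFmt {
    SKETCH_FMT_NONE,
    SKETCH_FMT_S8,
    SKETCH_FMT_S16,
    SKETCH_FMT_S24,
    SKETCH_FMT_S32,
    SKETCH_FMT_FLT
};

typedef struct {
    enum SketchCodecId   id;
    enum SketchSampleFmt fmt;
} SketchPcm;

/* Map a format description to (byte order, sample format).
   Returns id == SKETCH_NONE when the combination is not handled. */
static SketchPcm sketch_pick_pcm(int bits, int is_float, int is_little_endian)
{
    SketchPcm r = { SKETCH_NONE, SKETCH_FMT_NONE };

    if (is_float) {
        if (bits != 32)              /* assumption: only 32-bit float */
            return r;
        r.fmt = SKETCH_FMT_FLT;
    } else {
        switch (bits) {
        case 8:  r.fmt = SKETCH_FMT_S8;  break;
        case 16: r.fmt = SKETCH_FMT_S16; break;
        case 24: r.fmt = SKETCH_FMT_S24; break;
        case 32: r.fmt = SKETCH_FMT_S32; break;
        default: return r;
        }
    }
    r.id = is_little_endian ? SKETCH_PCM_LE : SKETCH_PCM_BE;
    return r;
}
```

With this shape, each demuxer would only need the single call instead of its own if/else ladder over bit depth and endianness.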

> 
>>[...]
>>+
>>+typedef struct {
>>+    int codec_id;                   ///< libavcodec CODEC_ID_*
>>+
>>+    double sample_rate;             ///< sampling frequency
>>+    int format_id;                  ///< 4-byte codec tag
> 
> 
> Why duplicate them from codec?

fixed.

> 
>>[...]
>>+    int channels_per_frame;         ///< number of channels
>>+    int bits_per_channel;           ///< sample bit depth, e.g. 16-bit audio
> 
> 
> same

fixed.

> 
>>[...]
>>+
>>+    if(ctx->codec_id == CODEC_ID_AAC) {
>>+        /* The magic cookie format for AAC is an mp4 esds atom.
>>+           The lavc aac decoder requires the data from the codec specific
>>+           description as extradata input. */
>>+
>>+        int tag, strt, len, cid, obj;
>>+
>>+        strt = url_ftell(pb);
>>+        tag = get_byte(pb);
>>+        len = ff_mov_mp4_read_descr_len(pb);
>>+        if(tag == 3)
>>+            url_fskip(pb, 3);
>>+        else
>>+            url_fskip(pb, 2);
>>+        if((url_ftell(pb) - strt) < size) {
>>+            tag = get_byte(pb);
>>+            len = ff_mov_mp4_read_descr_len(pb);
>>+            if(tag == 4) {
>>+                obj = get_byte(pb);
>>+                url_fskip(pb, 12);
>>+                cid = codec_get_id(ff_mp4_obj_type, obj);
>>+                if(cid != ctx->codec_id) {
>>+                    av_log(s, AV_LOG_WARNING, "MP4 object type mismatch\n");
>>+                }
>>+                while((url_ftell(pb) - strt) < size) {
>>+                    tag = get_byte(pb);
>>+                    len = ff_mov_mp4_read_descr_len(pb);
>>+                    if(tag == 5) {
>>+                        codec->extradata = av_mallocz(len + FF_INPUT_BUFFER_PADDING_SIZE);
>>+                        if(codec->extradata) {
>>+                            get_buffer(pb, codec->extradata, len);
>>+                            codec->extradata_size = len;
>>+                        }
>>+                    } else {
>>+                        url_fskip(pb, len);
>>+                    }
>>+                }
>>+            }
>>+        }
> 
> 
> That code pretty much looks like mov_read_esds, move it to isom.c, adapt
> it if needed.

fixed.  I moved mov_read_esds and the MOV_esds_t struct to isom.c/.h.
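
For reference, a sketch of the descriptor length decoding that the esds parsing relies on, as I understand the MPEG-4 descriptor encoding: each byte contributes 7 bits, the high bit flags a continuation, and at most 4 bytes are read. This version reads from a memory buffer; sketch_read_descr_len and its parameters are hypothetical and not the actual isom.c function.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a variable-length descriptor size starting at buf[*pos].
   Advances *pos past the bytes consumed and returns the decoded length. */
static int sketch_read_descr_len(const uint8_t *buf, size_t size, size_t *pos)
{
    int len = 0;
    int i;

    for (i = 0; i < 4 && *pos < size; i++) {
        uint8_t c = buf[(*pos)++];
        len = (len << 7) | (c & 0x7f);   /* low 7 bits are payload */
        if (!(c & 0x80))                 /* high bit clear: last byte */
            break;
    }
    return len;
}
```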

> 
>>[...]
>>+    ctx->num_packets = get_be64(pb);
>>+    ctx->packet_table = av_mallocz(ctx->num_packets *
>>+                                   sizeof(PacketTableEntry));
> 
> 
> check for overflow before mul.

fixed...I think.
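
One way such a check could be written, as a sketch rather than the code in the attached patch: compare the count against the largest value that can be multiplied by the entry size without wrapping. SketchEntry and sketch_alloc_table are hypothetical names, and the entry layout is assumed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical packet table entry; the real layout may differ. */
typedef struct {
    int64_t bpos;   /* byte position of the packet  */
    int64_t fpos;   /* frame position of the packet */
} SketchEntry;

/* Allocate a zeroed table for num_packets entries.
   Returns NULL if the count is not positive, if the multiplication
   would overflow size_t, or if the allocation fails. */
static SketchEntry *sketch_alloc_table(int64_t num_packets)
{
    if (num_packets <= 0)
        return NULL;

    /* check before multiplying: n * sizeof(entry) must fit in size_t */
    if ((uint64_t)num_packets > SIZE_MAX / sizeof(SketchEntry))
        return NULL;

    return calloc((size_t)num_packets, sizeof(SketchEntry));
}
```

Since the count comes from a 64-bit field in the file, a sanity limit tied to the file size might also be worth adding; that part is left out here.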

> 
>>[...]
>>+
>>+    memset(ctx, 0, sizeof(CaffContext));
> 
> 
> useless.

fixed.

> 
>>+    /* audio description chunk */
>>+    if(get_be32(pb) != MKBETAG('d','e','s','c')) {
>>+        av_log(s, AV_LOG_ERROR, "desc chunk not present\n");
>>+        return AVERROR_INVALIDDATA;
>>+    }
>>+    size = get_be64(pb);
>>+    ret = caff_read_desc(s, size);
>>+    if(ret)
>>+        return ret;
>>+    st = s->streams[0];
>>+    codec = st->codec;
>>+
>>+    /* parse each chunk */
>>+    found_data = 0;
>>+    while(!url_feof(pb)) {
>>+
>>+        /* stop at data chunk if seeking is not supported or
>>+           data chunk size is unknown */
>>+        if(found_data && (ctx->data_size < 0 || !pb->seek))
>>+            break;
> 
> 
> why not url_is_streamed() instead of pb->seek ?

fixed.

> 
>>[...]
>>+                break;
>>+
>>+            /* TODO: other documented chunks */
>>+            case MKBETAG('c','h','a','n'):
>>+            case MKBETAG('s','t','r','g'):
>>+            case MKBETAG('m','a','r','k'):
>>+            case MKBETAG('r','e','g','n'):
>>+            case MKBETAG('i','n','s','t'):
>>+            case MKBETAG('m','i','d','i'):
>>+            case MKBETAG('o','v','v','w'):
>>+            case MKBETAG('e','d','c','t'):
>>+            case MKBETAG('i','n','f','o'):
>>+            case MKBETAG('u','m','i','d'):
>>+            case MKBETAG('u','u','i','d'):
>>+            case MKBETAG('f','r','e','e'):
>>+                if(size < 0)
>>+                    return AVERROR_INVALIDDATA;
>>+                url_fskip(pb, size);
>>+                break;
> 
> 
> IMHO don't mention those useless atoms for now, and add them when they
> are parsed/needed.

ok.  I actually did it because CAFF defines a strict list of approved
chunks.  Any extensions are supposed to use the uuid chunk.  But for now
I changed it to give the same generic warning for unsupported or
unimplemented chunks.
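
A simplified sketch of that chunk loop, assuming the header layout used in the patch (a 4-byte big-endian tag followed by a signed 64-bit big-endian size). It walks a memory buffer instead of a ByteIOContext, and it counts the chunks it skips where the demuxer would log the generic warning. SKETCH_TAG, rd_be32, rd_be64 and sketch_walk_chunks are hypothetical helpers written for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TAG(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static int64_t rd_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return (int64_t)v;
}

/* Walk all chunks in buf. Returns the number of chunks that were
   skipped as unsupported, or -1 if a chunk size is invalid. */
static int sketch_walk_chunks(const uint8_t *buf, size_t size)
{
    size_t pos = 0;
    int skipped = 0;

    while (size - pos >= 12) {
        uint32_t tag   = rd_be32(buf + pos);
        int64_t  csize = rd_be64(buf + pos + 4);
        pos += 12;

        /* negative or out-of-range sizes cannot be skipped safely */
        if (csize < 0 || (uint64_t)csize > size - pos)
            return -1;

        switch (tag) {
        case SKETCH_TAG('k', 'u', 'k', 'i'):
            /* a real demuxer would parse the cookie here */
            break;
        default:
            /* a real demuxer would log one generic warning here */
            skipped++;
            break;
        }
        pos += (size_t)csize;
    }
    return skipped;
}
```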

> 
>>[...]
>>+
>>+    av_set_pts_info(st, 64, 1, st->codec->sample_rate);
>>+    st->start_time = 0;
>>+    st->duration = st->nb_frames;
> 
> 
> wrong, duration is in stream timebase according to demuxers and utils.c,
> btw doxy is also wrong in avformat.h.

But 1/sample_rate is the stream timebase in this case, so 1 frame is
equivalent to 1 timebase unit.
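
To make the arithmetic concrete, here is a small sketch that converts a frame count into units of an arbitrary time base. sketch_frames_to_tb is a hypothetical helper, not a libavformat function, and it ignores overflow and rounding for brevity.

```c
#include <stdint.h>

/* Convert a sample-frame count into units of a tb_num/tb_den time base.
   seconds = frames / sample_rate
   units   = seconds * tb_den / tb_num */
static int64_t sketch_frames_to_tb(int64_t frames, int sample_rate,
                                   int tb_num, int tb_den)
{
    return frames * tb_den / ((int64_t)sample_rate * tb_num);
}
```

With the 1/sample_rate time base set by av_set_pts_info above, 88200 frames at 44100 Hz stay 88200 units, so nb_frames and duration coincide. With a 1/1000 time base the same 88200 frames would be 2000 units.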

> 
>>+    /* position the stream at the start of data */
>>+    if(ctx->data_size >= 0 && pb->seek)
>>+        url_fseek(pb, ctx->data_start, SEEK_SET);
> 
> 
> same url_is_streamed.

fixed.

> 
>>+    return 0;
>>+}
>>+
>>+#define MAX_SIZE 4096
>>+
>>+static int caff_read_packet(AVFormatContext *s, AVPacket *pkt)
>>+{
>>+    ByteIOContext *pb = &s->pb;
>>+    CaffContext *ctx = (CaffContext *)s->priv_data;
> 
> 
> useless cast, and there are more earlier that I missed.

fixed

> 
>>[...]
>>+
>>+    pkt->size = res;
>>+    pkt->stream_index = 0;
> 
> 
> I missed it but caff only supports one track ?

yes, only one track.

> 
>>+    ctx->packet_cnt += packets_read;
>>+    ctx->frame_cnt += pkt_frames * packets_read;
>>+
>>+    return 0;
>>+}
>>+
>>+static int caff_read_close(AVFormatContext *s)
>>+{
>>+    if(s && s->priv_data) {
>>+        CaffContext *ctx = (CaffContext *)s->priv_data;
>>+        if(ctx->packet_table) {
>>+            av_freep(&ctx->packet_table);
>>+        }
>>+        av_freep(&ctx);
>>+    }
>>+    return 0;
>>+}
> 
> 
> useless check, and priv_data is freed in utils.c, when
> av_close_input_file is called.

fixed.

>>+
>>+static int caff_read_seek(AVFormatContext *s, int stream_index,
>>+                          int64_t timestamp, int flags)
>>+{
>>+    CaffContext *ctx = s->priv_data;
>>+    int64_t bpos;
>>+
>>+    /* FIXME: simplify */
>>+    if(ctx->frames_per_packet > 0 && ctx->bytes_per_packet > 0) {
>>+        ctx->frame_cnt = FFMAX(ctx->frame_cnt+timestamp, 0);
>>+        bpos = ctx->bytes_per_packet * ctx->frame_cnt / ctx->frames_per_packet;
>>+        bpos = FFMIN(bpos, ctx->data_size);
>>+        ctx->packet_cnt = bpos / ctx->bytes_per_packet;
>>+        ctx->frame_cnt = ctx->frames_per_packet * ctx->packet_cnt;
>>+        url_fseek(&s->pb, bpos + ctx->data_start, SEEK_SET);
>>+    } else if(ctx->has_packet_table) {
>>+        ctx->frame_cnt = FFMAX(ctx->frame_cnt+timestamp, 0);
>>+        ctx->packet_cnt = ctx->frame_cnt / ctx->frames_per_packet;
>>+        ctx->frame_cnt = ctx->packet_cnt * ctx->frames_per_packet;
>>+        ctx->packet_cnt = FFMIN(ctx->packet_cnt, ctx->num_packets-1);
>>+        bpos = ctx->packet_table[ctx->packet_cnt].bpos;
>>+        ctx->frame_cnt = ctx->packet_table[ctx->packet_cnt].fpos;
>>+        bpos = FFMIN(bpos, ctx->data_size);
>>+        url_fseek(&s->pb, bpos + ctx->data_start, SEEK_SET);
>>+        return 0;
>>+    } else {
> 
> 
> I need more time to review that.

I changed this function a bit and made it more readable as well.
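
For the constant-size packet case, the seek boils down to something like the sketch below. This is not the code in the attached patch: it assumes an absolute target frame (the quoted version adds the timestamp to the current frame count), it assumes frames_per_packet and bytes_per_packet are both greater than 0, and SketchSeek and sketch_seek_cbr are hypothetical names.

```c
#include <stdint.h>

/* Result of a seek: packet index, first frame of that packet, and the
   byte offset of the packet relative to the start of the data chunk. */
typedef struct {
    int64_t packet;
    int64_t frame;
    int64_t byte_offset;
} SketchSeek;

/* Round target_frame down to a packet boundary and clamp it to the
   packets that actually fit inside data_size bytes. */
static SketchSeek sketch_seek_cbr(int64_t target_frame, int frames_per_packet,
                                  int bytes_per_packet, int64_t data_size)
{
    SketchSeek r;
    int64_t last_packet = data_size / bytes_per_packet - 1;

    if (target_frame < 0)
        target_frame = 0;

    r.packet = target_frame / frames_per_packet;
    if (r.packet > last_packet)
        r.packet = last_packet;
    if (r.packet < 0)                /* data chunk smaller than one packet */
        r.packet = 0;

    r.frame       = r.packet * frames_per_packet;
    r.byte_offset = r.packet * bytes_per_packet;
    return r;
}
```

The caller would then seek to data_start + byte_offset and store packet and frame as the new counters.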

Thanks,
Justin

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: caff_demuxer_v2.diff
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070404/252aed20/attachment.txt>