[Ffmpeg-devel] [PATCH] apple caff demuxer

Wed Apr 4 17:18:02 CEST 2007

Baptiste Coudurier wrote:
> Hi
> 
> Justin Ruggles wrote:
> 
>>>>[...]
>>>>+        if(ctx->bits_per_channel == 8) {
>>>>+            ctx->codec_id = CODEC_ID_PCM_S8;
>>>>+        } else if(ctx->bits_per_channel == 16) {
>>>>+            if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_LITTLEENDIAN) {
>>>>+                ctx->codec_id = CODEC_ID_PCM_S16LE;
>>>>+            } else {
>>>>+                ctx->codec_id = CODEC_ID_PCM_S16BE;
>>>>+            }
>>>>+        } else if(ctx->bits_per_channel == 24) {
>>>>+            if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_LITTLEENDIAN) {
>>>>+                ctx->codec_id = CODEC_ID_PCM_S24LE;
>>>>+            } else {
>>>>+                ctx->codec_id = CODEC_ID_PCM_S24BE;
>>>>+            }
>>>>+        } else if(ctx->bits_per_channel == 32) {
>>>>+            if(ctx->format_flags & CAFF_LPCM_FLAGS_IS_LITTLEENDIAN) {
>>>>+                ctx->codec_id = CODEC_ID_PCM_S32LE;
>>>>+            } else {
>>>>+                ctx->codec_id = CODEC_ID_PCM_S32BE;
>>>>+            }
>>>
>>>that mess begins to be duplicated in every demuxer, maybe
>>>CODEC_ID_RAWAUDIO should be added, how to handle the BE/LE case ?
>>
>>maybe CODEC_ID_PCM_BE / CODEC_ID_PCM_LE.  and actually use the
>>SampleFormat in AVCodecContext, which would make it easy to implement
>>floating-point PCM decoding.
> 
> 
> CODEC_ID_LPCM_BE/LE, ok with that, any other opinion ?
> 
> 
>>>>[...]
>>>>+
>>>>+    av_set_pts_info(st, 64, 1, st->codec->sample_rate);
>>>>+    st->start_time = 0;
>>>>+    st->duration = st->nb_frames;
>>>
>>>wrong, duration is in stream timebase according to demuxers and utils.c,
>>>btw doxy is also wrong in avformat.h.
>>
>>But 1/sample_rate is the stream timebase in this case, so 1 frame is
>>equivalent to 1 timebase unit.
> 
> 
> no, 1 frame is long as how many samples are in your frame, being said,
> one ac3 frame contains 1536 samples at 48khz, duration of one frame is
> 1536/48000, and assuming you have 50 frames, you have 76800 samples, and
> duration is 76800 / 48000, not 50 / 48000.

I think we're having a terminology mixup here.  The CAFF specification
uses the term "frame" to refer to 1 sample over all channels.  WAVE
calls this a block.  The frame you're referring to is at the codec
level, and is called a "packet" by CAFF.  Multiple CAFF packets can be
put in one AVPacket.

> 
>>[...]
>>Index: libavformat/Makefile
>>===================================================================
>>--- libavformat/Makefile	(revision 8622)
>>+++ libavformat/Makefile	(working copy)
>>@@ -28,6 +28,7 @@
>> OBJS-$(CONFIG_AVI_MUXER)                 += avienc.o riff.o
>> OBJS-$(CONFIG_AVISYNTH)                  += avisynth.o
>> OBJS-$(CONFIG_AVS_DEMUXER)               += avs.o vocdec.o voc.o riff.o
>>+OBJS-$(CONFIG_CAFF_DEMUXER)              += caffdec.o caff.o riff.o
> 
> 
> need isom.o also.

I keep forgetting that one...

> 
>>[...]
>>-    }
>>-    return 0;
>>+    return ff_mov_read_esds(c->fc, &sc->esds);
>> }
> 
> 
> ff_mov_read_esds(c->fc, &sc->esds);
> return 0;
> 
> maybe ff_mov_read_esds should only take ByteIOContext.
> 
> 
>>[...]
>>
>>+static int caff_read_kuki(AVFormatContext *s, int64_t size)
>>+{
>>+    ByteIOContext *pb = &s->pb;
>>+    CaffContext *ctx = s->priv_data;
>>+
>>+    if(size < 0)
>>+        return -1;
>>+
>>+    if(ctx->avctx->codec_id == CODEC_ID_AAC) {
>>+        /* The magic cookie format for AAC is an mp4 esds atom.
>>+           The lavc aac decoder requires the data from the codec specific
>>+           description as extradata input. */
>>+        MOV_esds_t esds;
>>+        int strt, skip;
>>+
>>+        strt = url_ftell(pb);
>>+        ff_mov_read_esds(s, &esds);
> 
> 
> IMHO size should be passed to read_esds and it will skip data.

Does this look ok?

ff_mov_read_esds(ByteIOContext *pb, int size, MOV_esds_t *esds)

Then esds->decoder_cfg and esds->decoder_cfg_len would be used instead
of codec extradata and extradata_size.

> 
>>+        skip = size - (url_ftell(pb) - strt);
>>+        if(skip < 0 || !ctx->avctx->extradata ||
>>+                ctx->avctx->codec_id != CODEC_ID_AAC) {
>>+            av_log(s, AV_LOG_ERROR, "invalid AAC magic cookie\n");
>>+            return AVERROR_INVALIDDATA;
>>+        }
>>+        url_fskip(pb, skip);
>>+    } else {
>>+        /* read all kuki chunk data into extradata */
>>+        ctx->avctx->extradata = av_mallocz(size + FF_INPUT_BUFFER_PADDING_SIZE);
>>+        if(ctx->avctx->extradata) {
>>+            get_buffer(pb, ctx->avctx->extradata, size);
>>+            ctx->avctx->extradata_size = size;
> 
> 
> add a check like in mov_read_extradata to avoid overflow.
> 
> if((uint64_t)size > (1<<30))
>    return -1;

Will do.

> 
>>[...]
>>+        if(ctx->bytes_per_packet == 0) {
>>+            ctx->packet_table[i].bytes = ff_mov_mp4_read_descr_len(pb);
>>+        } else {
>>+            ctx->packet_table[i].bytes = ctx->bytes_per_packet;
>>+        }
>>+        if(ctx->frames_per_packet == 0) {
>>+            ctx->packet_table[i].frames = ff_mov_mp4_read_descr_len(pb);
>>+        } else {
>>+            ctx->packet_table[i].frames = ctx->frames_per_packet;
>>+        }
> 
> 
> maybe ternary or without { would be easier to read but that's up to you.

I actually did try this several ways, including the one Michael
suggested.  I might go back to doing it that way.

> 
>>[...]
>>+
>>+            default:
>>+                av_log(s, AV_LOG_WARNING, "skipping CAFF chunk: %08X\n", tag);
> 
> 
> DEBUG no ?

Well, for invalid chunks it should be a warning, but for unimplemented
chunks it could go either way.  I figured that since skipping a valid
chunk may very well leave out an intended functionality (channel layout
for example), it wouldn't hurt to give a warning here as well.  But I
have no strong feelings on this.  If you want DEBUG I'll switch it.

> 
>>[...]
>>+static int caff_read_seek(AVFormatContext *s, int stream_index,
>>+                          int64_t timestamp, int flags)
>>+{
>>+    CaffContext *ctx = s->priv_data;
>>+    int64_t fpos=0, bpos=0;
>>+
>>+    fpos = FFMAX(0, ctx->frame_cnt + timestamp);
> 
> 
> timestamp will be in stream timebase. Are you sure that this works ?

See above discussion regarding timebase and frames vs. samples.  I did
lots of seeking tests using ffplay.

> 
>>+    if(ctx->frames_per_packet > 0 && ctx->bytes_per_packet > 0) {
>>+        /* calculate new byte position based on target frame position */
>>+        bpos = ctx->bytes_per_packet * fpos / ctx->frames_per_packet;
>>+        bpos = FFMIN(bpos, ctx->data_size);
>>+        ctx->packet_cnt = bpos / ctx->bytes_per_packet;
>>+        ctx->frame_cnt = ctx->frames_per_packet * ctx->packet_cnt;
>>+    } else if(ctx->has_packet_table) {
>>+        /* scan through packet table for target frame position */
>>+        int64_t p = ctx->packet_cnt;
>>+        if(timestamp < 0) {
>>+            while(ctx->packet_table[p].fpos > fpos && p > 0)
>>+                p--;
> 
> 
> Negative timestamp ?

Yes.  Try adding a print statement and do some seeking in ffplay. :)

> 
>>+        } else {
>>+            while(ctx->packet_table[p].fpos < fpos && p < ctx->num_packets)
>>+                p++;
>>+            p = FFMAX(p-1, 0);
>>+        }
>>+        ctx->packet_cnt = p;
>>+        ctx->frame_cnt = ctx->packet_table[p].fpos;
>>+        bpos = ctx->packet_table[p].bpos;
> 
> 
> maybe it is simpler to use AVIndex when parsing header ?

I did consider that.  But it's not necessarily simpler.  I'll give it a
try though and let you decide which is better.

> libavformat version bump is missing also.

ok. Just so I'm clear about this, what are the criteria for a
libavformat version bump?

Thanks,
Justin