[FFmpeg-devel] [PATCH] speex in ogg muxer

Justin Ruggles justin.ruggles
Sat Sep 5 17:50:47 CEST 2009


Justin Ruggles wrote:

> Justin Ruggles wrote:
> 
>> Baptiste Coudurier wrote:
>>> Justin Ruggles wrote:
>>>> Baptiste Coudurier wrote:
>>>>> Hi Justin,
>>>>>
>>>>> Justin Ruggles wrote:
>>>>>> Hi,
>>>>>>
>>>>>> This patch adds speex support to the ogg muxer.  It basically does the
>>>>>> same thing as Ogg/FLAC, in that the 1st packet is a global header from
>>>>>> extradata and the 2nd packet is vorbiscomment metadata.
>>>>>>
>>>>>> This seems to work just fine for speex-to-speex stream copy, but
>>>>>> probably would not work for flv-to-speex because flv doesn't seem to have any
>>>>>> speex extradata from what I can tell.  I guess a header could be
>>>>>> constructed, but that would be a separate patch to the flv demuxer.
>>>>>>
>>>>>> This patch is a precursor to libspeex encoding support, which I'll be
>>>>>> sending shortly.
>>>>>>
>>>>>> -Justin
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> Index: libavformat/oggenc.c
>>>>>> ===================================================================
>>>>>> --- libavformat/oggenc.c	(revision 19244)
>>>>>> +++ libavformat/oggenc.c	(working copy)
>>>>>> @@ -104,17 +125,39 @@
>>>>>>      bytestream_put_byte(&p, 0x00); // streaminfo
>>>>>>      bytestream_put_be24(&p, 34);
>>>>>>      bytestream_put_buffer(&p, streaminfo, FLAC_STREAMINFO_SIZE);
>>>>>> -    oggstream->header_len[1] = 1+3+4+strlen(vendor)+4;
>>>>>> -    oggstream->header[1] = av_mallocz(oggstream->header_len[1]);
>>>>>> -    p = oggstream->header[1];
>>>>>> +    p = ogg_write_vorbiscomment(4, bitexact, &oggstream->header_len[1]);
>>>>>> +    if (!p)
>>>>>> +        return -1;
>>>>> AVERROR(ENOMEM)
>>>> fixed.
>>>>
>>>>>> @@ -144,6 +188,12 @@
>>>>>>                  av_log(s, AV_LOG_ERROR, "Extradata corrupted\n");
>>>>>>                  av_freep(&st->priv_data);
>>>>>>              }
>>>>>> +        } else if (st->codec->codec_id == CODEC_ID_SPEEX) {
>>>>>> +            if (ogg_build_speex_headers(st->codec, oggstream,
>>>>>> +                                        st->codec->flags & CODEC_FLAG_BITEXACT) < 0) {
>>>>>> +                av_log(s, AV_LOG_ERROR, "error writing Speex headers\n");
>>>>>> +                av_freep(&st->priv_data);
>>>>>> +            }
>>>>> return error here with the return code of the func :>
>>>>> Yes, it seems flac misses it too, this needs a fix.
>>>>>
>>>>> patch fine otherwise, maybe a micro bump for avformat would be nice.
>>>> fixed. new patch attached. the new patch also differs in that it
>>>> overrides the extra_headers field in the Speex header to be 0 since only
>>>> the 2 required headers are written.
>>>>
>>> patch ok if it works :>
> 
> Ok, back to square one.
> 
>> Hmm... I've done several more tests and it does not quite work as-is for
>> all samples.  Here is what I have run into.  The tests so far are for
>> ogg-to-ogg stream copy.
>>
>> - When the source has more than 1 frame per packet, the resulting copy
>> plays fine with ffmpeg/ffplay but is quick and choppy with speexdec.  I
>> was able to fix this by modifying the ogg/speex demuxer to set
>> avctx->frame_size to the number of samples in a packet instead of in a
>> frame.  I also had to update the libspeex decoder accordingly.  Maybe
>> this is the wrong way to go about it though.  I'm guessing it is a
>> timestamp/granulepos issue, but I don't know enough about Ogg to tell
>> more than that.
> 
> This is now corrected after much discussion. :)
> 
>> - Even with the fix and even with 1 frame per packet, 2 short samples
>> I've tested so far have a single soft pop when the stream-copied file is
>> decoded with speexdec, but it's fine with ffmpeg/ffplay.
>>
>> Maybe someone else might have an idea of what could be going wrong?
> 
> Now I think I know what is going wrong, and I don't think there is
> anything we can do about it.  speexenc does some weird things with granule
> positions.  It starts out for a long time with granulepos=0 even though
> it is encoding audio, then when it starts writing granule positions it
> is not always in sync with the start of the stream.  Below is a little
> snippet from a comparison of an original spx file to a copied spx file.
> Each packet should be 320 samples.
> 
> [...]
> 
> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 57
> +00:00:01.120: serialno 0000000000, granulepos 17920, packetno 57
> 
> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 58
> +00:00:01.140: serialno 0000000000, granulepos 18240, packetno 58
> 
> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 59
> +00:00:01.160: serialno 0000000000, granulepos 18560, packetno 59
> 
> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
> 
> -00:00:01.191: serialno 1626088319, calc. gpos 19057, packetno 61
> +00:00:01.191: serialno 0000000000, granulepos 19057, packetno 61
> 
> -00:00:01.211: serialno 1626088319, calc. gpos 19377, packetno 62
> +00:00:01.211: serialno 0000000000, granulepos 19377, packetno 62
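For reference, the numbers on the "+" lines above can be reproduced with a small sketch. It assumes that packetno 0 and 1 are the two header packets (as the patch describes), that every audio packet is a full 320 samples, and that the sample rate is 16000 Hz, which is inferred from the listing (17920 samples at 00:00:01.120) and is not stated anywhere in the thread. The function names are hypothetical and are not part of libavformat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: granulepos after audio packet `packetno` when every
 * packet carries `samples_per_packet` samples.  Assumes packetno 0 and 1
 * are headers, so packetno 2 is the first audio packet. */
static int64_t nominal_granulepos(int packetno, int samples_per_packet)
{
    assert(packetno >= 2 && samples_per_packet > 0);
    return (int64_t)(packetno - 1) * samples_per_packet;
}

/* Hypothetical helper: granulepos (a sample count) to milliseconds,
 * truncating toward zero. */
static int64_t granulepos_to_ms(int64_t granulepos, int sample_rate)
{
    assert(granulepos >= 0 && sample_rate > 0);
    return granulepos * 1000 / sample_rate;
}
```

With these assumptions, packetno 60 comes out at 18880 (1180 ms) in the copy, while the original file reports 18737 (1171 ms) for the same packet.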

So... I figured it out, but you may not want to know the answer. ;)

The first packet is supposed to be interpreted as containing fewer
samples than a full frame.  To find out how many fewer, calculate what
the granulepos of the first page would normally be (number of packets on
the page times the frame size), then take the difference between that
and what it really is.  That difference is the delay.
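A minimal sketch of that calculation, under some assumptions: the helper name is made up (it is not an existing libavformat or libspeex function), and the example numbers assume that packetno 60 in the listing is the last packet of the first audio page and that packetno 0 and 1 are headers, which gives 59 audio packets of 320 samples each.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of samples by which the first packet is
 * shorter than a full packet.
 *
 * first_granulepos   granulepos actually stored on the first audio page
 * packets_on_page    audio packets completed on that page
 * samples_per_packet nominal samples per packet (frame size times frames
 *                    per packet)
 */
static int64_t first_page_delay(int64_t first_granulepos,
                                int packets_on_page,
                                int samples_per_packet)
{
    assert(packets_on_page > 0 && samples_per_packet > 0);

    /* What the granulepos would be if every packet were full length. */
    int64_t nominal = (int64_t)packets_on_page * samples_per_packet;

    /* The shortfall is the delay. */
    return nominal - first_granulepos;
}
```

For the sample above this gives 59 * 320 - 18737 = 143 samples, so under these assumptions the first packet would count as 320 - 143 = 177 samples instead of 320.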