[FFmpeg-devel] [PATCH] speex in ogg muxer

Justin Ruggles justin.ruggles
Sat Sep 12 17:56:27 CEST 2009


Justin Ruggles wrote:

> Justin Ruggles wrote:
> 
>> Justin Ruggles wrote:
>>
>>> Justin Ruggles wrote:
>>>
>>>> Justin Ruggles wrote:
>>>>
>>>>> Justin Ruggles wrote:
>>>>>
>>>>>> Baptiste Coudurier wrote:
>>>>>>> Justin Ruggles wrote:
>>>>>>>> Baptiste Coudurier wrote:
>>>>>>>>> Hi Justin,
>>>>>>>>>
>>>>>>>>> Justin Ruggles wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This patch adds speex support to the ogg muxer.  It basically does the
>>>>>>>>>> same thing as Ogg/FLAC, in that the 1st packet is a global header from
>>>>>>>>>> extradata and the 2nd packet is vorbiscomment metadata.
>>>>>>>>>>
>>>>>>>>>> This seems to work just fine for speex-to-speex stream copy, but
>>>>>>>>>> probably would not work for flv-to-speex because flv doesn't seem to have any
>>>>>>>>>> speex extradata from what I can tell.  I guess a header could be
>>>>>>>>>> constructed, but that would be a separate patch to the flv demuxer.
>>>>>>>>>>
>>>>>>>>>> This patch is a precursor to libspeex encoding support, which I'll be
>>>>>>>>>> sending shortly.
>>>>>>>>>>
>>>>>>>>>> -Justin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> Index: libavformat/oggenc.c
>>>>>>>>>> ===================================================================
>>>>>>>>>> --- libavformat/oggenc.c	(revision 19244)
>>>>>>>>>> +++ libavformat/oggenc.c	(working copy)
>>>>>>>>>> @@ -104,17 +125,39 @@
>>>>>>>>>>      bytestream_put_byte(&p, 0x00); // streaminfo
>>>>>>>>>>      bytestream_put_be24(&p, 34);
>>>>>>>>>>      bytestream_put_buffer(&p, streaminfo, FLAC_STREAMINFO_SIZE);
>>>>>>>>>> -    oggstream->header_len[1] = 1+3+4+strlen(vendor)+4;
>>>>>>>>>> -    oggstream->header[1] = av_mallocz(oggstream->header_len[1]);
>>>>>>>>>> -    p = oggstream->header[1];
>>>>>>>>>> +    p = ogg_write_vorbiscomment(4, bitexact, &oggstream->header_len[1]);
>>>>>>>>>> +    if (!p)
>>>>>>>>>> +        return -1;
>>>>>>>>> AVERROR(ENOMEM)
>>>>>>>> fixed.
>>>>>>>>
>>>>>>>>>> @@ -144,6 +188,12 @@
>>>>>>>>>>                  av_log(s, AV_LOG_ERROR, "Extradata corrupted\n");
>>>>>>>>>>                  av_freep(&st->priv_data);
>>>>>>>>>>              }
>>>>>>>>>> +        } else if (st->codec->codec_id == CODEC_ID_SPEEX) {
>>>>>>>>>> +            if (ogg_build_speex_headers(st->codec, oggstream,
>>>>>>>>>> +                                        st->codec->flags & CODEC_FLAG_BITEXACT) < 0) {
>>>>>>>>>> +                av_log(s, AV_LOG_ERROR, "error writing Speex headers\n");
>>>>>>>>>> +                av_freep(&st->priv_data);
>>>>>>>>>> +            }
>>>>>>>>> return error here with the return code of the func :>
>>>>>>>>> Yes, it seems flac is missing it too, this needs a fix.
>>>>>>>>>
>>>>>>>>> patch fine otherwise, maybe a micro bump for avformat would be nice.
>>>>>>>> fixed. new patch attached. the new patch also differs in that it
>>>>>>>> overrides the extra_headers field in the Speex header to be 0 since only
>>>>>>>> the 2 required headers are written.
>>>>>>>>
>>>>>>> patch ok if it works :>
>>>>> Ok, back to square one.
>>>>>
>>>>>> Hmm... I've done several more tests and it does not quite work as-is for
>>>>>> all samples.  Here is what I have run into.  The tests so far are for
>>>>>> ogg-to-ogg stream copy.
>>>>>>
>>>>>> - When the source has more than 1 frame per packet, the resulting copy
>>>>>> plays fine with ffmpeg/ffplay but is quick and choppy with speexdec.  I
>>>>>> was able to fix this by modifying the ogg/speex demuxer to set
>>>>>> avctx->frame_size to the number of samples in a packet instead of in a
>>>>>> frame.  I also had to update the libspeex decoder accordingly.  Maybe
>>>>>> this is the wrong way to go about it though.  I'm guessing it is a
>>>>>> timestamp/granulepos issue, but I don't know enough about Ogg to tell
>>>>>> more than that.
>>>>> This is now corrected after much discussion. :)
>>>>>
>>>>>> - Even with the fix and even with 1 frame per packet, 2 short samples
>>>>>> I've tested so far have a single soft pop when the stream-copied file is
>>>>>> decoded with speexdec, but it's fine with ffmpeg/ffplay.
>>>>>>
>>>>>> Maybe someone else might have an idea of what could be going wrong?
>>>>> Now I think I know what is going wrong, and there is nothing we can do
>>>>> about it I think.  speexenc does some weird things with granule
>>>>> positions.  It starts out for a long time with granulepos=0 even though
>>>>> it is encoding audio, then when it starts writing granule positions it
>>>>> is not always in sync with the start of the stream.  Below is a little
>>>>> snippet from a comparison of an original spx file to a copied spx file.
>>>>>  Each packet should be 320 samples.
>>>>>
>>>>> [...]
>>>>>
>>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 57
>>>>> +00:00:01.120: serialno 0000000000, granulepos 17920, packetno 57
>>>>>
>>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 58
>>>>> +00:00:01.140: serialno 0000000000, granulepos 18240, packetno 58
>>>>>
>>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 59
>>>>> +00:00:01.160: serialno 0000000000, granulepos 18560, packetno 59
>>>>>
>>>>> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
>>>>> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
>>>>>
>>>>> -00:00:01.191: serialno 1626088319, calc. gpos 19057, packetno 61
>>>>> +00:00:01.191: serialno 0000000000, granulepos 19057, packetno 61
>>>>>
>>>>> -00:00:01.211: serialno 1626088319, calc. gpos 19377, packetno 62
>>>>> +00:00:01.211: serialno 0000000000, granulepos 19377, packetno 62
>>>> So... I figured it out, but you may not want to know the answer. ;)
>>>>
>>>> The granulepos of the first packet is supposed to be interpreted as
>>>> smaller than the full frame size by calculating what the granulepos of
>>>> the first page would normally be, then subtracting it from what it
>>>> really is to get the delay.
>>>>
>>>> From above, this is the last packet in the first page. There are 59
>>>> packets per page in this stream (the first 2 packets are headers, hence
>>>> the packetno of 60).
>>>>> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
>>>>> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
>>>> speexdec interprets the first packet as having a delay of
>>>> 18880-18737=143 samples.  So the first packet should be 320-143=177
>>>> samples long, and the decoder discards the first 143 samples of the
>>>> first frame.
>>>>
>>>> None of this is documented except for in the speexenc and speexdec
>>>> source code.  From analyzing a Speex-in-FLV sample, it appears that the
>>>> way Adobe handles this in Flash Media Server is to do like our ogg
>>>> demuxer does and interpret the first page as if each frame is 320
>>>> samples, then resync timestamps with the source after the first page,
>>>> causing a skip in timestamps after the first page instead of at the
>>>> beginning of the stream.
>>>>
>>>> I'm still not sure what to do about this though...
>>> This patch makes it so that all the pts and durations are correct for
>>> Ogg/Speex.  It basically just changes the durations of the first and
>>> last packets.
>> nevermind. this doesn't quite work. i'm still working on it. damn ogg
>> and its craziness!
> 
> Ok, now this patch should work correctly.

ping.
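
For anyone following the granulepos arithmetic quoted above, here is a
minimal sketch of the first-page delay calculation as described in the
thread. The function names and parameters are hypothetical, made up for
illustration; they are not taken from the patch, from FFmpeg, or from
speexdec. It assumes every packet in the first page nominally holds the
same number of samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of samples the decoder should discard at the
 * start of the stream. It is the granulepos the first page would normally
 * have (packets * samples per packet) minus the granulepos actually stored. */
static int64_t speex_first_page_delay(int64_t page_granulepos,
                                      int packets_in_page,
                                      int samples_per_packet)
{
    assert(packets_in_page > 0 && samples_per_packet > 0);
    int64_t expected = (int64_t)packets_in_page * samples_per_packet;
    return expected - page_granulepos;
}

/* Hypothetical helper: effective length of the first packet once the
 * delay has been subtracted from its nominal size. */
static int64_t speex_first_packet_samples(int64_t page_granulepos,
                                          int packets_in_page,
                                          int samples_per_packet)
{
    return samples_per_packet -
           speex_first_page_delay(page_granulepos, packets_in_page,
                                  samples_per_packet);
}
```

With the numbers from the comparison above (granulepos 18737, 59 packets
in the first page, 320 samples per packet), the first helper gives a
delay of 143 samples and the second gives a first packet of 177 samples.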


