[FFmpeg-soc] [PATCH] xiph packetizer
Martin Storsjö
martin at martin.st
Wed Aug 4 13:55:54 CEST 2010
On Tue, 3 Aug 2010, Martin Storsjö wrote:
> On Mon, 2 Aug 2010, Josh Allmann wrote:
>
> > I don't know exactly what I did, but this round adds in support for
> > multiple Xiph frames per packet, and the tcp issue is gone. Pebkac
> > most likely.
>
> Hmm, weird.
>
> Now I'm getting some issues with vorbis audio with this patch. I'll debug
> it and see what's going wrong in a while - it may just as well be
> something in my setup, but I suspect something with multiple frames per
> packet.
Yes, it was issues with multiple frames per packet - in the depacketizer
actually. How did you test this - it didn't seem to work at all for me?
After the two patches I sent to -devel, it seems to work just fine,
though.
Except that, it seems to work in general. Relatively thorough review
below:
> @@ -135,6 +137,14 @@ static int rtp_write_header(AVFormatContext *s1)
> s->nal_length_size = (st->codec->extradata[4] & 0x03) + 1;
> }
> break;
> + case CODEC_ID_VORBIS:
> + case CODEC_ID_THEORA:
> + if (!s->max_frames_per_packet) s->max_frames_per_packet = 15;
> + s->max_frames_per_packet = av_clip(s->max_frames_per_packet, 1, 15);
> + s->max_payload_size -= 6; // ident+frag+tdt/vdt+pkt_num+pkt_length
> + s->num_frames = 0;
> + if (st->codec->codec_id == CODEC_ID_VORBIS) goto defaultcase;
> + break;
I think you could do this for both codecs - the default case sets buf_ptr,
which needs to be set for theora too.
> diff --git a/libavformat/rtpenc_xiph.c b/libavformat/rtpenc_xiph.c
> new file mode 100644
> index 0000000..989354f
> --- /dev/null
> +++ b/libavformat/rtpenc_xiph.c
> @@ -0,0 +1,117 @@
> +/*
> + * RTP packetization for Xiph audio and video
> + * Copyright (c) 2010 Josh Allmann
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "avformat.h"
> +#include "rtpenc.h"
> +
> +/**
> + * Packetize Xiph frames into RTP according to
> + * RFC 5215 (Vorbis) and the Theora RFC draft.
> + * (http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt)
> + */
> +void ff_rtp_send_xiph(AVFormatContext *s1, const uint8_t *buff, int size)
> +{
> + RTPMuxContext *s = s1->priv_data;
> + int max_pkt_size, xdt, frag;
> + uint8_t *q;
> +
> + max_pkt_size = s->max_payload_size;
> +
> + /* set xiph data type */
> + switch (*buff) {
> + case 0x01: // vorbis id
> + case 0x05: // vorbis setup
> + case 0x80: // theora header
> + case 0x82: // theora tables
> + xdt = 1; // packed config payload
> + break;
> + case 0x03: // vorbis comments
> + case 0x81: // theora comments
> + xdt = 2; // comment payload
> + break;
> + default:
> + xdt = 0; // raw data payload
> + }
The indentation of the case statements usually is at the same level
as the switch statement itself in ffmpeg. Also, a break at the end of
the default case wouldn't hurt I think.
> +
> + /* Set ident. Must match the one in sdp.c
> + * Probably need a non-fixed way of generating
> + * this, but it has to be done in SDP and passed in from there. */
> + q = s->buf;
> + *q++ = 0xfe;
> + *q++ = 0xcd;
> + *q++ = 0xba;
> +
> + /* set fragment
> + * 0 - whole frame (possibly multiple frames)
> + * 1 - first fragment
> + * 2 - fragment continuation
> + * 3 - last fragmement */
> + frag = size <= max_pkt_size ? 0 : 1;
> +
> + if (s->num_frames && (xdt || frag)) {
> + /* immediately send any buffered frames
> + * if buffer is not raw data, or if current frame is fragmented. */
> + ff_rtp_send_data(s1, s->buf, s->buf_ptr - s->buf, 0);
> + }
You could move this if clause to below the if (!frag && !xdt), then this
one would be only if (s->num_frames).
> +
> + if (!frag && !xdt) { // do we have a whole frame of raw data?
> + int remaining = max_pkt_size - ((int)(s->buf_ptr - s->buf) + size);
This calculation is slightly off - max_pkt_size already has the 6 bytes
header subtracted, but those 6 bytes still are included in s->buf_ptr - s->buf.
Also, it feels a bit convoluted.
If you happen to have a packet of e.g. 1453, this forces the lines below
to send the currently buffered data, even if there isn't any frame buffered
(s->buf_ptr - s->buf is 3 bytes, and only contains the ident header).
> + if (remaining < 0 || s->num_frames >= s->max_frames_per_packet) {
> + /* send previous packets now; no room for new data */
> + ff_rtp_send_data(s1, s->buf, s->buf_ptr - s->buf, 0);
> + s->num_frames = 0;
> + }
> +
> + /* buffer current frame to send later */
> + if (0 == s->num_frames) s->timestamp = s->cur_timestamp;
> + s->num_frames++;
> + *q++ = s->num_frames; // set packet header
Some comment mentioning that frag and xdt should be or'ed in here, too,
but omitted since they're both 0, wouuld help the readability.
> + if (s->num_frames > 1) q = s->buf_ptr; // jump ahead if needed
> + *q++ = (size >> 8) & 0xff;
> + *q++ = size & 0xff;
> + memmove(q, buff, size);
Why memmove? I don't think there's any theoretical case where the buffers
could overlap?
> + q += size;
> + s->buf_ptr = q;
> + return;
> + }
> +
> + s->timestamp = s->cur_timestamp;
> + s->num_frames = 0;
> + s->buf_ptr = q;
> + while (size > 0) {
> + int len = (!frag || frag == 3) ? size : max_pkt_size;
> + q = s->buf_ptr;
> +
> + /* set packet headers */
> + *q++ = (frag << 6) | (xdt << 4);
> + *q++ = (len >> 8) & 0xff;
> + *q++ = len & 0xff;
> + /* set packet body */
> + memmove(q, buff, len);
Same here
> + q += len;
> + buff += len;
> + size -= len;
> +
> + ff_rtp_send_data(s1, s->buf, q - s->buf, 0);
> +
> + frag = size <= max_pkt_size ? 3 : 2;
> + }
> +}
> diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c
> index 689ad29..1be3cd0 100644
> --- a/libavformat/rtsp.c
> +++ b/libavformat/rtsp.c
> @@ -52,7 +52,7 @@ int rtsp_default_protocols = (1 << RTSP_LOWER_TRANSPORT_UDP);
> #define SELECT_TIMEOUT_MS 100
> #define READ_PACKET_TIMEOUT_S 10
> #define MAX_TIMEOUTS READ_PACKET_TIMEOUT_S * 1000 / SELECT_TIMEOUT_MS
> -#define SDP_MAX_SIZE 8192
> +#define SDP_MAX_SIZE 16384
>
Is there any limit on the size of the extradata for theora/vorbis?
Can we be reasonably sure this is enough for at least one theora stream
plus one vorbis stream?
> diff --git a/libavformat/sdp.c b/libavformat/sdp.c
> index b34b944..acd954a 100644
> --- a/libavformat/sdp.c
> +++ b/libavformat/sdp.c
> @@ -21,6 +21,7 @@
> #include <string.h>
> #include "libavutil/avstring.h"
> #include "libavutil/base64.h"
> +#include "libavcodec/xiph.h"
> #include "avformat.h"
> #include "internal.h"
> #include "avc.h"
> @@ -220,6 +221,68 @@ static char *extradata2config(AVCodecContext *c)
> return config;
> }
>
> +static char *xiph_extradata2config(AVCodecContext *c)
> +{
> + char *config, *encoded_config;
> + uint8_t *header_start[3];
> + int headers_len, header_len[3], config_len;
> + int first_header_size;
> +
> + switch (c->codec_id) {
> + case CODEC_ID_THEORA:
> + first_header_size = 42;
> + break;
> + case CODEC_ID_VORBIS:
> + first_header_size = 30;
> + break;
> + default:
> + av_log(c, AV_LOG_ERROR, "Unsupported Xiph codec ID\n");
> + return NULL;
> + }
> +
> + if (ff_split_xiph_headers(c->extradata, c->extradata_size,
> + first_header_size, header_start,
> + header_len) < 0) {
> + av_log(c, AV_LOG_ERROR, "Extradata corrupt.");
Add a newline to the log message
> + return NULL;
> + }
> +
> + headers_len = header_len[0]+header_len[2];
Some space around the + would make it nicer to read :-)
> + config_len = 4 + // count
> + 3 + // ident
> + 2 + // packet size
> + 1 + // header count
> + 2 + // header size
> + headers_len; // and the rest
> + config = av_malloc(config_len);
> + encoded_config = av_malloc(AV_BASE64_SIZE(config_len));
> +
> + if (!config || !encoded_config) {
> + av_log(c, AV_LOG_ERROR,
> + "Not enough memory for configuration string\n");
If either of them was allocated, but not the other, you'd leak memory here
> + return NULL;
> + }
> +
// Martin
More information about the FFmpeg-soc
mailing list