[FFmpeg-user] help remuxing mp4 to matroska

Eric Gillum ericwgillum at gmail.com
Thu Sep 6 22:47:31 CEST 2012


Hello,

I am not an ffmpeg or multimedia expert. Bear with me. I am trying to
transmux an mp4 file with h264 video and aac audio to matroska using
libavformat, libavcodec, etc. (i.e. not command line ffmpeg). My
understanding is that no decoding or encoding is required. That is, I
believe the matroska container can handle h264/aac, and that I have
these bytes available, and therefore this should be more or less a
straightforward copy. A quick sanity check of the file with ffprobe:

Stream #0:0(und): Video: h264 (Baseline) (avc1 / 0x31637661)
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D)

First of all, I believe the command-line equivalent of transmuxing is
"ffmpeg -i 0.mp4 -vcodec copy -acodec copy 0.mkv". That gives me a
playable file: VLC can read and play it, and my player can read it
(video plays back at least one frame, but my implementation is hokey
anyway; audio plays back fine).

So I'm trying to transmux with libavformat / libavcodec. The code I
use to transmux appears last in this message. Encouragingly, on
opening the original file to transmux, I get these logs:

[mov,mp4,m4a,3gp,3g2,mj2 @ 0x212fa00] Format mov,mp4,m4a,3gp,3g2,mj2
probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x212fa00] ISO: File Type Major Brand: mp42
[h264 @ 0x20f2600] err{or,}_recognition separate: 1; 1
[h264 @ 0x20f2600] err{or,}_recognition combined: 1; 65537
[aac @ 0x20e9400] err{or,}_recognition separate: 1; 1
[aac @ 0x20e9400] err{or,}_recognition combined: 1; 65537
[aac @ 0x20e9400] Unsupported bit depth: 0

I get those same logs for the command-line-transmuxed files, so no
obvious problem. And when transmuxing I get logs like these:

[matroska @ 0x20fbe00] Writing block at offset 15, size 7349, pts 0,
dts 0, duration 40, flags 128
[matroska @ 0x20fbe00] Writing block at offset 21, size 4, pts 0, dts
0, duration 1024, flags 128
[matroska @ 0x20fbe00] Writing block at offset 31, size 4395, pts 40,
dts 40, duration 40, flags 0
<snip>
[matroska @ 0x20fbe00] Starting new cluster at offset 175 bytes, pts 1024
[matroska @ 0x20fbe00] Writing block at offset 16, size 331, pts 1024,
dts 1024, duration 1024, flags 128
[matroska @ 0x20fbe00] Writing block at offset 354, size 2169, pts
1040, dts 1040, duration 40, flags 0
<snip>
[libx264 @ 0x2314800] final ratefactor: 29.14
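
The audio blocks in the log above carry a duration of 1024, which looks
like a sample count (one AAC frame) rather than milliseconds. As far as
I understand, the matroska muxer works in a 1/1000 time base by default,
while mp4 audio tracks commonly use 1/sample_rate, so packets copied
without converting pts, dts and duration could land at the wrong times.
libavutil provides av_rescale_q() for this conversion; the sketch below
only illustrates the arithmetic in plain C. The Rational type and
rescale_ts() are hypothetical stand-ins, and the 1/44100 input time base
is an assumption based on the sample rate.

```c
#include <stdint.h>

/* Hypothetical stand-in for AVRational: a time base of num/den seconds. */
typedef struct {
    int num;
    int den;
} Rational;

/*
 * Convert a timestamp from time base `from` to time base `to`, rounding
 * to the nearest integer. Assumes ts >= 0, positive time bases, and values
 * small enough that the intermediate products fit in int64_t.
 */
static int64_t rescale_ts(int64_t ts, Rational from, Rational to)
{
    int64_t num = (int64_t)from.num * to.den;
    int64_t den = (int64_t)from.den * to.num;
    return (ts * num + den / 2) / den;
}
```

Under these assumptions, a 1024-sample frame in a 1/44100 time base
becomes 23 in a 1/1000 time base, not 1024.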

I get no warnings and no errors, but the output file appears to be
mostly junk. ffprobe detects the video dimensions, the audio sample
rate, and a few codec details for h264 and aac, but most other
fields are reported as n/a. If I try to play this file, VLC does
nothing and logs nothing. My own player appears unable to read
the header. Here are the logs when I open the output file:

[matroska,webm @ 0x234ac00] Format matroska,webm probed with size=2048
and score=100
st:0 removing common factor 1000000 from timebase
st:1 removing common factor 1000000 from timebase
[h264 @ 0x234a800] err{or,}_recognition separate: 1; 1
[h264 @ 0x234a800] err{or,}_recognition combined: 1; 65537
[swscaler @ 0x6358000] No accelerated colorspace conversion found from
yuv420p to rgba.
[aac @ 0x2314200] err{or,}_recognition separate: 1; 1
[aac @ 0x2314200] err{or,}_recognition combined: 1; 65537
[aac @ 0x2314200] Unsupported bit depth: 0
[h264 @ 0x234a800] Unknown NAL code: 25 (3009 bits)
[h264 @ 0x234a800] no frame!
[h264 @ 0x234a800] no frame!

From my own logs (not shown) I can tell it reads very far into the
file before reporting "Unknown NAL code", whereas on a good file it
seems that my player gets a lot of what it needs from the first <4k in
the file. As it is, this plays about 0.5 seconds of (correct) audio
and then stops decoding due to errors. No video frames are decoded.
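
For context on the "Unknown NAL code: 25" message: the number is the
nal_unit_type, the low 5 bits of the first byte of each NAL unit, and
H.264 leaves types 24 to 31 unspecified, so a value of 25 suggests the
decoder is not looking at a real NAL header. The sketch below, in plain
C with no ffmpeg calls, shows how the type is read when NAL units are
stored with a length prefix, as they are in mp4. The helper names and
the fixed 4-byte prefix are assumptions for illustration; the actual
prefix size is recorded in the stream's avcC extradata.

```c
#include <stddef.h>
#include <stdint.h>

/* nal_unit_type is the low 5 bits of the first byte of a NAL unit. */
static int nal_unit_type(uint8_t header)
{
    return header & 0x1F;
}

/*
 * Hypothetical helper: walk a buffer of NAL units that are each preceded
 * by a 4-byte big-endian length (an assumption, see above). Stores up to
 * max_types types and returns how many were stored, or -1 if a length is
 * zero or runs past the end of the buffer. Trailing bytes too short to
 * hold a length prefix are ignored.
 */
static int list_nal_types(const uint8_t *buf, size_t size,
                          int *types, int max_types)
{
    size_t pos = 0;
    int n = 0;

    while (n < max_types && size - pos >= 4) {
        uint32_t len = ((uint32_t)buf[pos] << 24) |
                       ((uint32_t)buf[pos + 1] << 16) |
                       ((uint32_t)buf[pos + 2] << 8) |
                       (uint32_t)buf[pos + 3];
        pos += 4;
        if (len == 0 || len > size - pos)
            return -1;
        types[n++] = nal_unit_type(buf[pos]);
        pos += len;
    }
    return n;
}
```

With the bytes 00 00 00 02 67 42 00 00 00 03 65 88 84 this yields types
7 (SPS) and 5 (IDR slice); a header byte of 0x19 would yield 25.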

So before I post the code, here are my questions:
- Do I have the right idea about transmuxing from mp4 to matroska?
- In the code you'll see that I never call encode_video or
decode_video. Perhaps I must still call these, and trust that no
(expensive) reencoding occurs? How do I translate the "-vcodec copy"
semantics?
- Is there sample code or instructions on how to transmux with the
libs? My search results only mention command line.

And some last-second details:
- Built the libs from ffmpeg 0.9.1 (harmony) with debug logging.
- Built with libx264.
- The code below has been stripped of most comments, error checking,
etc. I've already verified that no obvious error occurs. With the
exception of my transmuxing code, my code runs and works. In fact,
I've written code to encode in matroska format and then decode and
play back. This is based on that, but basically with the encode_video
and decode_video parts stripped out.
- The code below omits IO, file processing, callbacks etc. Nothing
interesting there (nothing ffmpeg-related).

// Build a format context for writing (i.e. transmuxing).
// ------------------------------------------------
AVFormatContext *oc = avformat_alloc_context();

// Matroska format.
AVOutputFormat *fmt = av_guess_format("matroska", NULL, NULL);
fmt->audio_codec = CODEC_ID_AAC;
oc->oformat = fmt;

// Make ffmpeg call me back when ready to write data.
AVIOContext *ioctx =
    avio_alloc_context(av_malloc(CONTEXT_BUFFER_SIZE), /* buffer */
                       CONTEXT_BUFFER_SIZE,            /* buffer size */
                       1,                              /* write flag */
                       opaque,                         /* opaque */
                       NULL,                           /* read packet */
                       callback,                       /* write packet */
                       NULL                            /* seek */);
ioctx->seekable = 0;
oc->pb = ioctx;

// Add video stream.
AVStream *video_st = avformat_new_stream(oc, NULL);
AVCodecContext *c = video_st->codec;
c->codec_id = CODEC_ID_H264;
c->codec_type = AVMEDIA_TYPE_VIDEO;
c->pix_fmt = PIX_FMT_YUV420P;
c->width = 192;
c->height = 144;
c->time_base.den = frameRateForCurrentDevice();
c->time_base.num = 1;
c->bit_rate = 256000; // 512000
c->bit_rate_tolerance = 128000; // 2 * c->bit_rate;
c->refs = 4;
c->gop_size = 4;
c->max_b_frames = 4;
c->me_range = 16;
c->max_qdiff = 4;
c->qmin = 10;
c->qmax = 30;
c->qcompress = 0.6;
if (oc->oformat->flags & AVFMT_GLOBALHEADER) {
    c->flags |= CODEC_FLAG_GLOBAL_HEADER;
}

// Open video codec.
AVCodec *codec = avcodec_find_encoder(c->codec_id);
avcodec_open2(c, codec, NULL);

// Add audio stream.
AVStream *audio_st = avformat_new_stream(oc, NULL);
c = audio_st->codec;
c->codec_id = CODEC_ID_AAC;
c->codec_type = AVMEDIA_TYPE_AUDIO;
c->sample_fmt = AV_SAMPLE_FMT_S16;
c->bit_rate = 32000;
c->bit_rate_tolerance = 16000;
c->sample_rate = 44100;
c->channels = 1;
if (oc->oformat->flags & AVFMT_GLOBALHEADER) {
    c->flags |= CODEC_FLAG_GLOBAL_HEADER;
}

// Open audio.
codec = avcodec_find_encoder(c->codec_id);
avcodec_open2(c, codec, NULL);

avformat_write_header(oc, NULL);

// Build a format context for reading, based on the file.
// ------------------------------------------------
char const *filename = "whatever";
AVFormatContext *read_ctx = NULL;
avformat_open_input(&read_ctx, filename, NULL, NULL);
avformat_find_stream_info(read_ctx, NULL);
// Find video stream.
int read_video_stream_idx = -1;
for (int i = 0; i < read_ctx->nb_streams; i++) {
    if (read_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
        read_video_stream_idx = i;
        break;
    }
}
// Find audio stream.
int read_audio_stream_idx = -1;
for (int i = 0; i < read_ctx->nb_streams; i++) {
    if (read_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
        read_audio_stream_idx = i;
        break;
    }
}

while (1) {
    AVPacket pkt;
    av_init_packet(&pkt);
    int ret;
    if ((ret = av_read_frame(read_ctx, &pkt)) < 0) {
        break;
    }

    // Update stream index for writing.
    pkt.stream_index = (pkt.stream_index == read_video_stream_idx) ?
        video_st->index : audio_st->index;
    ret = av_interleaved_write_frame(oc, &pkt);
    av_free_packet(&pkt);
}

avio_flush(oc->pb);
av_write_trailer(oc);
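
One caveat about the write loop above: any packet that is not from the
video stream is sent to the audio stream, which is only safe when the
input has exactly one video and one audio stream. A more defensive
variant keeps a per-input-stream table and skips packets with no
mapping. The sketch below shows that table in plain C; MediaKind and
build_stream_map() are hypothetical names standing in for AVMediaType
and the stream search loops, not part of ffmpeg.

```c
#include <stddef.h>

/* Hypothetical stand-in for AVMediaType. */
enum MediaKind { KIND_VIDEO, KIND_AUDIO, KIND_OTHER };

/*
 * Fill map[i] with the output stream index for input stream i, or -1 if
 * that input stream should be dropped. Only the first video stream and
 * the first audio stream are mapped; everything else gets -1.
 */
static void build_stream_map(const enum MediaKind *kinds, size_t n,
                             int video_out, int audio_out, int *map)
{
    int have_video = 0;
    int have_audio = 0;

    for (size_t i = 0; i < n; i++) {
        map[i] = -1;
        if (kinds[i] == KIND_VIDEO && !have_video) {
            map[i] = video_out;
            have_video = 1;
        } else if (kinds[i] == KIND_AUDIO && !have_audio) {
            map[i] = audio_out;
            have_audio = 1;
        }
    }
}
```

In the loop, a packet whose map entry is -1 would be freed and skipped
instead of being written.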

