[FFmpeg-devel] [PATCH] Bink file demuxer and audio decoder

Mon Jan 19 14:34:07 CET 2009

On Wed, Jan 14, 2009 at 02:39:28AM -0600, Daniel Verkamp wrote:
> Hi,
> 
> Attached are patches to add a Bink file demuxer and Bink audio
> decoder, plus support for Bink audio in Smacker files.
> 
> First of all, I must say that the bulk of these patches is based on
> the previous patch by Peter Ross [1]; many thanks to him and anyone
> else who helped write the documentation on MultimediaWiki.  I've
> attempted to update the documentation where I've noticed inaccuracies,
> but I give no guarantees that documentation matches code. :)  I've
> tried to fix most of the problems mentioned in the previous thread,
> but if I have missed anything important, please point it out.
> 
> The first patch is the most problematic; it implements the required
> DCT and RDFT functions, but for now they are just wrappers around the
> (incompatibly-licensed) fft4g functions the old patch was using.
> Hopefully someone will step up to write replacements for these.  I
> attempted that myself, but I am no mathematician, and I ended up with
> a big mess and a thoroughly muddled brain trying to implement anything
> based on the descriptions I could find.
> 
> I have only tested the Bink-in-Smacker support with files I created
> myself with the latest RAD Video Tools from their website (and it does
> work for those - samples can be found at http://drv.nu/temp/SMK/ ); if
> anyone has samples of this from actual games, these would be greatly
> appreciated.  A small note on the Smacker demuxer: it was using the
> SMKA tag even for u8 PCM audio; this didn't seem right, and I haven't
> been able to find or create a file with such an audio format, so I'm
> not sure it exists at all.  Any samples of this would be interesting
> too.
> 
> I've tested at least a few samples from every game I have available
> that uses Bink, namely:
> 
> - Heroes of Might and Magic 3: The files are hidden away inside .vid
> archives; they are the only revision 'b' (presumably "beta"?) I could
> find, and this format is not playable by the RAD player; none seem to
> have audio, unless the headers are different, so the demuxer currently
> doesn't support these (a trivial one-line patch of the probe function
> will enable it, but there's no way to test if it's actually correct
> for now)
> - Test Drive 6 (PC version): The only example I could find among my
> games of files with the oldest (BIKf) non-beta revision; features a
> music video of Fear Factory with Gary Numan performing "Cars" :)
> - Diablo 2: Samples I tested all work; see note below
> - GTA IV: some very large samples (TV stations), notably with
> keyframes; I haven't listened to the entire hour and a half, but the
> parts I checked were fine
> - Various other games, all of which work: C&C: Generals, L4D,
> Psychonauts, COD4, UT3, ...
> 
> I also tested the samples at http://samples.mplayerhq.hu/game-formats/bink/ :
> 
> - ATI/* has various chunks of files; appending the
> --append.this.to.complete.bik to the end of the un-suffixed file
> produces the full video, which works; the .x file is just another
> truncated part of the video
> - thps4/data002.tgr seems to be the only file in that dir that has
> audio, and it works
> - diablo2/d2xmusic.mpq does not contain any Bink files, but
> D2VIDEO.MPQ from the Cinematics CD does contain Bink files, and these
> work, as do the ones from the expansion
> - ActivisionLogo.bik, AnivisionLogo.bik, logo_collective.bik,
> logo_lucas.bik, original.bik: all work
> - logo_legal.bik: does not contain any audio
> 
> Any samples that don't work (or reports of ones that actually work :),
> as well as any critiques on the patch, are quite welcome.

[...]

> +// from wmadata.h
> +static const uint16_t wma_critical_freqs[25] = {

if its "from" that means its copy & pasted, but instead it should be
shared

> +    100,   200,  300, 400,   510,  630,  770,    920,
> +    1080, 1270, 1480, 1720, 2000, 2320, 2700,   3150,
> +    3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500,
> +    24500,
> +};
> +
> +#define MAX_CHANNELS 2
> +

> +#define BLOCK_MAX_SIZE ((1 << 11) * MAX_CHANNELS)

MAX_CHANNELS << 11

> +
> +typedef struct {
> +    AVCodecContext *avctx;
> +    GetBitContext gb;
> +    DSPContext dsp;
> +    int first;
> +    int use_dct;
> +    int frame_len;          ///< transform size (samples)
> +    int overlap_len;        ///< overlap size (samples)
> +    int channels;
> +    int num_bands;
> +    unsigned int *bands;
> +    float root;

> +    DECLARE_ALIGNED_16(short, window[BLOCK_MAX_SIZE / 16]);

this is no window, this are the samples of the previous frame

[...]
> +    // windowing
> +    if (!s->first) {
> +        int count = s->overlap_len * channels;
> +        for (i = 0; i < count; i++) {
> +            out[i] = (s->window[i] * (count - i) + out[i] * i) / count;
> +        }

/ is slow, use >> or * >>

> +    }
> +    memcpy(s->window,
> +           out + (s->frame_len - s->overlap_len) * channels,
> +           s->overlap_len * channels * sizeof(short));

sizeof(*out) is more robust

[...]
> +static int binkaudio_decode_frame(AVCodecContext *avctx, void *outdata, int *outdata_size, const uint8_t *buf, int buf_size)
> +{
> +    BinkAudioContext *s = avctx->priv_data;
> +    short *samples = outdata;
> +    int frame_size;
> +    GetBitContext *gb = &s->gb;
> +
> +    init_get_bits(gb, buf, buf_size * 8);
> +

> +    frame_size = get_bits(gb, 32);
> +    if (frame_size > *outdata_size) {
> +        av_log(avctx, AV_LOG_ERROR, "insufficient output buffer size\n");
> +        return -1;
> +    }

this check is broken

> +
> +    *outdata_size = frame_size;
> +    while (get_bits_count(gb) / 8 < buf_size) {
> +        samples += decode_block(s, samples);
> +        get_bits_align32(gb);
> +    }

exploitable

[...]
> +static int bink_read_header(AVFormatContext *s, AVFormatParameters *ap)
> +{
> +    BinkDemuxContext *bink = s->priv_data;
> +    ByteIOContext *pb = s->pb;
> +    uint32_t fps_num, fps_den;
> +    AVStream *vst, *ast;
> +    unsigned int i;
> +    uint32_t pos, prev_pos;
> +    uint16_t audio_flags;
> +    int keyframe;
> +
> +    vst = av_new_stream(s, 0);
> +    if (!vst)
> +        return AVERROR(ENOMEM);
> +
> +    vst->codec->codec_tag = get_le32(pb);
> +
> +    bink->file_size = get_le32(pb) + 8;
> +    bink->total_frames = get_le32(pb);
> +

> +    if (bink->total_frames > 1000000) {
> +        av_log(s, AV_LOG_ERROR, "invalid header: more than 1000000 frames\n");
> +        return AVERROR(EIO);
> +    }

in what way is a file invalid that has more frames?

> +
> +    if (get_le32(pb) > bink->file_size) {
> +        av_log(s, AV_LOG_ERROR,
> +               "invalid header: largest frame size greater than file size\n");
> +        return AVERROR(EIO);
> +    }
> +
> +    url_fskip(pb, 4);
> +

> +    vst->codec->width  = get_le32(pb);
> +    if (vst->codec->width > 32767) {
> +        av_log(s, AV_LOG_ERROR,
> +               "invalid header: width %d greater than 32767\n",
> +               vst->codec->width);
> +        return AVERROR(EIO);
> +    }
> +
> +    vst->codec->height = get_le32(pb);
> +    if (vst->codec->height > 32767) {
> +        av_log(s, AV_LOG_ERROR,
> +        "invalid header: height %d greater than 32767\n",
> +        vst->codec->height);
> +        return AVERROR(EIO);
> +    }

similarly, how is a file invalid when it has a larger w/h?

> +
> +    fps_num = get_le32(pb);
> +    fps_den = get_le32(pb);
> +
> +    if (fps_num == 0 || fps_den == 0) {
> +        av_log(s, AV_LOG_ERROR,
> +               "invalid header: invalid fps (%d/%d)\n",
> +               fps_num, fps_den);
> +        return AVERROR(EIO);
> +    }
> +
> +    url_fskip(pb, 4);
> +
> +    vst->codec->codec_type = CODEC_TYPE_VIDEO;
> +    vst->codec->codec_id   = 0; /* FIXME: CODEC_ID_BINKVIDEO */
> +    av_set_pts_info(vst, 64, fps_den, fps_num);
> +
> +    bink->num_audio_tracks = get_le32(pb);
> +

> +    if (bink->num_audio_tracks > BINK_MAX_AUDIO_TRACKS) {
> +        av_log(s, AV_LOG_ERROR,
> +               "invalid header: more than 256 audio tracks (%d)\n",
                                             ^^^
should be BINK_MAX_AUDIO_TRACKS

> +               bink->num_audio_tracks);
> +        return AVERROR(EIO);
> +    }
> +

> +    if (bink->num_audio_tracks) {

the if() seems useless

> +        url_fskip(pb, 4 * bink->num_audio_tracks);
> +
> +        for (i = 0; i < bink->num_audio_tracks; i++) {
> +            ast = av_new_stream(s, 1);
> +            if (!ast)
> +                return AVERROR(ENOMEM);
> +            ast->codec->codec_type  = CODEC_TYPE_AUDIO;
> +            ast->codec->codec_id    = CODEC_ID_BINKAUDIO;
> +            ast->codec->codec_tag   = 0;
> +            ast->codec->sample_rate = get_le16(pb);
> +            av_set_pts_info(ast, 64, 1, ast->codec->sample_rate);

> +            audio_flags = get_le16(pb);
> +            ast->codec->channels = audio_flags & BINK_AUD_STEREO ? 2 : 1;
> +            if (audio_flags & BINK_AUD_USEDCT) {
> +                ast->codec->extradata = av_malloc(BINK_EXTRADATA_SIZE);
> +                ast->codec->extradata_size = BINK_EXTRADATA_SIZE;
> +                *ast->codec->extradata = 1;
> +            }

why dont you read the flags into extradata ?

[...]
> +static int bink_read_packet(AVFormatContext *s, AVPacket *pkt)
> +{
> +    BinkDemuxContext *bink = s->priv_data;
> +    ByteIOContext *pb = s->pb;
> +    int ret;
> +
> +    if (bink->current_track < 0) {
> +        int index_entry;
> +        AVStream *st = s->streams[0]; // stream 0 is video stream with index
> +
> +        if (bink->video_pts >= bink->total_frames)
> +            return AVERROR(EIO);
> +
> +        index_entry = av_index_search_timestamp(st, bink->video_pts,
> +                                                AVSEEK_FLAG_ANY);
> +        if (index_entry < 0) {
> +            av_log(s, AV_LOG_ERROR,
> +                   "could not find index entry for frame %d\n",
> +                   bink->video_pts);
> +            return AVERROR(EIO);
> +        }
> +
> +        bink->remain_packet_size = st->index_entries[index_entry].size;
> +        bink->current_track = 0;
> +    }
> +
> +    if (bink->current_track < bink->num_audio_tracks) {

> +        uint32_t audio_size = get_le32(pb);
> +        if (4 + audio_size > bink->remain_packet_size) {

4 + audio_size can overflow

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090119/19ed91e6/attachment.pgp>