[FFmpeg-devel] [PATCH] Electronic Arts MAD Decoder
Reimar Döffinger
Reimar.Doeffinger
Sun Jun 7 11:53:38 CEST 2009
On Sun, Jun 07, 2009 at 04:24:59PM +1000, Peter Ross wrote:
> +static inline void mad_comp(unsigned char *dst, int dst_stride,
> + unsigned char *src, int src_stride,
> + int add){
> + int j,i;
> + for(j=0;j<8;j++)
> + for(i=0;i<8;i++)
> + dst[j*dst_stride + i] = av_clip_uint8(src[j*src_stride + i] + add);
Probably the compiler can figure it out by itself, but you sure don't
want the multiplies in the inner loop but instead something like
> + for(i=0;i<8;i++)
> + dst[i] = av_clip_uint8(src[i] + add);
> + src += src_stride;
> + dst += dst_stride;
> + uint8_t *dest_y = t->frame.data[0] + (mb_y * 16* linesize ) + mb_x * 16;
> + uint8_t *dest_cb = t->frame.data[1] + (mb_y * 8 * t->frame.linesize[1]) + mb_x * 8;
> + uint8_t *dest_cr = t->frame.data[2] + (mb_y * 8 * t->frame.linesize[2]) + mb_x * 8;
Not sure if it makes things uglier, but you could factor out the * 8 / * 16 here.
> + int ref_linesize = t->last_frame.linesize[0];
> + uint8_t *ref_y = t->last_frame.data[0] + ((mb_y * 16+ mv_y) * ref_linesize ) + mb_x * 16 + mv_x;
> + uint8_t *ref_cb = t->last_frame.data[1] + ((mb_y * 8 + (mv_y/2))* t->last_frame.linesize[1]) + mb_x * 8 + (mv_x/2);
> + uint8_t *ref_cr = t->last_frame.data[2] + ((mb_y * 8 + (mv_y/2))* t->last_frame.linesize[2]) + mb_x * 8 + (mv_x/2);
Here it probably isn't such a good idea though.
> + case 0: mad_comp(dest_y , linesize,ref_y , ref_linesize, add); break;
> + case 1: mad_comp(dest_y + 8, linesize,ref_y + 8, ref_linesize, add); break;
> + case 2: mad_comp(dest_y + 8*linesize , linesize,ref_y + 8*linesize , ref_linesize, add); break;
> + case 3: mad_comp(dest_y + 8*linesize + 8, linesize,ref_y + 8*linesize + 8, ref_linesize, add); break;
If it doesn't affect performance too much this could be
case 0:
case 1:
case 2:
case 3:
offset = ((j & 2) << 2) * linesize + ((j & 1) << 3);
mad_comp(dest_y + offset, linesize, ref_y + offset, ref_linesize, add); break;
except that I think your code is wrong and you should be using
ref_linesize for the ref_y offset.
Still, I think that any code that has more than one mad_comp for this is
bloaty.
> +static inline void mad_idct_put(MadContext *t, DCTELEM *block, int mb_x, int mb_y, int i){
> + MpegEncContext *s = &t->s;
> + int linesize= t->frame.linesize[0];
> + uint8_t *dest_y = t->frame.data[0] + (mb_y * 16* linesize ) + mb_x * 16;
> + uint8_t *dest_cb = t->frame.data[1] + (mb_y * 8 * t->frame.linesize[1]) + mb_x * 8;
> + uint8_t *dest_cr = t->frame.data[2] + (mb_y * 8 * t->frame.linesize[2]) + mb_x * 8;
> +
> + switch(i) {
> + case 0: s->dsp.idct_put(dest_y , linesize, block); break;
> + case 1: s->dsp.idct_put(dest_y + 8, linesize, block); break;
> + case 2: s->dsp.idct_put(dest_y + 8*linesize , linesize, block); break;
> + case 3: s->dsp.idct_put(dest_y + 8*linesize + 8, linesize, block); break;
> + case 4: if(!(s->avctx->flags&CODEC_FLAG_GRAY))
> + s->dsp.idct_put(dest_cb, t->frame.linesize[1], block);
> + break;
> + case 5: if(!(s->avctx->flags&CODEC_FLAG_GRAY))
> + s->dsp.idct_put(dest_cr, t->frame.linesize[2], block);
> + break;
> + }
On thinking again about it, my feeling is that this is just the wrong
way round.
Calculating dst and linesize should be in the switch, not the function
calls. A stupid compiler might e.g. calculate a dest_cb that is never used.
The only issue is with the CODEC_FLAG_GRAY, though you could still do
case 4:
if (s->avctx->flags&CODEC_FLAG_GRAY) return;
dest = t->frame.data[1] + (mb_y * 8 * t->frame.linesize[1]) + mb_x * 8;
linesize = t->frame.linesize[1];
Though I really don't like much having that check in an inner loop...
> + OPEN_READER(re, &s->gb);
> + /* now quantify & encode AC coefficients */
> + for(;;) {
> + UPDATE_CACHE(re, &s->gb);
I know a lot of codecs use it, but is using this extremely lowlevel API
really the best way to do? Seems to me like it will result in a lot of
buggy an badly maintainable code.
> + /* escape */
> + level = SHOW_SBITS(re, &s->gb, 10); SKIP_BITS(re, &s->gb, 10);
E.g. those are 20 bits in a row, I think 17 is the allowed maximum, so
you'd need an UPDATE_CACHE in-between?
> + avctx->time_base.num = AV_RL16(&buf[6]);
> + avctx->time_base.den = 1000;
Hm, shouldn't you use av_reduce on those?
> + if (s->avctx->width!=s->width || s->avctx->height!=s->height) {
> + avcodec_set_dimensions(s->avctx, s->width, s->height);
avcodec_check_dimensions?
> + t->frame.buffer_hints = FF_BUFFER_HINTS_VALID;
I see other codecs have it, too, but is there a point in only
specifying FF_BUFFER_HINTS_VALID?
> + init_get_bits(&s->gb, t->bitstream_buf, 8*(buf_end-buf));
Hm... are you actually checking for the buffer size anywhere where you
read from it?
> + if (chunk_type!=MADe_TAG)
> + FFSWAP(AVFrame, t->frame, t->last_frame);
Hm? You are using last_frame but set neither FF_BUFFER_HINTS_PRESERVE
nor FF_BUFFER_HINTS_READABLE?
More information about the ffmpeg-devel
mailing list