[FFmpeg-soc] Review of dirac code
Michael Niedermayer
michaelni at gmx.at
Fri Aug 17 00:12:57 CEST 2007
On Thu, Aug 16, 2007 at 11:30:37PM +0200, Marco Gerards wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
>
> Hi,
>
> > On Thu, Aug 16, 2007 at 07:37:24PM +0200, Marco Gerards wrote:
> > [...]
> >> > [...]
> >> >
> >> > dirac_arith.h:
> >> > [...]
> >> >> typedef struct dirac_arith_state {
> >> >> /* Arithmetic decoding. */
> >> >> unsigned int low;
> >> >> unsigned int range;
> >> >> unsigned int code;
> >> >> unsigned int bits_left;
> >> >> int carry;
> >> >> unsigned int contexts[ARITH_CONTEXT_COUNT];
> >> >>
> >> >> GetBitContext *gb;
> >> >> PutBitContext *pb;
> >> >
> >> > GetBitContext gb;
> >> > PutBitContext pb;
> >> > would avoid a pointer dereference if these 2 are used in the innermost
> >> > loops, though maybe their use can be avoided by reading/writing 8bit
> >> > at a time similar to how our cabac reader and range coder works
> >>
> >> Can I just copy the GetBitContext, PutBitContext at initialization
> >> time, or is there a more efficient way to do this?
> >
> > no, its ok
>
> Do you mean it is ok to copy them?
yes its ok to copy for the arith coder
[...]
> > [...]
> >> >
> >> > this is VERY inefficient, most of what is done here does not need to be done
> >> > per pixel or can be precalculated
> >>
> >> Absolutely! gprof agrees with you ;-)
> >>
> >> Actually, this comment applies to all code above, I assume.
> >>
> >> time seconds seconds calls ms/call ms/call name
> >> 40.85 2.41 2.41 88939320 0.00 0.00 upconvert
> >> 29.49 4.15 1.74 50 34.80 113.59 dirac_decode_frame
> >> 10.34 4.76 0.61 417867 0.00 0.00 motion_comp_block1ref
> >> 6.78 5.16 0.40 540 0.74 0.74 dirac_subband_idwt_53
> >> 4.24 5.41 0.25 8689089 0.00 0.00 coeff_unpack
> >> 4.07 5.65 0.24 8707719 0.00 0.00 dirac_arith_read_uint
> >>
> >> Luca and I talked about if I would concentrate on further
> >> optimizations or on the encoder and he said he preferred having me
> >> working on the encoder for now, because I know Dirac quite well by now
> >> (Luca, please tell me if I am wrong in any way).
> >>
> >> So there is quite some room for optimizations. In the specification,
> >> the used pseudo code looped over the pixels. My current code at least
> >> doesn't do that. But I agree, the MC code is where a lot of
> >> optimizations can be made:
> >>
> >> - The qpel/eightpel interpolation code is very inefficient and perhaps
> >> has to be merged with motion_comp_block1ref and
> >> motion_comp_block2ref. In that case it can be optimized. We
> >> especially have to take care of border cases to avoid clipping, more
> >> or less like I did for halfpel interpolation.
> >
> > simply making upconvert() work not with 1 pixel but with a block of pixels
> > should make it alot more efficient
>
> I moved this into motion_comp_block1ref/motion_comp_block2ref. This
> works quite well and boosts the speed of the decoder by 50% and makes
> further optimizations easier. Hopefully you do not have problems with
> this approach.
iam fine with it
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-soc/attachments/20070817/7e18396e/attachment.pgp>
More information about the FFmpeg-soc
mailing list