[FFmpeg-devel] 4xm idct computation

yann.lepetitcorps at free.fr yann.lepetitcorps at free.fr
Thu Dec 29 00:06:26 CET 2011

> > > > This is my problem, I am searching for something that is "only
> > > > partially block based"
> > > :)
> > > > (cf. something that works on fixed 8x8 blocks, but where the blocks are
> > > > dynamically constructed from a "mipmapped" picture)
> >
> >
> > > indeo4 uses a block based haar transform amongst other things
> > > see http://wiki.multimedia.cx/index.php?title=Indeo_4
> > >
> > > the haar transform does not perform very well though.
> > >
> > >
> > > and there are fractal coders that work by downscaling a simple trivial
> > > image and then using blocks from it to construct a new image repeatedly
> > > to build up the image one wants.
> > > These are too slow for practical use though.
> >
> > Thanks,
> >
> > This seems like a good scheme, but what is the scale of "too slow"?
> the problem with fractal coders is the encoding step. It is VERY slow
> so they are not used in practice anywhere AFAIK (someone will probably
> reply and point out some real world usage if there is one)

That is why I am thinking of the following:

 1) perform only one iteration of the wavelet transform on each patch of 8x8
    pixels (this can work within a very small cache of 8x8=64 bytes and it uses
    only bytes, not floats ... so it is really very fast)

 2) reorder this picture into something like a "mipmap picture" where the
    horizontal and vertical coefficients replace the red and blue parts, and
    where the average coefficients replace the green part
    (cf. it looks similar to a real wavelet picture, but only with the first 8
    levels)

 3) run a second iteration of the wavelet transform on this "mipmapped

 4) reorder the resulting picture so that the output is always something that is

A limitation of this scheme is that the width and height must always be
multiples of 64
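
A minimal sketch of what step 1 could look like, not the actual code discussed
here. It assumes a reversible integer Haar step (the "S-transform": floored
average plus difference) applied once to the rows and once to the columns of an
8x8 patch. The names haar8x8_forward/haar8x8_inverse are hypothetical, and the
coefficients are stored as int16_t, because a difference of two bytes does not
fit in a byte without further scaling.

```c
#include <assert.h>
#include <stdint.h>

#define BLK 8  /* patch size from the text: 8x8 pixels */

/* floor(x / 2), avoiding the implementation-defined right shift of negatives */
static int floor_half(int x)
{
    return x >= 0 ? x / 2 : -((-x + 1) / 2);
}

/* One forward Haar step on BLK samples spaced `stride` apart:
 * the first half receives the floored averages, the second half the
 * differences. */
static void haar_fwd_1d(int16_t *v, int stride)
{
    int16_t tmp[BLK];
    for (int i = 0; i < BLK / 2; i++) {
        int a = v[(2 * i) * stride];
        int b = v[(2 * i + 1) * stride];
        int d = a - b;
        tmp[i] = (int16_t)(b + floor_half(d));  /* == floor((a + b) / 2) */
        tmp[BLK / 2 + i] = (int16_t)d;
    }
    for (int i = 0; i < BLK; i++)
        v[i * stride] = tmp[i];
}

/* Exact inverse of haar_fwd_1d */
static void haar_inv_1d(int16_t *v, int stride)
{
    int16_t tmp[BLK];
    for (int i = 0; i < BLK / 2; i++) {
        int s = v[i * stride];
        int d = v[(BLK / 2 + i) * stride];
        int b = s - floor_half(d);
        tmp[2 * i] = (int16_t)(b + d);
        tmp[2 * i + 1] = (int16_t)b;
    }
    for (int i = 0; i < BLK; i++)
        v[i * stride] = tmp[i];
}

/* One 2-D level: rows first, then columns. Afterwards the top-left 4x4
 * quadrant holds the averages and the other quadrants hold the details. */
void haar8x8_forward(int16_t blk[BLK * BLK])
{
    assert(blk != 0);
    for (int y = 0; y < BLK; y++)
        haar_fwd_1d(blk + y * BLK, 1);
    for (int x = 0; x < BLK; x++)
        haar_fwd_1d(blk + x, BLK);
}

/* Undo haar8x8_forward: columns first, then rows. */
void haar8x8_inverse(int16_t blk[BLK * BLK])
{
    assert(blk != 0);
    for (int x = 0; x < BLK; x++)
        haar_inv_1d(blk + x, BLK);
    for (int y = 0; y < BLK; y++)
        haar_inv_1d(blk + y * BLK, 1);
}
```

A flat patch ends up with its value in the 4x4 average quadrant and zeros
everywhere else, and the inverse restores the input exactly, so any loss in
such a scheme would come only from a later quantization stage.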

=> I think that this "only two recursions wavelet transform" can be computed
very quickly (because the single-recursion version is **really** very fast ...
but OK, that is without the [x%8,y%8] reordering)
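
One possible reading of the [x%8,y%8] reordering, sketched under assumptions:
every coefficient is moved to a tile selected by its position inside its 8x8
block, so that the picture becomes an 8x8 grid of small planes, each of size
(w/8) x (h/8). The function names and the exact layout are hypothetical; the
text above does not spell them out.

```c
#include <assert.h>
#include <stdint.h>

#define BLK 8  /* block size from the text */

/* Gather: the sample at (x, y) goes to tile (x % 8, y % 8), at position
 * (x / 8, y / 8) inside that tile. src and dst are w*h arrays, row-major. */
void gather_by_block_pos(const int16_t *src, int16_t *dst, int w, int h)
{
    assert(w > 0 && h > 0 && w % BLK == 0 && h % BLK == 0);
    int tw = w / BLK;  /* tile width  */
    int th = h / BLK;  /* tile height */
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int dx = (x % BLK) * tw + x / BLK;
            int dy = (y % BLK) * th + y / BLK;
            dst[dy * w + dx] = src[y * w + x];
        }
    }
}

/* Scatter: exact inverse of gather_by_block_pos. */
void scatter_by_block_pos(const int16_t *src, int16_t *dst, int w, int h)
{
    assert(w > 0 && h > 0 && w % BLK == 0 && h % BLK == 0);
    int tw = w / BLK;
    int th = h / BLK;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int dx = (x % BLK) * tw + x / BLK;
            int dy = (y % BLK) * th + y / BLK;
            dst[y * w + x] = src[dy * w + dx];
        }
    }
}
```

This pass is a pure permutation, so it costs one read and one write per sample,
but the writes are strided, which is probably why the timing changes once the
reordering is included.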

I have recently made something that uses a local
wavelet/quantization/rle/zcompression pipeline that can compress/decompress a
picture and that can be found at

Note that 99% of the CPU time is used by the zlib compression stage
=> without the zlib stage this can already work in real time ...
(and it doesn't use MMX or SSE instructions, so it can be optimised a lot too)
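
To illustrate why the stages before zlib can be cheap, here is a rough sketch
of a quantization step followed by a zero-run RLE. The pair format
(zero_run, value) and the names quantize/rle_encode/rle_decode are assumptions
made for this example only, not the format of the pipeline mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Uniform quantization: integer division by `step`, rounding toward zero. */
void quantize(const int16_t *in, int16_t *out, size_t n, int step)
{
    assert(step > 0);
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] / step);
}

/* Zero-run RLE. Output is a list of (run, value) pairs meaning "run zeros,
 * then value"; a pair with value 0 carries zeros only.
 * `out` must have room for 2 * n entries. Returns the entries written. */
size_t rle_encode(const int16_t *in, size_t n, int16_t *out)
{
    size_t o = 0;
    int16_t run = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0) {
            if (run == INT16_MAX) {  /* flush a full run counter */
                out[o++] = run;
                out[o++] = 0;
                run = 0;
            }
            run++;
        } else {
            out[o++] = run;
            out[o++] = in[i];
            run = 0;
        }
    }
    if (run > 0) {  /* trailing zeros */
        out[o++] = run;
        out[o++] = 0;
    }
    return o;
}

/* Inverse of rle_encode. Returns the number of samples written to `out`. */
size_t rle_decode(const int16_t *in, size_t n_in, int16_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i + 1 < n_in; i += 2) {
        for (int16_t k = 0; k < in[i]; k++)
            out[o++] = 0;
        if (in[i + 1] != 0)
            out[o++] = in[i + 1];
    }
    return o;
}
```

Both stages are a single linear pass with integer arithmetic only, which is
consistent with the entropy stage dominating the CPU time.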

I took a look at the snow/dirac/indeo codecs; they are certainly much more
efficient than my "basic local wavelet/quantization/rle/zlib codec" version :)

