[MPlayer-dev-eng] [PATCH] autoq support for control()
Michael Niedermayer
michaelni at gmx.at
Mon Feb 11 17:38:39 CET 2002
Hi
On Monday 11 February 2002 16:57, Juan J. Sierralta P. wrote:
> On Mon, 2002-02-11 at 09:56, Michael Niedermayer wrote:
> > > Had or has ?
> >
> > had, i fixed it, anything wrong with it?
>
> Nope. Is it the original C function? I remember it was AAN.
The bug was in convert_matrix() in mpegvideo.c.
[...]
> > IMHO it's faster with the permutations, simply because these
> > permutations need nearly no time.
> > For decoding there is no extra permutation: the decoder simply puts the
> > coeffs into the correct permutation for the IDCT, and with an unpermuted
> > IDCT it would still have to do the zigzag permutation, so there is no
> > speed win here.
> > For encoding it's not a big deal either, as the quantizer knows which
> > coeff is the last nonzero one, so only the coeffs up to the last nonzero
> > one are permuted.
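A minimal sketch of the decoding case described above, in C. The names
idct_perm, scan_permuted and store_coeffs are made up for illustration, and
the permutation itself is only an assumed example, not the one any particular
IDCT uses. The point is that the IDCT permutation can be folded into the scan
table once at init time, so the per-coefficient store costs the same with
either table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standard 8x8 zigzag scan: zigzag[i] is the raster index of the
 * i-th coefficient in coding order. */
static const uint8_t zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* ASSUMPTION: a hypothetical IDCT input permutation that rotates the
 * three column bits of the raster index. Any bijection on 0..63 would
 * illustrate the same thing. */
static int idct_perm(int i)
{
    return (i & 0x38) | ((i & 6) >> 1) | ((i & 1) << 2);
}

/* Zigzag scan composed with the IDCT permutation. */
static uint8_t scan_permuted[64];

/* Build the composed table once; the assert checks it is a bijection. */
static void init_scan_tables(void)
{
    uint8_t seen[64] = {0};
    for (int i = 0; i < 64; i++) {
        scan_permuted[i] = (uint8_t)idct_perm(zigzag[i]);
        assert(!seen[scan_permuted[i]]);
        seen[scan_permuted[i]] = 1;
    }
}

/* Decoder store loop: one table lookup and one store per coefficient,
 * whether `scan` is the plain zigzag table or the composed one. */
static void store_coeffs(int16_t block[64], const int16_t *levels, int n,
                         const uint8_t *scan)
{
    memset(block, 0, 64 * sizeof(block[0]));
    for (int i = 0; i < n; i++)
        block[scan[i]] = levels[i];
}
```

Calling store_coeffs() with zigzag gives a raster-order block, calling it
with scan_permuted gives the block already in the order the (assumed) IDCT
wants, and the loop body is identical in both cases.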
>
> OK. I read some comments in mpegvideo.c saying that the DCT doesn't use the
> permutation, so we have to permute in C. But the permutation is on the
> TODO.
Yes, the C version sucks, it should be optimized.
BTW even the C IDCT needs the permutation, only the simple IDCT in C does not.
Avoiding the permutation in the quantizer would simply move it into the
IDCT, where it is slower to do, so it really cannot be avoided. The IDCT
simply needs permuted input; it's a math problem, not an implementation
issue. It's like with FFTs: they need a permutation AFAIK or they will be
VERY slow. It's just that in C it isn't that obvious that 8 variables are
accessed in a different order, but in MMX it is very visible, as MMX
performs an operation on several values at once, so we can't silently do it.
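For the quantizer side, a rough sketch in C of why the permutation is cheap
there. Everything here is illustrative: find_last_nonzero and
permute_up_to_last are hypothetical helpers, not the functions in
mpegvideo.c, and idct_perm is an assumed example permutation. Only the scan
positions up to the last nonzero coeff are moved; everything after it is
zero and stays zero wherever it lands.

```c
#include <stdint.h>
#include <string.h>

/* Standard 8x8 zigzag scan: zigzag[i] is the raster index of the
 * i-th coefficient in coding order. */
static const uint8_t zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* ASSUMPTION: hypothetical IDCT input permutation (rotates the three
 * column bits of the raster index); not taken from any real IDCT. */
static int idct_perm(int i)
{
    return (i & 0x38) | ((i & 6) >> 1) | ((i & 1) << 2);
}

/* Scan position of the last nonzero level, or -1 if the block is empty.
 * A real quantizer would know this already as a by-product of quantizing. */
static int find_last_nonzero(const int16_t in[64])
{
    for (int i = 63; i >= 0; i--)
        if (in[zigzag[i]] != 0)
            return i;
    return -1;
}

/* Copy the raster-order block `in` into IDCT order in `out`, touching
 * only scan positions 0..last. Everything else is left at zero. */
static void permute_up_to_last(const int16_t in[64], int16_t out[64],
                               int last)
{
    memset(out, 0, 64 * sizeof(out[0]));
    for (int i = 0; i <= last; i++) {
        int j = zigzag[i];
        out[idct_perm(j)] = in[j];
    }
}
```

With typical blocks having only a few nonzero coeffs, the loop runs a handful
of iterations instead of 64.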
>
> > > BTW, why couldn't SSE help with the DCT/IDCT? How much time is spent on
> > > MMX DCT scaling? Because AFAIK one of the advantages of SSE/SSE2 is
> > > SIMD on floats.
> >
> > Yes, but SSE on both the P3 & P4 needs 2 CPU cycles to do 1 calculation
> > on 4 floats, and MMX needs 1 CPU cycle to do 1 calculation on 4 16-bit
> > shorts, and at least some part of the DCT/IDCT can be done in 16-bit.
> > BTW the "more accurate" SSE IDCT could cause problems, because all
> > those "shitty" players use integer IDCTs, so there could be stripes and
> > green blocks again ...
>
> Note that I don't put MPlayer on the "shitty" players list ;)
> Another question: I remember that the guys from libmpeg2 posted some
> changes to their IDCT on the ffmpeg list after you demonstrated the bias
> problem with their IDCT. Has somebody tested whether these changes fixed
> the bias problem? I believe that having a buggy IDCT by default isn't good
> and I don't understand why it's used by default.
I never saw any changes to walken's IDCT...
[...]
Michael