[Ffmpeg-devel] Re: Using Intel's fDCT

Sun Nov 20 19:21:11 CET 2005

 <g> writes:
> 
> Perhaps I have to add a new permutation step to the fDCT function before 
> quantisation when using Intel's fDCT?
> 
> Can anyone explain what is going on?

Intel's fDCT is noticeably faster than ff_fdct_sse2, so there are evidently 
some improvements that could be made to ff_fdct_sse2. However, ff_fdct_sse2 
doesn't appear to do a straightforward transform, and I couldn't find any 
documentation or comments that explain what it is doing.

To compare it with Intel's fDCT I fed in the following data:

DCTELEM input[64] = { 
0,0,0,0,0,0,0,0,
0,1,1,1,1,1,1,1,
0,1,2,2,2,2,2,2,
0,1,2,3,3,3,3,3,
0,1,2,3,4,4,4,4,
0,1,2,3,4,5,5,5,
0,1,2,3,4,5,6,6,
0,1,2,3,4,5,6,7 };

The results of fDCT by Intel's routine were:

Intel
18 -9 -2 -1 0 0 0 0 
-9 7 0 0 0 0 0 0 
-2 0 2 0 0 0 0 0 
-1 0 0 1 0 0 0 0 
0 0 0 0 1 0 0 0 
0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 

And the results using ffmpeg's ff_fdct_sse2 were:

ffmpeg
140 -73 -18 -8 -4 -2 -1 -1 
-72 53 0 1 0 0 0 0 
-17 0 14 0 0 0 0 0 
-8 0 0 7 0 0 0 0 
-4 0 0 0 4 0 0 0 
-3 0 0 0 0 3 0 0 
-1 0 0 0 1 0 2 0 
-1 0 0 0 0 0 0 2

Presumably the quantization stage is also not what I might expect as it must 
be somehow compensating for the strange behaviour of ff_fdct_sse2.

Can someone explain all this to me?

g.