[FFmpeg-devel] [PATCH] Fix mm_flags, mm_support for ARM

Michael Niedermayer michaelni
Sat Jun 28 01:52:00 CEST 2008


On Fri, Jun 27, 2008 at 08:49:08PM +0100, Måns Rullgård wrote:
> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> 
> > On Friday 27 June 2008, matthieu castet wrote:
> >> Benoit Fouet wrote:
> >> > Michael Niedermayer wrote:
> >> >> On Thu, Jun 26, 2008 at 10:45:20PM +0200, matthieu castet wrote:
> >> >
> >> > Matthieu, could you please resend the complete patch ?
> >>
> >> I attach them as 2 separate patch.
> >> fix_dct-test-arm.diff : allow to build dct-test on arm
> >> arm-icdt-test.diff : add some arm idct (there are other version that
> >> could be added, but I have no hardware to test).
> >
> >> +#ifdef ARCH_ARMV4L
> >> +  //{"IDCT_ARM",        1, j_rev_dct_ARM,      idct, MMX_PERM},
> >> /* should be MMX_PERM perm, but produce strange result */
> >
> > Yes, 'j_rev_dct_ARM' is broken, it produces quite noticeable
> > artefacts on decoding video. I think I already reported this issue
> > here long ago. But as it is armv4 only, I was not interested in
> > investigating this problem further...
> 
> It is indeed in pretty poor shape.  It's pretty fast though, if that's
> any excuse.
> 
> For completeness, here's the output of dct-test -i including all the
> ARM IDCT variants on a 500MHz Cortex-A8.  It is interesting to note
> that the ones targeting older ARM cores are significantly slower than
> plain C on the Cortex.

dct-test is not a good benchmark; for serious benchmarking, real data
should be used. Many IDCTs contain checks for zero elements, so their
speed depends on the input.
You can also see this difference between dct-test -i 0 and -i 1.

[...]

>    -7    15    -3     8    16    19     5    19 
>   -23     1   -15    10    26     4   -12    -7 
>     2   -22    -5    -5    -4    10     4    12 
>     4    -4   -15    -5    -2    -5     8    -6 
>    10    17    -7    11     7    11    -1    -2 
>   -17     1   -25    -6     1    -2    19    12 
>    -3     7     3    21     3    -8    12   -24 
>    21    -4    13     1    -5    10     8    -5 
> IDCT SIMPLE-C: err_inf=1 err2=0.00667969 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-C: 452.7 kdct/s
> 
>    -6    15    -2     9    16    20     5    21 
>   -23     1   -14    10    26     4   -11    -7 
>     2   -21    -5    -5    -4    10     4    12 
>     5    -4   -14    -2    -2    -4     9    -5 
>    10    17    -7    12     7    11    -1    -2 
>   -17     2   -23    -6     2    -2    19    12 
>    -2     8     5    22     3    -8    12   -24 
>    23    -2    13     1    -5    11     8    -3 
> IDCT SIMPLE-ARM: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARM: 369.6 kdct/s
[...]
>    -6    15    -2     9    16    20     5    21 
>   -23     1   -14    10    26     4   -11    -7 
>     2   -21    -5    -5    -4    10     4    12 
>     5    -4   -14    -2    -2    -4     9    -5 
>    10    17    -7    12     7    11    -1    -2 
>   -17     2   -23    -6     2    -2    19    12 
>    -2     8     5    22     3    -8    12   -24 
>    23    -2    13     1    -5    11     8    -3 
> IDCT SIMPLE-ARMV6: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARMV6: 512.4 kdct/s
> 
>    -6    15    -2     9    16    20     5    21 
>   -23     1   -14    10    26     4   -11    -7 
>     2   -21    -5    -5    -4    10     4    12 
>     5    -4   -14    -2    -2    -4     9    -5 
>    10    17    -7    12     7    11    -1    -2 
>   -17     2   -23    -6     2    -2    19    12 
>    -2     8     5    22     3    -8    12   -24 
>    23    -2    13     1    -5    11     8    -3 
> IDCT SIMPLE-NEON: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-NEON: 783.9 kdct/s

These IDCTs are all buggy: they differ from the C simple IDCT.
I believe this has already been discussed on the ML
(skal, me and Alexander).
We should try hard to avoid introducing a new IDCT with a different "error
landscape".
Do we have someone who has an ARM CPU and can look into the above issue?
Ideally it would be the authors who claimed the code to be identical to the
C code ...
If we have no one, then we will likely have to disable these IDCTs. I do not
want to create files that turn green and pink unless they are played on
an ARM CPU ...

Your patch is ok of course ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle