[FFmpeg-devel] [PATCH] Fix mm_flags, mm_support for ARM

Siarhei Siamashka siarhei.siamashka
Sat Jun 28 09:31:42 CEST 2008

On Saturday 28 June 2008, Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
> > Do we have someone who has an ARM CPU and can look into the above issue?
> I know exactly why it's different.  In simple_idct.c, the column
> transform contains these lines:
>         /* XXX: I did that only to give same values as previous code */
>         a0 = W4 * (col[8*0] + ((1<<(COL_SHIFT-1))/W4));
> It's simpler to code that as a0 = W4 * col[0] + (1 << (COL_SHIFT-1)).
> Thinking about it, it only takes one more instruction on NEON, and
> I've fixed that in my tree.  With a little luck, the extra instruction
> can be dual-issued with something else.

This part does not have any extra overhead in my fine-tuned version:

  ldr    v1, xxx         /* v1 = (((1<<(COL_SHIFT-1))/W4)*W4) */
  [some unrelated instructions to hide load latency]
  smlatt v2, a2, v4, v1  /* A0t = W4 * (col_t[0] + ((1<<(COL_SHIFT-1))/W4)) */

There is no reason why ARMv6 or NEON should have any overhead either. So getting
bit-identical results to C simple_idct is possible without sacrificing
performance.
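
To make the arithmetic concrete, here is a minimal C sketch of the three ways to compute a0. The values W4 = 16383 and COL_SHIFT = 20 are my assumption of what simple_idct.c uses, and the function names are made up for illustration only. The point is that the preloaded constant ((1<<(COL_SHIFT-1))/W4)*W4 fed into a multiply-accumulate reproduces the C reference exactly, while the plain 1 << (COL_SHIFT-1) bias is off by a small constant.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants (believed to match simple_idct.c, not verified here). */
#define W4        16383
#define COL_SHIFT 20

/* Bias the C reference effectively adds: W4 * ((1 << 19) / W4). */
static int32_t bias_reference(void)
{
    return ((1 << (COL_SHIFT - 1)) / W4) * W4;
}

/* Bias of the simplified form: plain 1 << 19. */
static int32_t bias_simplified(void)
{
    return 1 << (COL_SHIFT - 1);
}

/* a0 exactly as written in the C column transform. */
static int32_t a0_reference(int16_t c)
{
    return W4 * (c + ((1 << (COL_SHIFT - 1)) / W4));
}

/* a0 as one multiply-accumulate with the bias preloaded in the accumulator
 * (hypothetical helper mirroring the ldr + smlatt pair above). */
static int32_t a0_preloaded(int16_t c)
{
    return W4 * c + bias_reference();
}

/* a0 with the simplified rounding constant. */
static int32_t a0_simplified(int16_t c)
{
    return W4 * c + bias_simplified();
}

/* Returns 1 if the preloaded form matches the reference for every input. */
static int preloaded_matches_reference(void)
{
    for (int c = INT16_MIN; c <= INT16_MAX; c++) {
        if (a0_preloaded((int16_t)c) != a0_reference((int16_t)c))
            return 0;
    }
    return 1;
}
```

With these assumed constants the two biases are 524256 and 524288, so the simplified form is larger by 32 before the final shift, which is enough to change a rounded output now and then.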

> > Ideally would be the authors who claimed the code to be identical to the
> > C code ...
> I wrote the ARMv6 version, but I never made any such claim.  In fact,
> I believe I mentioned at the time that there was a slight difference.
> > If we have no one then we will likely have to disable these IDCTs. I do
> > not want to create files that turn green and pink unless they are played
> > on an ARM cpu ...
> I think the ARM CPUs where these apply will be used mostly for
> playback, not encoding, and on those machines every cycle counts.

Yes, that was one of the reasons why I did not strongly insist on disabling
j_rev_dct_ARM at that time (people could get severe performance regressions
and complain about it) :)

In any case, the ARMv6 IDCT still needs heavy optimization; it is not very fast
(on its target devices with ARM11 CPUs, of course).

Best regards,
Siarhei Siamashka
