[FFmpeg-devel] [PATCH] ARM: NEON optimised simple_idct
Luca Barbato
lu_zero
Tue Aug 26 13:06:13 CEST 2008
Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
>
>> On Mon, Aug 25, 2008 at 03:53:29PM +0100, Måns Rullgård wrote:
>>> Michael Niedermayer <michaelni at gmx.at> writes:
>>>
>>>> On Mon, Aug 25, 2008 at 04:06:33AM +0100, Mans Rullgard wrote:
>>>>> ---
>>>>> libavcodec/Makefile | 2 +
>>>>> libavcodec/armv4l/dsputil_arm.c | 15 ++
>>>>> libavcodec/armv4l/simple_idct_neon.S | 383 ++++++++++++++++++++++++++++++++++
>>>>> libavcodec/avcodec.h | 1 +
>>>>> libavcodec/utils.c | 1 +
>>>>> 5 files changed, 402 insertions(+), 0 deletions(-)
>>>>> create mode 100644 libavcodec/armv4l/simple_idct_neon.S
>>>>>
>>>> is this idct binary identical in output to the C/MMX simple idct?
>>> Yes.
>>>
>>>>> +#ifdef HAVE_NEON
>>>>> + } else if (idct_algo==FF_IDCT_SIMPLENEON){
>>>>> + c->idct_put= ff_simple_idct_put_neon;
>>>>> + c->idct_add= ff_simple_idct_add_neon;
>>>>> + c->idct = ff_simple_idct_neon;
>>>>> + c->idct_permutation_type = FF_NO_IDCT_PERM;
>>>>> +#endif
>>>> I do not know NEON at all, but I've never seen a SIMD instruction set for
>>>> which the identity permutation would have been optimal.
>>>>
>>>> Also, I suspect that the MMX simple idct is a better basis from which to
>>>> write other SIMD variants of the simple idct than the C one.
>>> I can't read MMX code.
Try the AltiVec one; it should be easy to understand.
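
For anyone following the permutation point quoted above, here is a minimal
sketch of what a coefficient permutation does. It is not the code from the
patch or from libavcodec: the names perm_type, PERM_NONE, PERM_TRANSPOSE,
init_idct_permutation and permute_scantable are made up for illustration, and
the transposed layout is only one assumed example of an order a SIMD idct
might prefer. The idea is that the permutation is folded into the scan table
once, so the idct receives its input already in the layout it wants.

```c
#include <stdint.h>

/* Hypothetical permutation kinds, for illustration only. */
enum perm_type { PERM_NONE, PERM_TRANSPOSE };

/* Fill perm[i] with the position at which coefficient i should be stored. */
static void init_idct_permutation(uint8_t perm[64], enum perm_type type)
{
    for (int i = 0; i < 64; i++) {
        switch (type) {
        case PERM_TRANSPOSE:
            /* swap the row (i >> 3) and column (i & 7) of the 8x8 block */
            perm[i] = (uint8_t)(((i & 7) << 3) | (i >> 3));
            break;
        case PERM_NONE:
        default:
            /* identity: coefficients stay in plain row-major order */
            perm[i] = (uint8_t)i;
            break;
        }
    }
}

/* Fold the permutation into a scan table once, at init time. */
static void permute_scantable(uint8_t dst[64], const uint8_t src[64],
                              const uint8_t perm[64])
{
    for (int i = 0; i < 64; i++)
        dst[i] = perm[src[i]];
}
```

With PERM_NONE the scan table is left unchanged, which corresponds to the
no-permutation case chosen in the quoted hunk. With PERM_TRANSPOSE the decoder
would write each block transposed at no extra per-block cost.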
lu
--
Luca Barbato
Gentoo Council Member
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero