[Ffmpeg-devel] [PATCH] simple_idct_armv5te optimization

Michael Niedermayer michaelni
Sat Sep 30 19:41:19 CEST 2006


Hi

On Sat, Sep 30, 2006 at 07:09:22PM +0300, Siarhei Siamashka wrote:
> Hello All,
> 
> Here is a patch that improves simple IDCT performance on ARMv5TE. The row
> processing is almost completely rewritten, taking instruction scheduling
> into account and avoiding redundant data loads (almost all data is
> processed in registers; there is even one extra, mostly unused 'lr'
> register left :)).
> 
> For benchmarking I have used a modified dct-test on Nokia 770:
> ./dct-test -i
> ...
> IDCT INT: err_inf=1 err2=0.01318437 syserr=0.00285000 maxout=266 blockSumErr=64
> IDCT INT: 136.6 kdct/s
> ...
> IDCT SIMPLE-C: err_inf=1 err2=0.00667969 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-C: 103.6 kdct/s
> ...
> IDCT SIMPLE-ARM: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARM: 140.8 kdct/s
> ...
> IDCT SIMPLE-ARMv5TE: err_inf=1 err2=0.00667969 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARMv5TE: 153.4 kdct/s
> 
> After patch:
> IDCT SIMPLE-ARMv5TE: err_inf=1 err2=0.00667969 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARMv5TE: 158.8 kdct/s

patch looks ok (assuming it's also faster with an actual video, instead of
just dct-test)
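
For reference, the relative gains implied by the quoted dct-test figures can
be worked out with a short script. This is only a sketch: the kdct/s values
are copied from the output quoted above, while the dictionary, the labels
"(before)"/"(after)" and the helper gain_percent are illustrative names, not
part of dct-test or the patch.

```python
# Throughput figures (kdct/s) copied from the quoted dct-test output.
# The "(before)"/"(after)" labels are illustrative, not dct-test output.
kdct_per_s = {
    "INT": 136.6,
    "SIMPLE-C": 103.6,
    "SIMPLE-ARM": 140.8,
    "SIMPLE-ARMv5TE (before)": 153.4,
    "SIMPLE-ARMv5TE (after)": 158.8,
}


def gain_percent(new, old):
    """Relative throughput gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


patched = kdct_per_s["SIMPLE-ARMv5TE (after)"]

# Compare the patched ARMv5TE IDCT against every other measured variant.
for name, rate in kdct_per_s.items():
    if name != "SIMPLE-ARMv5TE (after)":
        print(f"{name}: {gain_percent(patched, rate):+.1f}%")
```

On these numbers the patched ARMv5TE IDCT is roughly 3.5% faster than the
unpatched one in dct-test, which says nothing yet about decoding speed on
an actual video.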

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is



