[FFmpeg-devel] [PATCH] Some ARM VFP optimizations (vector_fmul, vector_fmul_reverse, float_to_int16)
Sun Apr 27 13:26:02 CEST 2008
On Monday 21 April 2008, Michael Niedermayer wrote:
> > > To answer 3:
> > > huge speed loss, and that's why this isn't a solution
> > Where did you get this idea? Actually, using the current FFmpeg
> > implementation of the ARMv5TE IDCT is a huge speed loss :)
> > The proposed upgrade is not perfect, but it can still be improved
> > further. And it will provide a performance improvement, and provide it
> > right now, before this hardware (ARMv5TE is already old) gets completely
> > outdated and abandoned by everyone...
> Well, if you insist on this messy stack realign in the innermost loop, then
> I am fine with it, if you provide some benchmarks (with the realign enabled)
> which are faster than the current code.
Technically speaking, it is definitely NOT the innermost loop, as there are
some loops inside the IDCT function too. But if you keep insisting that
it is "messy", I'm fine with that and even agree :) I would surely
prefer it if there were no need for this workaround at all.
I'll submit the latest revision of the IDCT patch and repost the benchmark
results (ARM9E and ARM11) in a separate thread once we make some
progress with the VFP optimizations.