[FFmpeg-devel] [PATCH] SSE dct32()
Michael Niedermayer
michaelni
Sun Jun 20 19:51:28 CEST 2010
On Sun, Jun 20, 2010 at 01:12:48PM +0100, Måns Rullgård wrote:
> Vitor Sessak <vitor1001 at gmail.com> writes:
>
> > On 06/20/2010 01:33 PM, Måns Rullgård wrote:
> >> Vitor Sessak<vitor1001 at gmail.com> writes:
> >>
> >>> On 06/20/2010 12:15 PM, Måns Rullgård wrote:
> >>>> Vitor Sessak<vitor1001 at gmail.com> writes:
> >>>>
> >>>>>>> I don't remember seeing a big difference _for the dct32 code_ between in ==
> >>>>>>> out and in != out.
> >>>>>>
> >>>>>> now I am confused, I thought the 3% you quoted was about in == out vs
> >>>>>> in != out?
> >>>>>
> >>>>> No, the 3% slowdown was when converting our general code (using FFT)
> >>>>> to have in != out.
> >>>>
> >>>> And that was due to missed optimisations caused by gcc not knowing
> >>>> that those pointers don't alias each other. Marking them restrict is
> >>>> not good either, since we actually want to pass the same value
> >>>> sometimes.
> >>>
> >>> That and one extra used register.
> >>
> >> So what do we do? I see the following options:
> >>
> >> 1. Change mp3 decoder to work with inplace transform.
> >
> > Looks hard with no speed loss
>
> Just hard or impossible?
hard, not impossible
just consider that dct32() trashes its input array

Either way, the in != out thing is not a big issue if it's not slower.
What is a big issue is that high-level optimizations have to be done
before asm optimisations.
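
To make the aliasing point quoted above concrete, here is a minimal sketch in C.
butterfly_stage() is a hypothetical first stage of a 32-point transform, not
FFmpeg's dct32(); the name, the buffer layout and the self-check helper are
assumptions made for illustration. Because callers may pass out == in, the
pointers cannot be marked restrict, so the compiler has to assume that a store
through out may change a later load through in.

```c
#include <assert.h>
#include <string.h>

#define N 32

/* Hypothetical first butterfly stage of a 32-point transform.
 * Not FFmpeg's dct32(); names and layout are assumptions.
 * The pointers are deliberately NOT marked restrict, because
 * callers may pass out == in. */
static void butterfly_stage(float *out, const float *in)
{
    assert(out && in);
    for (int i = 0; i < N / 2; i++) {
        /* Load both inputs of the pair before storing either result,
         * so the call stays correct when out == in. */
        float a = in[i];
        float b = in[N - 1 - i];
        out[i]         = a + b;
        out[N - 1 - i] = a - b;
    }
}

/* Returns 1 if the in-place call gives the same bits as the
 * out-of-place call on the same input. */
static int inplace_matches_out_of_place(void)
{
    float src[N], dst[N], buf[N];
    for (int i = 0; i < N; i++)
        src[i] = (float)(i * i % 7) - 3.0f;
    memcpy(buf, src, sizeof(buf));
    butterfly_stage(dst, src);   /* in != out */
    butterfly_stage(buf, buf);   /* in == out */
    return memcmp(dst, buf, sizeof(dst)) == 0;
}
```

Each (i, N-1-i) pair is read completely before it is written, and the pairs do
not overlap, which is why the same function body serves both calling styles.
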
Is our dct32() code optimal? If I didn't miscount, mp3lib does 4 butterflies
fewer, but I could have miscounted. Also, our dct32() should be benchmarked
against the dct32() code from other mp3 decoders to make sure our high-level
code is ok before one starts writing asm for it.
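
A rough sketch of what such a comparison could look like in C follows. All of
it is assumed for illustration: the dct32_fn signature, the helper names, and
the choice of an unnormalized DCT-II as the reference (a real dct32() may scale
or order its outputs differently, so the reference would need adapting). Since
dct32() trashes its input, the harness restores the input before every call.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

#define N 32

/* Assumed common signature: out and in are distinct buffers and the
 * callee is allowed to trash in[]. */
typedef void (*dct32_fn)(float *out, float *in);

/* Slow O(N^2) unnormalized DCT-II, used only as a correctness baseline.
 * cos(j*pi/64) is built by repeated rotation, so no libm is needed. */
static void dct32_reference(float *out, float *in)
{
    static double tab[4 * N];          /* cos(j * pi / 64), j = 0..127 */
    static int ready;
    if (!ready) {
        const double c1 = 0.99879545620517240;  /* cos(pi/64) */
        const double s1 = 0.04906767432741801;  /* sin(pi/64) */
        double c = 1.0, s = 0.0;
        for (int j = 0; j < 4 * N; j++) {
            tab[j] = c;
            double nc = c * c1 - s * s1;
            s = s * c1 + c * s1;
            c = nc;
        }
        ready = 1;
    }
    for (int k = 0; k < N; k++) {
        double acc = 0.0;
        for (int n = 0; n < N; n++)
            acc += in[n] * tab[((2 * n + 1) * k) % (4 * N)];
        out[k] = (float)acc;
    }
}

/* Largest absolute difference between a candidate and the reference. */
static float max_abs_error(dct32_fn candidate)
{
    float a[N], b[N], ref[N], got[N], worst = 0.0f;
    for (int i = 0; i < N; i++)
        a[i] = b[i] = (float)((i * 37) % 11) - 5.0f;
    dct32_reference(ref, a);
    candidate(got, b);                 /* b may be trashed here */
    for (int i = 0; i < N; i++) {
        float d = ref[i] - got[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}

/* Average CPU seconds per call, measured with clock(). */
static double seconds_per_call(dct32_fn fn, long iterations)
{
    float src[N], work[N], out[N];
    volatile float sink = 0.0f;        /* keeps the calls from being elided */
    assert(iterations > 0);
    for (int i = 0; i < N; i++)
        src[i] = (float)i - 15.5f;
    clock_t t0 = clock();
    for (long it = 0; it < iterations; it++) {
        memcpy(work, src, sizeof(work));   /* restore the trashed input */
        fn(out, work);
        sink += out[it % N];
    }
    clock_t t1 = clock();
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC / (double)iterations;
}
```

Each candidate would be wrapped to the dct32_fn signature, checked with
max_abs_error() first and timed with seconds_per_call() afterwards. The memcpy
cost is the same for every candidate, so it shifts all timings equally.
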
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
There will always be a question for which you do not know the correct awnser.