[FFmpeg-devel] [PATCH] SSE dct32()

Michael Niedermayer michaelni
Sun Jun 20 19:51:28 CEST 2010


On Sun, Jun 20, 2010 at 01:12:48PM +0100, Måns Rullgård wrote:
> Vitor Sessak <vitor1001 at gmail.com> writes:
> 
> > On 06/20/2010 01:33 PM, Måns Rullgård wrote:
> >> Vitor Sessak<vitor1001 at gmail.com>  writes:
> >>
> >>> On 06/20/2010 12:15 PM, Måns Rullgård wrote:
> >>>> Vitor Sessak<vitor1001 at gmail.com>   writes:
> >>>>
> >>>>>>> I don't remember seeing a big difference _for the dct32 code_ between in ==
> >>>>>>> out and in != out.
> >>>>>>
> >>>>>> Now I am confused; I thought the 3% you quoted was about in == out
> >>>>>> vs. in != out?
> >>>>>
> >>>>> No, the 3% slowdown was when converting our general code (using FFT)
> >>>>> to have in != out.
> >>>>
> >>>> And that was due to missed optimisations caused by gcc not knowing
> >>>> that those pointers don't alias each other.  Marking them restrict is
> >>>> not good either, since we actually want to pass the same value
> >>>> sometimes.
> >>>
> >>> That and one extra used register.
> >>
> >> So what do we do?  I see the following options:
> >>
> >> 1. Change mp3 decoder to work with inplace transform.
> >
> > Looks hard to do without a speed loss
> 
> Just hard or impossible?

Hard, not impossible.
Just consider that dct32() trashes its input array.
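
To make "trashes its input array" concrete, here is a minimal sketch. It is
not the real dct32(); toy_transform() and toy_transform_keep_input() are
hypothetical names, and the two passes are placeholders. The only point is
that the first pass is stored back into `in`, so a caller that still needs
its data has to pay for a copy.

```c
#include <string.h>

#define N 32

/* Toy two-pass transform, NOT the real dct32(): pass 1 uses `in` as scratch. */
void toy_transform(float *out, float *in)
{
    for (int i = 0; i < N / 2; i++) {
        float a = in[i];
        float b = in[N - 1 - i];
        in[i]         = a + b;   /* overwrites the caller's data */
        in[N - 1 - i] = a - b;
    }
    for (int i = 0; i < N; i++)
        out[i] = 0.5f * in[i];   /* placeholder second pass */
}

/* Hypothetical wrapper for callers that need `in` intact afterwards. */
void toy_transform_keep_input(float *out, const float *in)
{
    float scratch[N];
    memcpy(scratch, in, sizeof(scratch));   /* the extra cost of preserving in */
    toy_transform(out, scratch);
}
```

A decoder that reads the same samples again after the transform would need
the wrapper (or a restructured data flow), which is where the difficulty of
an in-place conversion comes from.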

Either way, the in != out thing is not a big issue if it's not slower.
What is a big issue is that high-level optimizations have to be done
before asm optimizations.
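
For reference, a rough sketch of the aliasing point quoted above, assuming a
single add/sub butterfly pass over 32 floats (butterfly_pass() and
inplace_matches_outofplace() are made-up names, not code from the patch).
Without restrict the compiler has to assume that a store to out[] may change
in[], which limits reordering. With restrict, calling it with in == out
would be undefined. Loading both operands into locals before storing keeps
the in == out call valid.

```c
#include <string.h>

#define N 32

/* One add/sub butterfly pass. No restrict: the caller may pass out == in. */
void butterfly_pass(float *out, const float *in)
{
    for (int i = 0; i < N / 2; i++) {
        /* Read both inputs before either store so out == in stays correct. */
        float a = in[i];
        float b = in[N - 1 - i];
        out[i]         = a + b;
        out[N - 1 - i] = a - b;
    }
}

/* Returns 1 if the in-place and out-of-place calls produce identical output. */
int inplace_matches_outofplace(void)
{
    float src[N], dst[N], buf[N];
    for (int i = 0; i < N; i++)
        src[i] = (float)(i * i % 7) - 3.0f;
    memcpy(buf, src, sizeof(buf));
    butterfly_pass(dst, src);   /* in != out */
    butterfly_pass(buf, buf);   /* in == out */
    return memcmp(dst, buf, sizeof(dst)) == 0;
}
```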

Is our dct32() code optimal? If I didn't miscount, mp3lib does 4 butterflies
fewer, but I could have miscounted. Also, our dct32() should be benchmarked
against the dct32() code from other mp3 decoders to make sure our high-level
code is OK before anyone starts writing asm for it.
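
A possible shape for such a comparison, as a sketch only: the dct32_fn
signature, max_diff() and bench() are assumptions, and candidate_a() /
candidate_b() are stand-ins (a single butterfly pass each) where the real
dct32() implementations from the different decoders would be plugged in.
The harness first checks that two candidates agree, then times each one.

```c
#include <time.h>

#define DCT_N 32

/* Hypothetical common signature for every implementation under test. */
typedef void (*dct32_fn)(float *out, const float *in);

/* Stand-in candidate A: one add/sub butterfly pass (NOT a real dct32). */
void candidate_a(float *out, const float *in)
{
    for (int i = 0; i < DCT_N / 2; i++) {
        out[i]             = in[i] + in[DCT_N - 1 - i];
        out[DCT_N - 1 - i] = in[i] - in[DCT_N - 1 - i];
    }
}

/* Stand-in candidate B: same result, walking from the middle outwards. */
void candidate_b(float *out, const float *in)
{
    for (int i = DCT_N / 2 - 1; i >= 0; i--) {
        float a = in[i], b = in[DCT_N - 1 - i];
        out[i]             = a + b;
        out[DCT_N - 1 - i] = a - b;
    }
}

/* Largest absolute output difference between two candidates on one input. */
double max_diff(dct32_fn f, dct32_fn g, const float *in)
{
    float x[DCT_N], y[DCT_N];
    double worst = 0.0;
    f(x, in);
    g(y, in);
    for (int k = 0; k < DCT_N; k++) {
        double d = (double)x[k] - (double)y[k];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}

/* CPU seconds spent on `iters` calls of one candidate with the same input. */
double bench(dct32_fn f, const float *in, long iters)
{
    float out[DCT_N];
    volatile float sink = 0.0f;   /* keeps the calls from being optimised away */
    clock_t start = clock();
    for (long i = 0; i < iters; i++) {
        f(out, in);
        sink += out[0];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

clock() is coarse, so real numbers would need many iterations or a cycle
counter; real implementations may also differ in output scaling, so the
agreement check would need a tolerance rather than an exact match.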

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.