[MPlayer-dev-eng] [PATCH] (new version) AltiVec: dct64 for mp3lib, IMDCT for liba52, detection code

Mon Jan 20 13:06:06 CET 2003

On Mon, 2003-01-20 at 10:20, Romain Dolbeau wrote:

> Dual load + vec_lvsl is OK. It isn't that slow, in practice.
> Unaligned stores are the real killers.

Yes, though the loads are annoying too. I find it hard to
understand why it is so complicated to keep the data aligned,
when that worked flawlessly before. This hurts every platform
with SIMD instructions.
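
For reference, the "dual load + vec_lvsl" idiom can be emulated in
portable C. This is only a sketch, not code from the patch: the helper
name load_unaligned16 is hypothetical, the memcpy calls stand in for two
aligned vec_ld loads, and the indexing loop stands in for vec_perm with
the mask from vec_lvsl. It assumes buf is 16-byte aligned and that the
32 bytes starting at the aligned block are readable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emulates an unaligned 16-byte load built from
 * two aligned 16-byte loads plus a byte permute.
 * Assumes buf is 16-byte aligned and buf[block .. block+31] is readable. */
static void load_unaligned16(const uint8_t *buf, size_t offset, uint8_t out[16])
{
    size_t block = offset & ~(size_t)15;   /* aligned block containing offset */
    size_t shift = offset & 15;            /* misalignment, as vec_lvsl encodes it */
    uint8_t two_blocks[32];
    size_t i;

    memcpy(two_blocks, buf + block, 16);            /* first aligned load */
    memcpy(two_blocks + 16, buf + block + 16, 16);  /* second aligned load;
                                                       unused when shift == 0 */

    /* The permute: pick 16 consecutive bytes out of the 32-byte pair. */
    for (i = 0; i < 16; i++)
        out[i] = two_blocks[shift + i];
}
```

The second load is the one that can turn out to be spurious. A store
has no such cheap trick, since it has to read, merge and write back both
blocks.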

> My last patch to ffmpeg mistakenly "fixed" the stride. I used
> to pre-compute it, then put it back in the loop for some
> functions. This is stupid, as if line_size % 16 != 0, then all
> loads from 16-byte-aligned blocks become wrong anyway.

We should try to get a definitive statement here. I believe that
line_size is always a multiple of 16 for the formats we're interested
in; even VirtualDub breaks when this is not the case.

Michael?
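
To make the line_size point concrete: if line_size is a multiple of 16,
every row starts at the same offset within its 16-byte block, so the
alignment can be computed once outside the loop. A minimal sketch in
plain C (the function name row_alignment_is_constant is made up for
illustration and is not from ffmpeg):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: returns 1 if all `rows` row start addresses share
 * the misalignment (address mod 16) of the first row, 0 otherwise. */
static int row_alignment_is_constant(uintptr_t base, ptrdiff_t line_size, int rows)
{
    uintptr_t first = base & 15;   /* misalignment of row 0 */
    int y;

    for (y = 1; y < rows; y++) {
        /* Address of row y; unsigned wrap-around is fine for a mod-16 test. */
        uintptr_t row = base + (uintptr_t)((ptrdiff_t)y * line_size);
        if ((row & 15) != first)
            return 0;
    }
    return 1;
}
```

With line_size = 32 this returns 1 for any base address; with
line_size = 36 the misalignment already changes on the second row.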

> I talked about that to my PhD advisor, he also thinks it's
> not useful to try to remove the (potentially) spurious load
> in put_pixels8_x2y_altivec. More code, more branches, and the
> only gain is to avoid accessing a cache line in 7/32 of
> the calls (assuming an even distribution of alignments).
> And this is costly only if the line isn't in the L1.

Oooh, you're writing your thesis about AltiVec in mplayer?

-- 
Servus,
       Daniel