[Ffmpeg-devel] [PATCH] h264 - loopify some get_cabac calls
Alexander Strange
astrange
Sun Mar 25 04:07:46 CEST 2007
On Mar 24, 2007, at 7:46 PM, Guillaume Poirier wrote:
>> There's some more AltiVec code here we'll probably send soon:
>> http://trac.perian.org/ticket/113
> I had a quick look at http://trac.perian.org/attachment/ticket/113/
> altivec_lum.3.diff
>
> Even though I imagine this patch isn't yet ready to be submitted,
> I'd like to ask whether, in your opinion, the transpose routines can
> make do without accessing memory (do it all in registers).
They actually do; the patch is just messy enough to hide it.
The functions transpose4x4 and readVector are never called.
transpose4x16 and transpose6x16 only touch memory because the initial
loads and the final stores are integrated into them.
I think the stuff in transpose6x16 can be cleaned up; it should be
able to use vec_ste instead of copying the result array.
But this is my first time studying it too; I didn't write it.
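
To illustrate what "all in registers" means here, below is a rough sketch
of a 4x4 transpose built only from merge operations. It is not the code
from the patch: it is portable C that emulates vec_mergeh and vec_mergel
on 32-bit elements, and the names vec4w, emu_mergeh, emu_mergel and
emu_transpose4x4_words are made up for this example. On real hardware each
emu_ call would be a single AltiVec instruction with no loads or stores
in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one 128-bit register holding four words. */
typedef struct { uint32_t w[4]; } vec4w;

/* Emulates vec_mergeh on words: interleave the high (first) halves. */
static vec4w emu_mergeh(vec4w a, vec4w b)
{
    vec4w r = { { a.w[0], b.w[0], a.w[1], b.w[1] } };
    return r;
}

/* Emulates vec_mergel on words: interleave the low (last) halves. */
static vec4w emu_mergel(vec4w a, vec4w b)
{
    vec4w r = { { a.w[2], b.w[2], a.w[3], b.w[3] } };
    return r;
}

/* Transposes the 4x4 word matrix held in r[0..3], in place.
 * Two rounds of merges, eight operations in total. */
static void emu_transpose4x4_words(vec4w r[4])
{
    assert(r != NULL);

    /* Round 1: pair row 0 with row 2, and row 1 with row 3. */
    vec4w t0 = emu_mergeh(r[0], r[2]);   /* a0 c0 a1 c1 */
    vec4w t1 = emu_mergeh(r[1], r[3]);   /* b0 d0 b1 d1 */
    vec4w t2 = emu_mergel(r[0], r[2]);   /* a2 c2 a3 c3 */
    vec4w t3 = emu_mergel(r[1], r[3]);   /* b2 d2 b3 d3 */

    /* Round 2: interleave the intermediates to form the columns. */
    r[0] = emu_mergeh(t0, t1);           /* a0 b0 c0 d0 */
    r[1] = emu_mergel(t0, t1);           /* a1 b1 c1 d1 */
    r[2] = emu_mergeh(t2, t3);           /* a2 b2 c2 d2 */
    r[3] = emu_mergel(t2, t3);           /* a3 b3 c3 d3 */
}
```

The routines in the patch work on different element sizes and shapes, so
take this only as the general pattern.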
> Also more cycles could be saved if you take advantage of some known
> alignments (an 8-byte-aligned load/store can be made faster than a
> generic unaligned memory access)...
Hm, doesn't AltiVec use the same unaligned load method for both?
(load the vectors at x and x+15, then merge them)
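
For reference, here is a minimal sketch of the load-and-merge method I
mean, written as portable C that emulates vec_ld, vec_lvsl and vec_perm
rather than using the real intrinsics. The names vec16, emu_vec_ld,
emu_vec_lvsl, emu_vec_perm and load_unaligned are invented for this
example, and addresses are modelled as offsets into a buffer whose start
is assumed to be 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 16-byte AltiVec register. */
typedef struct { uint8_t b[16]; } vec16;

/* Emulates vec_ld: the low 4 address bits are ignored, so the load
 * always fetches the enclosing 16-byte-aligned block. */
static vec16 emu_vec_ld(const uint8_t *buf, size_t offset)
{
    vec16 v;
    memcpy(v.b, buf + (offset & ~(size_t)15), 16);
    return v;
}

/* Emulates vec_lvsl: permute control {s, s+1, ..., s+15}, s = addr & 15. */
static vec16 emu_vec_lvsl(size_t offset)
{
    vec16 v;
    for (int i = 0; i < 16; i++)
        v.b[i] = (uint8_t)((offset & 15) + i);
    return v;
}

/* Emulates vec_perm: each control byte (low 5 bits) picks one byte
 * out of the 32-byte concatenation of a and b. */
static vec16 emu_vec_perm(vec16 a, vec16 b, vec16 ctl)
{
    vec16 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = ctl.b[i] & 31u;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Generic unaligned load: two aligned loads plus one permute.
 * buf is assumed to start on a 16-byte boundary. */
static vec16 load_unaligned(const uint8_t *buf, size_t offset)
{
    assert(buf != NULL);
    vec16 lo = emu_vec_ld(buf, offset);        /* block containing x      */
    vec16 hi = emu_vec_ld(buf, offset + 15);   /* block containing x + 15 */
    return emu_vec_perm(lo, hi, emu_vec_lvsl(offset));
}
```

When the offset is already a multiple of 16, both loads hit the same
block and the permute is an identity, so in that case a single vec_ld
would be enough.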
>
> I do realize that this patch isn't meant for submission yet, I just
> wanted to give some kind of feedback.
>
>
> Guillaume