[Ffmpeg-devel] [PATCH] h264 - loopify some get_cabac calls
Guillaume POIRIER
poirierg
Sun Mar 25 22:09:56 CEST 2007
Hi,
On 3/25/07, Alexander Strange <astrange at ithinksw.com> wrote:
>
> On Mar 24, 2007, at 7:46 PM, Guillaume Poirier wrote:
>
>
> >> There's some more AltiVec code here we'll probably send soon:
> >> http://trac.perian.org/ticket/113
> > I had a quick look at http://trac.perian.org/attachment/ticket/113/
> > altivec_lum.3.diff
> >
> > Even though I imagine this patch isn't yet ready to be submitted,
> > I'd like to ask whether, in your opinion, the transpose routines can
> > make do without accessing memory (do it all in registers).
>
> They actually do, that patch is just messy enough to hide it.
> The functions transpose4x4 and readVector aren't ever called.
>
> transpose4/6x16 only do memory operations because the initial loads
> and stores are integrated into them.
Ok, I hadn't looked carefully.
> I think the stuff in transpose6x16 can be cleaned up; it should be
> able to use vec_ste instead of copying the result array.
>
> But this is my first time studying it too; I didn't write it.
Ok. Who may that be? I attached a patch that uses these AltiVec
routines in x264. It looks like they don't produce bit-identical
results to the C version, but maybe that's just because I haven't
modified what is needed to make them work in the x264 environment.
> > Also more cycles could be saved if you take advantage of some known
> > alignments (an 8-byte-aligned load/store can be made faster than a
> > generic unaligned memory access)...
>
> Hm, doesn't Altivec use the same unaligned load method for both?
> (load x and 15+x, merge them)
Well, if you know the alignment in advance, you don't need to compute
the permute vector, that's all. It doesn't save all that much, just a
tiny bit.
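To make the point concrete, here is a rough sketch in portable C that only models the idea; it is not the code from the patch. The names vec16, aligned_load, make_shift_left_mask, permute, load_unaligned, load_known_aligned and self_check are made up for illustration, and the three helpers merely imitate what I assume vec_ld, vec_lvsl and vec_perm do on a 16-byte vector.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte AltiVec register. */
typedef struct { uint8_t b[16]; } vec16;

/* Models vec_ld: the low 4 bits of the address are ignored, so the
 * load always comes from the enclosing 16-byte block. `base` stands
 * for a 16-byte-aligned buffer; `offset` is the byte offset into it. */
static vec16 aligned_load(const uint8_t *base, size_t offset)
{
    vec16 v;
    memcpy(v.b, base + (offset & ~(size_t)15), 16);
    return v;
}

/* Models vec_lvsl: builds the permute vector {s, s+1, ..., s+15}
 * where s is the misalignment (offset & 15). */
static vec16 make_shift_left_mask(size_t offset)
{
    vec16 m;
    for (int i = 0; i < 16; i++)
        m.b[i] = (uint8_t)((offset & 15) + i);
    return m;
}

/* Models vec_perm: each mask byte picks one byte out of the 32-byte
 * concatenation of a and b. */
static vec16 permute(vec16 a, vec16 b, vec16 mask)
{
    vec16 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = mask.b[i] & 31;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Generic path: load x and 15+x, compute the permute vector, merge. */
static vec16 load_unaligned(const uint8_t *base, size_t offset)
{
    vec16 lo = aligned_load(base, offset);
    vec16 hi = aligned_load(base, offset + 15);
    return permute(lo, hi, make_shift_left_mask(offset));
}

/* Known 16-byte alignment: one load, no permute vector at all. */
static vec16 load_known_aligned(const uint8_t *base, size_t offset)
{
    return aligned_load(base, offset);
}

/* Returns 1 if the generic path reproduces memory at every offset and
 * matches the short path whenever the offset is 16-byte aligned. */
static int self_check(void)
{
    uint8_t buf[48];
    for (int i = 0; i < 48; i++)
        buf[i] = (uint8_t)i;
    for (size_t off = 0; off <= 32; off++) {
        vec16 u = load_unaligned(buf, off);
        if (memcmp(u.b, buf + off, 16) != 0)
            return 0;
        if ((off & 15) == 0) {
            vec16 k = load_known_aligned(buf, off);
            if (memcmp(u.b, k.b, 16) != 0)
                return 0;
        }
    }
    return 1;
}
```

If the misalignment is known but non-zero, the same reasoning suggests the mask from make_shift_left_mask could be a constant instead of being computed per call, which is the small saving I meant.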
Guillaume
--
Rich, you're forgetting one thing here: *everybody* except you is
stupid.
Måns Rullgård
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x264_deblock.patch
Type: application/octet-stream
Size: 14448 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070325/818d1b67/attachment.obj>