[FFmpeg-devel] [PATCH] h264 luma interpolation 8x8 for altivec
Luca Barbato
lu_zero
Mon Jun 18 19:32:20 CEST 2007
Mauricio Alvarez wrote:
> Hi All,
>
> Here I'm sending a patch that adds support for luma interpolation of 8x8
> blocks using Altivec. I have tested it on a G5 machine (Linux
> 2.6.15-1.2054_FC5, gcc 4.1.1) using some videos that I use for h264
> research [1]. The resulting video files are md5-identical to those
> generated by the original ffmpeg.
First, thank you for your effort.
> I analyzed this alignment and found that the destination is always
> aligned; based on that, it is possible to remove the re-alignment
> code at each store. I am sending a separate patch for this. It has also
> been tested, and the md5 check passed.
Good
> Additionally I'm working on Altivec functions for doing the luma
> interpolation for non-square blocks: 16x8, 8x16, 8x4 and 4x8. The
> implementation of the functions is very easy. My question is how to
> integrate them with the DSPContext structure. One option would be to add
> positions to the XXX_pixels_tab[][] structure, like this:
> index | size
> 0: 16x16
> 1: 8x8
> 2: 4x4
> 3: 16x8
> 4: 8x16
> 5: 8x4
> 6: 4x8
I'd like to know the opinion of the other people involved (x86 hackers,
I'm speaking to you ^^).
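To make the proposed layout concrete, here is a minimal sketch of how a block size could be mapped to the table positions listed above. The enum and the helper `qpel_tab_index` are hypothetical names for illustration only; they are not part of DSPContext or of the patch.

```c
/* Hypothetical index layout, copied from the table proposed above.
 * Not the actual DSPContext definition. */
enum qpel_tab_pos {
    QPEL_16x16 = 0,
    QPEL_8x8   = 1,
    QPEL_4x4   = 2,
    QPEL_16x8  = 3,
    QPEL_8x16  = 4,
    QPEL_8x4   = 5,
    QPEL_4x8   = 6,
    QPEL_NB    = 7
};

/* Map a block width/height to its table position, or -1 if the size
 * has no entry in the proposed layout. */
static int qpel_tab_index(int w, int h)
{
    if (w == 16 && h == 16) return QPEL_16x16;
    if (w ==  8 && h ==  8) return QPEL_8x8;
    if (w ==  4 && h ==  4) return QPEL_4x4;
    if (w == 16 && h ==  8) return QPEL_16x8;
    if (w ==  8 && h == 16) return QPEL_8x16;
    if (w ==  8 && h ==  4) return QPEL_8x4;
    if (w ==  4 && h ==  8) return QPEL_4x8;
    return -1;
}
```

With this layout the three square sizes keep their existing positions, so code that indexes only 0..2 would be unaffected; whether that holds for every user of the tables is an assumption.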
> Suggestions on this issue are welcome.
Hm, I'd like to comment on your patches inline, but it looks like my more
often than not idiotic client doesn't understand that text/* should be
shown inline... (hi, thunderbird)
First, the patch is about 700 lines, which is a bit big, so I'll be slow
commenting; maybe you should try to split it into pieces.
> +static void PREFIX_h264_qpel8_h_lowpass_altivec(uint8_t * dst, uint8_t * src, int dstStride, int srcStride) {
> + POWERPC_PERF_DECLARE(PREFIX_h264_qpel8_h_lowpass_num, 1);
Do you really use this? I have been actively deprecating it since last
year and will probably remove it soon if nobody screams. I think dtrace
on Mac OS X and oprofile on Linux cover all our performance counting needs.
> \
> static void OPNAME ## h264_qpel ## SIZE ## _mc10_ ## CODETYPE(uint8_t *dst, uint8_t *src, int stride){ \
>- DECLARE_ALIGNED_16(uint8_t, half[SIZE*SIZE]);\
>- put_h264_qpel ## SIZE ## _h_lowpass_ ## CODETYPE(half, src, SIZE, stride);\
>+ DECLARE_ALIGNED_16(uint8_t, half[16*16]);\
>+ put_h264_qpel ## SIZE ## _h_lowpass_ ## CODETYPE(half, src, 16, stride);\
> OPNAME ## pixels ## SIZE ## _l2_ ## CODETYPE(dst, src, half, stride, stride, SIZE);\
doesn't look right
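For anyone following the hunk: it changes the stride passed to the h_lowpass call from SIZE to 16 and enlarges `half` from SIZE*SIZE to 16*16 bytes. A minimal sketch of the footprint arithmetic follows; the helper name `block_span` is hypothetical and not from the patch.

```c
#include <stddef.h>

/* Number of bytes spanned by a w x h block of uint8_t samples stored
 * with the given row stride: (h - 1) full strides plus one final row. */
static size_t block_span(size_t w, size_t h, size_t stride)
{
    if (h == 0)
        return 0;
    return (h - 1) * stride + w;
}
```

For SIZE = 8, a stride of 8 spans 64 bytes, while a stride of 16 spans 120 bytes, which no longer fits in SIZE*SIZE = 64 but does fit in 16*16 = 256. This is only the arithmetic; it does not settle whether the following l2 call reads `half` with a matching stride.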
>- if ( (unsigned long) dst & 0x0f) {
...
>+ if (((unsigned long)dst) % 16 == 0) {
hm..
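Note that the two conditions test opposite things: the removed one is true when dst is NOT 16-byte aligned, the added one is true when it IS aligned, so the branch bodies would have to be swapped as well for the behaviour to stay the same. A minimal sketch with hypothetical helper names, using uintptr_t where the patch casts to unsigned long:

```c
#include <stdint.h>

/* True when the low four address bits are nonzero, i.e. the pointer
 * is not on a 16-byte boundary (same sense as the removed condition). */
static int is_misaligned16(const void *p)
{
    return ((uintptr_t)p & 0x0f) != 0;
}

/* True when the address is a multiple of 16 (same sense as the added
 * condition). For unsigned values, x % 16 equals x & 0x0f. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

For any pointer, exactly one of the two helpers returns true.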
I guess that's all for now...
--
Luca Barbato
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero