[FFmpeg-devel] [PATCH] VP8: correctly use optimal epel functions for splitmv mode
David Conrad
lessen42
Mon Jun 28 09:04:13 CEST 2010
On Jun 27, 2010, at 1:57 PM, Ronald S. Bultje wrote:
> Hi,
>
> On Sat, Jun 26, 2010 at 8:19 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> currently, we apply MC/epel for splitmv coding as 4x4 subblocks (of
>> 4x4px each) in the 16x16px MB. This is suboptimal, because the MVs are
>> actually shared between multiple subblocks, so applying epel in
>> 16x8/8x16/8x8 would be more optimal, particularly if we use SSE2/SSSE3
>> optimizations.
>>
>> The attached patch tries to improve the situation.
>>
>> Once the SSE2/MMX patches are applied, this leads to about 10% speedup
>> for splitmv MBs (5937 to 5486 cycles per whole splitmv-MB for sample
>> 15 in the vector testsuite). Of course this depends on the coding of
>> the MB and thus on the sample. With SSSE3 it probably leads to even
>> better speedups, but I can't test that because my CPU is old.
>
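
To make the saving concrete, here is a minimal standalone sketch (not code from the patch) that derives the pixel rectangle of each partition from a split table and reports how many MC calls a macroblock would need. The `splits` table is an illustrative copy; in particular, the second half of the 8x16 row is an assumption made by symmetry, and `Rect` and `partition_rects` are hypothetical names.

```c
/* Illustrative split layouts: entry i is the partition index of 4x4
 * subblock i (raster order inside the 16x16 macroblock).
 * Row 1 (8x16) is filled in by symmetry; that is an assumption. */
static const unsigned char splits[4][16] = {
    { 0, 0, 0, 0,  0, 0, 0, 0,  1, 1, 1, 1,  1, 1, 1, 1 },      /* 16x8 */
    { 0, 0, 1, 1,  0, 0, 1, 1,  0, 0, 1, 1,  0, 0, 1, 1 },      /* 8x16 */
    { 0, 0, 1, 1,  0, 0, 1, 1,  2, 2, 3, 3,  2, 2, 3, 3 },      /* 8x8  */
    { 0, 1, 2, 3,  4, 5, 6, 7,  8, 9, 10, 11,  12, 13, 14, 15 } /* 4x4  */
};

/* Hypothetical helper type: a partition rectangle in luma pixels. */
typedef struct { int x, y, w, h; } Rect;

/* Compute the bounding box of every partition in one split layout.
 * Returns the number of partitions, i.e. the number of luma MC calls
 * needed if each partition is predicted in one go. */
static int partition_rects(const unsigned char split[16], Rect out[16])
{
    int minx[16], miny[16], maxx[16], maxy[16];
    int n = 0, i, p;

    for (p = 0; p < 16; p++) {
        minx[p] = miny[p] = 4;   /* larger than any subblock coordinate */
        maxx[p] = maxy[p] = -1;  /* smaller than any subblock coordinate */
    }
    for (i = 0; i < 16; i++) {
        int bx = i & 3, by = i >> 2;  /* subblock column and row */
        p = split[i];
        if (p + 1 > n)    n = p + 1;
        if (bx < minx[p]) minx[p] = bx;
        if (by < miny[p]) miny[p] = by;
        if (bx > maxx[p]) maxx[p] = bx;
        if (by > maxy[p]) maxy[p] = by;
    }
    for (p = 0; p < n; p++) {
        out[p].x = 4 * minx[p];
        out[p].y = 4 * miny[p];
        out[p].w = 4 * (maxx[p] - minx[p] + 1);
        out[p].h = 4 * (maxy[p] - miny[p] + 1);
    }
    return n;
}
```

With these layouts, the first three modes need 2, 2 and 4 luma calls respectively instead of 16, and each call covers a block at least 8 pixels wide.
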
> New patch against SVN after David's bilinear filter addition.
>
> Ronald
> Index: ffmpeg-svn/libavcodec/vp8.c
> ===================================================================
> --- ffmpeg-svn.orig/libavcodec/vp8.c 2010-06-26 21:50:39.000000000 -0400
> +++ ffmpeg-svn/libavcodec/vp8.c 2010-06-27 13:56:47.000000000 -0400
> @@ -943,6 +943,38 @@
> mc_func[my_idx][mx_idx](dst, linesize, src, linesize, block_h, mx, my);
> }
>
> +static void vp8_mc_part(VP8Context *s, uint8_t *dst[3], AVFrame *ref_frame,
> +                        int x_off, int y_off, int bx_off, int by_off,
> +                        int block_w, int block_h,
> +                        int width, int height, VP56mv *mv)
This should be inline.
(Someone, maybe me, should experiment with manually forcing inlining in vp8.)
> +{
> +    VP56mv uvmv = *mv;
> +
> +    /* Y */
> +    vp8_mc(s, 1, dst[0] + by_off * s->linesize + bx_off,
> +           ref_frame->data[0], mv, x_off + bx_off, y_off + by_off,
> +           block_w, block_h, width, height, s->linesize,
> +           s->put_pixels_tab[block_w == 8]);
> +
> +    /* U/V */
> +    if (s->profile == 3) {
> +        uvmv.x &= ~7;
> +        uvmv.y &= ~7;
> +    }
> +    x_off >>= 1; y_off >>= 1;
> +    bx_off >>= 1; by_off >>= 1;
> +    width >>= 1; height >>= 1;
> +    block_w >>= 1; block_h >>= 1;
> +    vp8_mc(s, 0, dst[1] + by_off * s->uvlinesize + bx_off,
> +           ref_frame->data[1], &uvmv, x_off + bx_off, y_off + by_off,
> +           block_w, block_h, width, height, s->uvlinesize,
> +           s->put_pixels_tab[1 + (block_w == 4)]);
> +    vp8_mc(s, 0, dst[2] + by_off * s->uvlinesize + bx_off,
> +           ref_frame->data[2], &uvmv, x_off + bx_off, y_off + by_off,
> +           block_w, block_h, width, height, s->uvlinesize,
> +           s->put_pixels_tab[1 + (block_w == 4)]);
>
> @@ -112,7 +119,7 @@
> { -6, -7 } // '110', '111'
> };
>
> -static const uint8_t vp8_mbsplits[4][16] = {
> +static const uint8_t vp8_mbsplits[5][16] = {
> { 0, 0, 0, 0, 0, 0, 0, 0,
> 1, 1, 1, 1, 1, 1, 1, 1 },
> { 0, 0, 1, 1, 0, 0, 1, 1,
> @@ -120,7 +127,9 @@
> { 0, 0, 1, 1, 0, 0, 1, 1,
> 2, 2, 3, 3, 2, 2, 3, 3 },
> { 0, 1, 2, 3, 4, 5, 6, 7,
> - 8, 9, 10, 11, 12, 13, 14, 15 }
> + 8, 9, 10, 11, 12, 13, 14, 15 },
> + { 0, 0, 0, 0, 0, 0, 0, 0,
> + 0, 0, 0, 0, 0, 0, 0, 0 }
> };
>
> static const uint8_t vp8_mbfirstidx[4][16] = {
Is this related?
OK otherwise
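
For anyone following the chroma half of vp8_mc_part, here is a small standalone sketch of the arithmetic it performs. It only mirrors the quoted hunk: every luma quantity is halved for the subsampled planes, the MV has its low three bits cleared when the profile is 3, and the function-table index is picked from the block width. The type and function names (`McGeom`, `Mv`, `chroma_geom`, `chroma_mv`, `luma_tab_idx`, `chroma_tab_idx`) are hypothetical, and reading the table index as 0 for 16-wide, 1 for 8-wide and 2 for 4-wide blocks is my interpretation of the two index expressions in the patch.

```c
/* Hypothetical stand-ins for the values vp8_mc_part works with. */
typedef struct { int x, y; } Mv;
typedef struct {
    int x_off, y_off;      /* macroblock position */
    int bx_off, by_off;    /* partition offset inside the macroblock */
    int block_w, block_h;  /* partition size */
    int width, height;     /* plane size */
} McGeom;

/* Halve every luma quantity to get the matching chroma quantity,
 * as the patch does with its run of ">>= 1" statements. */
static McGeom chroma_geom(McGeom g)
{
    g.x_off   >>= 1; g.y_off   >>= 1;
    g.bx_off  >>= 1; g.by_off  >>= 1;
    g.width   >>= 1; g.height  >>= 1;
    g.block_w >>= 1; g.block_h >>= 1;
    return g;
}

/* For profile 3 the patch clears the low three bits of the chroma MV,
 * which drops its fractional part; other profiles keep it unchanged. */
static Mv chroma_mv(Mv mv, int profile)
{
    if (profile == 3) {
        mv.x &= ~7;
        mv.y &= ~7;
    }
    return mv;
}

/* Table index used for luma in the patch: 16-wide -> 0, 8-wide -> 1. */
static int luma_tab_idx(int block_w)   { return block_w == 8; }

/* Table index used for chroma (after halving): 8-wide -> 1, 4-wide -> 2. */
static int chroma_tab_idx(int block_w) { return 1 + (block_w == 4); }
```

Note that the two index expressions agree with each other: an 8-wide block selects entry 1 whether it is a luma partition or the chroma half of a 16-wide one.
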