[FFmpeg-devel] [PATCH] VP8 MMX optimizations (MC and IDCT dc_add)
Måns Rullgård
Wed Jun 23 13:15:25 CEST 2010
Jason Garrett-Glaser <darkshikari at gmail.com> writes:
> +static void ff_put_vp8_epel16_h ## TAPNUMX ## v ## TAPNUMY ## _ ## INSTR( \
> + uint8_t *dst, \
> + uint8_t *src, \
> + int stride, int height, \
> + int mx, int my) \
> +{ \
> + uint8_t tmp_arr[stride * (16 + TAPNUMY - 1)], \
This is insane. Not only is it a VLA, which is bad in itself, it's a
HUGE one. For an HD video, it will be roughly 40 kB (a stride of about
1920 bytes times 21 rows for a six-tap filter), far more than should go
on the stack. You're also using only 16 bytes of each row.
> + *tmp = tmp_arr + stride * (TAPNUMY / 2 - 1); \
> + \
> + ff_put_vp8_epel16_h ## TAPNUMX ## _ ##INSTR(tmp_arr, \
> + src - stride * (TAPNUMY / 2 - 1), \
> + stride, \
> + height + TAPNUMY - 1, mx, my); \
> + ff_put_vp8_epel16_v ## TAPNUMY ## _ ##INSTR(dst, tmp, stride, \
> + height, mx, my); \
> +}
Change these functions to take separate source and dest strides, and
make the temp array a sensible size. Aligning the temp array is
probably a good idea too.
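A rough sketch of what that could look like, in plain C rather than the
MMX/SSE code from the patch. Everything named here (filter_h6, filter_v6,
put_epel16_h6v6, the TAPS/BLOCK_W/MAX_H constants, the taps arguments) is
made up for illustration and is not the patch's API. The sketch assumes a
six-tap worst case, a block height of at most 16, and filter coefficients
that sum to 128. C11 _Alignas stands in for whatever alignment macro the
tree prefers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAPS     6                    /* assumed worst case: six-tap filter */
#define BLOCK_W  16
#define MAX_H    16                   /* assumed maximum block height */
#define TMP_ROWS (MAX_H + TAPS - 1)   /* 21 rows of intermediate data */

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical horizontal pass; reads 2 pixels left and 3 right of src. */
static void filter_h6(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride,
                      int height, const int taps[TAPS])
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < BLOCK_W; x++) {
            int sum = 64;             /* rounding term for the >> 7 below */
            for (int k = 0; k < TAPS; k++)
                sum += taps[k] * src[x + k - (TAPS / 2 - 1)];
            dst[x] = clip_u8(sum >> 7);
        }
        dst += dst_stride;
        src += src_stride;
    }
}

/* Hypothetical vertical pass; reads 2 rows above and 3 rows below src. */
static void filter_v6(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride,
                      int height, const int taps[TAPS])
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < BLOCK_W; x++) {
            int sum = 64;
            for (int k = 0; k < TAPS; k++)
                sum += taps[k] * src[x + (k - (TAPS / 2 - 1)) * src_stride];
            dst[x] = clip_u8(sum >> 7);
        }
        dst += dst_stride;
        src += src_stride;
    }
}

/* Two-pass filter: the temp buffer has a fixed size and its own stride. */
static void put_epel16_h6v6(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *src, ptrdiff_t src_stride,
                            int height,
                            const int taps_x[TAPS], const int taps_y[TAPS])
{
    _Alignas(16) uint8_t tmp[BLOCK_W * TMP_ROWS];   /* 336 bytes, not ~40 kB */
    const int above = TAPS / 2 - 1;                 /* rows needed above the block */

    assert(height > 0 && height <= MAX_H);

    /* Pass 1: write height + 5 rows into tmp, packed at a 16-byte stride. */
    filter_h6(tmp, BLOCK_W, src - above * src_stride, src_stride,
              height + TAPS - 1, taps_x);
    /* Pass 2: read tmp at its own stride, write dst at the picture stride. */
    filter_v6(dst, dst_stride, tmp + above * BLOCK_W, BLOCK_W, height, taps_y);
}
```

With the strides decoupled, the temp array no longer depends on the
picture width at all: 16 * 21 = 336 bytes covers the six-tap case, and
each temp row starts on a 16-byte boundary.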
--
Måns Rullgård
mans at mansr.com