[FFmpeg-devel] [FFmpeg-cvslog] ppc: dsputil: Merge some declarations and initializations

Clément Bœsch u at pkh.me
Fri Mar 21 10:44:41 CET 2014


On Thu, Mar 20, 2014 at 09:57:17PM +0100, Diego Biurrun wrote:
> ffmpeg | branch: master | Diego Biurrun <diego at biurrun.de> | Wed Jan 15 14:36:28 2014 +0100| [b7d24fd4b2213104c001ed504074495568600b9c] | committer: Diego Biurrun
> 
> ppc: dsputil: Merge some declarations and initializations
> 
> > http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=b7d24fd4b2213104c001ed504074495568600b9c
> ---
> 
>  libavcodec/ppc/dsputil_altivec.c |  403 +++++++++++++++++---------------------
>  libavcodec/ppc/dsputil_ppc.c     |    9 +-
>  libavcodec/ppc/fdct_altivec.c    |    3 +-
>  libavcodec/ppc/gmc_altivec.c     |   31 ++-
>  libavcodec/ppc/idct_altivec.c    |   37 ++--
>  libavcodec/ppc/int_altivec.c     |    6 +-
>  6 files changed, 219 insertions(+), 270 deletions(-)
> 
> diff --git a/libavcodec/ppc/dsputil_altivec.c b/libavcodec/ppc/dsputil_altivec.c
> index 2091023..a8985fd 100644
> --- a/libavcodec/ppc/dsputil_altivec.c
> +++ b/libavcodec/ppc/dsputil_altivec.c
[...]
> @@ -903,6 +862,33 @@ static int hadamard8_diff16x8_altivec(/* MpegEncContext */ void *s, uint8_t *dst
>          register vector signed short line3C = vec_add(line3B, line7B);
>          register vector signed short line7C = vec_sub(line3B, line7B);
>  
> +        register vector signed short line0S = vec_add(temp0S, temp1S);
> +        register vector signed short line1S = vec_sub(temp0S, temp1S);
> +        register vector signed short line2S = vec_add(temp2S, temp3S);
> +        register vector signed short line3S = vec_sub(temp2S, temp3S);
> +        register vector signed short line4S = vec_add(temp4S, temp5S);
> +        register vector signed short line5S = vec_sub(temp4S, temp5S);
> +        register vector signed short line6S = vec_add(temp6S, temp7S);
> +        register vector signed short line7S = vec_sub(temp6S, temp7S);
> +
> +        register vector signed short line0BS = vec_add(line0S, line2S);
> +        register vector signed short line2BS = vec_sub(line0S, line2S);
> +        register vector signed short line1BS = vec_add(line1S, line3S);
> +        register vector signed short line3BS = vec_sub(line1S, line3S);
> +        register vector signed short line4BS = vec_add(line4S, line6S);
> +        register vector signed short line6BS = vec_sub(line4S, line6S);
> +        register vector signed short line5BS = vec_add(line5S, line7S);
> +        register vector signed short line7BS = vec_sub(line5S, line7S);
> +
> +        register vector signed short line0CS = vec_add(line0BS, line4BS);
> +        register vector signed short line4CS = vec_sub(line0BS, line4BS);
> +        register vector signed short line1CS = vec_add(line1BS, line5BS);
> +        register vector signed short line5CS = vec_sub(line1BS, line5BS);
> +        register vector signed short line2CS = vec_add(line2BS, line6BS);
> +        register vector signed short line6CS = vec_sub(line2BS, line6BS);
> +        register vector signed short line3CS = vec_add(line3BS, line7BS);
> +        register vector signed short line7CS = vec_sub(line3BS, line7BS);
> +
>          vsum = vec_sum4s(vec_abs(line0C), vec_splat_s32(0));
>          vsum = vec_sum4s(vec_abs(line1C), vsum);
>          vsum = vec_sum4s(vec_abs(line2C), vsum);
> @@ -912,33 +898,6 @@ static int hadamard8_diff16x8_altivec(/* MpegEncContext */ void *s, uint8_t *dst
>          vsum = vec_sum4s(vec_abs(line6C), vsum);
>          vsum = vec_sum4s(vec_abs(line7C), vsum);
>  
> -        line0S = vec_add(temp0S, temp1S);
> -        line1S = vec_sub(temp0S, temp1S);
> -        line2S = vec_add(temp2S, temp3S);
> -        line3S = vec_sub(temp2S, temp3S);
> -        line4S = vec_add(temp4S, temp5S);
> -        line5S = vec_sub(temp4S, temp5S);
> -        line6S = vec_add(temp6S, temp7S);
> -        line7S = vec_sub(temp6S, temp7S);
> -
> -        line0BS = vec_add(line0S, line2S);
> -        line2BS = vec_sub(line0S, line2S);
> -        line1BS = vec_add(line1S, line3S);
> -        line3BS = vec_sub(line1S, line3S);
> -        line4BS = vec_add(line4S, line6S);
> -        line6BS = vec_sub(line4S, line6S);
> -        line5BS = vec_add(line5S, line7S);
> -        line7BS = vec_sub(line5S, line7S);
> -
> -        line0CS = vec_add(line0BS, line4BS);
> -        line4CS = vec_sub(line0BS, line4BS);
> -        line1CS = vec_add(line1BS, line5BS);
> -        line5CS = vec_sub(line1BS, line5BS);
> -        line2CS = vec_add(line2BS, line6BS);
> -        line6CS = vec_sub(line2BS, line6BS);
> -        line3CS = vec_add(line3BS, line7BS);
> -        line7CS = vec_sub(line3BS, line7BS);
> -
>          vsum = vec_sum4s(vec_abs(line0CS), vsum);
>          vsum = vec_sum4s(vec_abs(line1CS), vsum);
>          vsum = vec_sum4s(vec_abs(line2CS), vsum);

Is it OK to move all the "register" initializations to the top when the
values are not needed immediately? Won't that stress the compiler a bit
and make it do nasty things with the stack? Maybe it's smart enough, but
I would guess this wasn't tested.
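
To make the question concrete, here is a rough scalar sketch of the two
shapes being compared. It is not the AltiVec code: plain int stands in for
"vector signed short", and the names sum_early, sum_late and self_check are
made up for illustration. sum_early initializes the second set of
intermediates at declaration time, before the first set is consumed, as the
patch does; sum_late assigns them right before use, as the old code did.
Both return the same value, so the only open question is what the compiler
does with the longer live ranges.

```c
#include <assert.h>
#include <stdlib.h>

/* Patch-style: every intermediate is initialized where it is declared,
 * so the d* values are live while the first sum is accumulated. */
static int sum_early(const int t[4], const int s[4])
{
    /* first set of butterflies */
    const int a0 = t[0] + t[1], a1 = t[0] - t[1];
    const int a2 = t[2] + t[3], a3 = t[2] - t[3];
    const int b0 = a0 + a2, b2 = a0 - a2;
    const int b1 = a1 + a3, b3 = a1 - a3;

    /* second set, computed well before it is needed */
    const int c0 = s[0] + s[1], c1 = s[0] - s[1];
    const int c2 = s[2] + s[3], c3 = s[2] - s[3];
    const int d0 = c0 + c2, d2 = c0 - c2;
    const int d1 = c1 + c3, d3 = c1 - c3;

    int sum = abs(b0) + abs(b1) + abs(b2) + abs(b3);
    sum += abs(d0) + abs(d1) + abs(d2) + abs(d3);
    return sum;
}

/* Old style: the second set is declared up front but only assigned
 * after the first set has been summed. */
static int sum_late(const int t[4], const int s[4])
{
    int c0, c1, c2, c3, d0, d1, d2, d3;

    const int a0 = t[0] + t[1], a1 = t[0] - t[1];
    const int a2 = t[2] + t[3], a3 = t[2] - t[3];
    const int b0 = a0 + a2, b2 = a0 - a2;
    const int b1 = a1 + a3, b3 = a1 - a3;

    int sum = abs(b0) + abs(b1) + abs(b2) + abs(b3);

    c0 = s[0] + s[1]; c1 = s[0] - s[1];
    c2 = s[2] + s[3]; c3 = s[2] - s[3];
    d0 = c0 + c2;     d2 = c0 - c2;
    d1 = c1 + c3;     d3 = c1 - c3;

    sum += abs(d0) + abs(d1) + abs(d2) + abs(d3);
    return sum;
}

/* Sanity check: both orderings must agree on the result. */
static void self_check(void)
{
    const int t[4] = { 1, 2, 3, 4 };
    const int s[4] = { 5, 6, 7, 8 };
    assert(sum_early(t, s) == sum_late(t, s));
}
```

An optimizing compiler is free to reorder these independent computations
either way, so the two functions may well end up as identical code. The
sketch does not prove that for the real function, which keeps far more
vectors live. Building both versions of dsputil_altivec.c with -S and
comparing the stack loads and stores in hadamard8_diff16x8_altivec would
settle it.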

[...]

-- 
Clément B.