[FFmpeg-cvslog] altivec: perform an explicit unaligned load

Kostya Shishkov git at videolan.org
Fri Aug 16 14:29:37 CEST 2013


ffmpeg | branch: master | Kostya Shishkov <kostya.shishkov at gmail.com> | Wed Aug 14 15:28:05 2013 -0400| [f399e406af0c8507bb3ab7b94995ad7b8f409093] | committer: Martin Storsjö

altivec: perform an explicit unaligned load

Implicit vector loads (dereferencing a vector pointer) on POWER7
hardware can be compiled to VSX instructions instead of classic
Altivec/VMX ones. Force a VMX load here by using
vec_unaligned_load().

Signed-off-by: Martin Storsjö <martin at martin.st>

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=f399e406af0c8507bb3ab7b94995ad7b8f409093
---

 libavcodec/ppc/int_altivec.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/libavcodec/ppc/int_altivec.c b/libavcodec/ppc/int_altivec.c
index 8357ca7..38ec99b 100644
--- a/libavcodec/ppc/int_altivec.c
+++ b/libavcodec/ppc/int_altivec.c
@@ -84,14 +84,12 @@ static int32_t scalarproduct_int16_altivec(const int16_t *v1, const int16_t *v2,
 {
     int i;
     LOAD_ZERO;
-    const vec_s16 *pv;
     register vec_s16 vec1;
     register vec_s32 res = vec_splat_s32(0), t;
     int32_t ires;
 
     for(i = 0; i < order; i += 8){
-        pv = (const vec_s16*)v1;
-        vec1 = vec_perm(pv[0], pv[1], vec_lvsl(0, v1));
+        vec1 = vec_unaligned_load(v1);
         t = vec_msum(vec1, vec_ld(0, v2), zero_s32v);
         res = vec_sums(t, res);
         v1 += 8;
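
For readers without PowerPC hardware, the load sequence touched by this hunk can be modelled in portable C. The sketch below is only an emulation under stated assumptions: it assumes vec_unaligned_load() expands to the usual "two aligned loads plus permute" idiom, and that a VSX-style implicit load does not truncate the address the way the VMX lvx instruction does. All emu_* names are hypothetical and exist only for this illustration; none of them are part of FFmpeg.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for vec_ld(off, p): lvx ignores the low 4 address bits. */
static void emu_vec_ld(int off, const uint8_t *p, uint8_t out[16])
{
    assert(off >= 0 && off < 16);   /* this sketch only models small offsets */
    uintptr_t addr = ((uintptr_t)p + (uintptr_t)off) & ~(uintptr_t)15;
    memcpy(out, (const uint8_t *)addr, 16);
}

/* Stand-in for vec_lvsl(0, p): bytes {s, s+1, ..., s+15}, s = address & 15. */
static void emu_vec_lvsl(const uint8_t *p, uint8_t out[16])
{
    unsigned s = (unsigned)((uintptr_t)p & 15);
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)(s + i);
}

/* Stand-in for vec_perm(a, b, sel): select bytes from the 32-byte pair a:b. */
static void emu_vec_perm(const uint8_t a[16], const uint8_t b[16],
                         const uint8_t sel[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned k = sel[i] & 31;
        out[i] = k < 16 ? a[k] : b[k - 16];
    }
}

/* Assumed shape of the explicit unaligned load: two truncating loads + permute. */
static void emu_unaligned_load(const uint8_t *p, uint8_t out[16])
{
    uint8_t lo[16], hi[16], sel[16];
    emu_vec_ld(0, p, lo);
    emu_vec_ld(15, p, hi);
    emu_vec_lvsl(p, sel);
    emu_vec_perm(lo, hi, sel, out);
}

/* Model of the removed code, vec_perm(pv[0], pv[1], vec_lvsl(0, v1)).
 * truncates != 0: pv[i] behaves like lvx; otherwise it reads at the exact
 * address (assumed VSX-like behaviour). */
static void emu_old_sequence(const uint8_t *p, int truncates, uint8_t out[16])
{
    uint8_t a[16], b[16], sel[16];
    if (truncates) {
        emu_vec_ld(0, p, a);
        emu_vec_ld(0, p + 16, b);
    } else {
        memcpy(a, p, 16);
        memcpy(b, p + 16, 16);
    }
    emu_vec_lvsl(p, sel);
    emu_vec_perm(a, b, sel, out);
}

/* Fill a 16-byte-aligned buffer with distinct byte values. */
static void emu_fill(uint8_t *buf, int n)
{
    for (int i = 0; i < n; i++)
        buf[i] = (uint8_t)(i * 7 + 1);
}

/* Returns 1 if the explicit load yields the right bytes at every misalignment. */
int emu_self_check(void)
{
    _Alignas(16) uint8_t buf[48];
    emu_fill(buf, 48);
    for (int s = 0; s < 16; s++) {
        uint8_t got[16];
        emu_unaligned_load(buf + s, got);
        if (memcmp(got, buf + s, 16) != 0)
            return 0;
    }
    return 1;
}

/* Returns 1 if the removed sequence yields the right bytes at misalignment s. */
int emu_old_sequence_ok(int s, int truncates)
{
    _Alignas(16) uint8_t buf[48];
    uint8_t got[16];
    emu_fill(buf, 48);
    emu_old_sequence(buf + s, truncates, got);
    return memcmp(got, buf + s, 16) == 0;
}
```

In this model the removed sequence is correct only while the pointer dereference truncates the address. Once the dereference reads at the exact address, the permute shifts the data a second time and returns the wrong bytes for any misaligned v1, which is consistent with the motivation given in the commit message.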


