[FFmpeg-cvslog] x86inc: Use SSE instead of SSE2 for copying data

Henrik Gramner git at videolan.org
Tue Oct 8 11:06:50 CEST 2013


ffmpeg | branch: master | Henrik Gramner <henrik at gramner.com> | Wed Sep 11 17:49:22 2013 +0200| [63f0d623100bdb0c6081456127f4b6713e83d3db] | committer: Derek Buitenhuis

x86inc: Use SSE instead of SSE2 for copying data

Reduces code size because movaps/movups are each one byte
shorter than movdqa/movdqu: the SSE2 forms carry a mandatory
prefix byte that the SSE forms do not.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis at gmail.com>

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=63f0d623100bdb0c6081456127f4b6713e83d3db
---
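
As a rough illustration of the one-byte saving described in the commit
message, the sketch below hand-assembles a 16-byte store to [rsp + disp8]
for each of the four mnemonics and compares the lengths. It is not part of
the patch. The helper name encode_store is hypothetical, and the sketch
assumes an xmm0-xmm7 source (no REX prefix) and an 8-bit displacement;
the displacements the macro actually generates depend on stack_size and
may be larger.

```python
# Hand-assembled encodings for a 16-byte XMM store to [rsp + disp8].
# encode_store is a hypothetical helper for illustration only; it covers
# xmm0-xmm7 (no REX prefix) and a non-negative 8-bit displacement.

STORE_OPCODES = {
    "movaps": bytes([0x0F, 0x29]),        # SSE, no mandatory prefix
    "movups": bytes([0x0F, 0x11]),        # SSE, no mandatory prefix
    "movdqa": bytes([0x66, 0x0F, 0x7F]),  # SSE2, mandatory 0x66 prefix
    "movdqu": bytes([0xF3, 0x0F, 0x7F]),  # SSE2, mandatory 0xF3 prefix
}


def encode_store(mnemonic, xmm, disp8):
    """Return the machine code for `mnemonic [rsp + disp8], xmm<N>`."""
    if not 0 <= xmm <= 7:
        raise ValueError("sketch only handles xmm0-xmm7")
    if not 0 <= disp8 <= 0x7F:
        raise ValueError("sketch only handles displacements 0..127")
    modrm = 0x40 | (xmm << 3) | 0x04  # mod=01 (disp8), reg=xmm, rm=100 (SIB follows)
    sib = 0x24                        # scale=1, no index, base=rsp
    return STORE_OPCODES[mnemonic] + bytes([modrm, sib, disp8])


if __name__ == "__main__":
    for name in STORE_OPCODES:
        code = encode_store(name, 6, 0x20)
        print(f"{name:7s} [rsp+0x20], xmm6 -> {code.hex(' ')} ({len(code)} bytes)")
```

The ModRM, SIB and displacement bytes are identical across all four
instructions, so the difference comes only from the prefix. Both movaps and
movdqa copy the full 128 bits and require a 16-byte-aligned memory operand,
which is why the swap is safe for saving and restoring registers.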

 libavutil/x86/x86inc.asm |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index b6edfd9..0c27c60 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -436,7 +436,7 @@ DECLARE_REG 14, R15, 120
     %assign %%i xmm_regs_used
     %rep (xmm_regs_used-6)
         %assign %%i %%i-1
-        movdqa [rsp + (%%i-6)*16 + stack_size + (~stack_offset&8)], xmm %+ %%i
+        movaps [rsp + (%%i-6)*16 + stack_size + (~stack_offset&8)], xmm %+ %%i
     %endrep
 %endmacro
 
@@ -454,7 +454,7 @@ DECLARE_REG 14, R15, 120
         %assign %%i xmm_regs_used
         %rep (xmm_regs_used-6)
             %assign %%i %%i-1
-            movdqa xmm %+ %%i, [%1 + (%%i-6)*16+stack_size+(~stack_offset&8)]
+            movaps xmm %+ %%i, [%1 + (%%i-6)*16+stack_size+(~stack_offset&8)]
         %endrep
         %if stack_size_padded == 0
             add %1, (xmm_regs_used-6)*16+16


