[FFmpeg-devel] [PATCH 1/2] avcodec: loongson optimized h264pred with mmi v2

Ronald S. Bultje rsbultje at gmail.com
Wed Aug 5 23:29:58 CEST 2015


Hi,

On Tue, Aug 4, 2015 at 8:05 AM, 周晓勇 <zhouxiaoyong at loongson.cn> wrote:

> From 71478e642fac00b12b313723ee83acdfef732fd1 Mon Sep 17 00:00:00 2001
> From: ZhouXiaoyong <zhouxiaoyong at loongson.cn>
> Date: Tue, 4 Aug 2015 16:28:02 +0800
> Subject: [PATCH 1/2] avcodec: loongson optimized h264pred with mmi v2
>
>
> Signed-off-by: ZhouXiaoyong <zhouxiaoyong at loongson.cn>
> ---
>  libavcodec/mips/h264pred_init_mips.c |   1 -
>  libavcodec/mips/h264pred_mips.h      |   7 +-
>  libavcodec/mips/h264pred_mmi.c       | 459 +++++++++++++++++------------------
>  3 files changed, 226 insertions(+), 241 deletions(-)

 [..]

> void ff_pred16x16_vertical_8_mmi(uint8_t *src, ptrdiff_t stride)
>  {
>      __asm__ volatile (
> -        "dsubu $2, %0, %1                   \r\n"
> -        "daddu $3, %0, $0                   \r\n"
> -        "ldl $4, 7($2)                      \r\n"
> -        "ldr $4, 0($2)                      \r\n"
> -        "ldl $5, 15($2)                     \r\n"
> -        "ldr $5, 8($2)                      \r\n"
> -        "dli $6, 0x10                       \r\n"
> +        "dli $8, 16                         \r\n"
> +        "gsldlc1 $f2, 7(%[srcA])            \r\n"
> +        "gsldrc1 $f2, 0(%[srcA])            \r\n"
> +        "gsldlc1 $f4, 15(%[srcA])           \r\n"
> +        "gsldrc1 $f4, 8(%[srcA])            \r\n"
>          "1:                                 \r\n"
> -        "sdl $4, 7($3)                      \r\n"
> -        "sdr $4, 0($3)                      \r\n"
> -        "sdl $5, 15($3)                     \r\n"
> -        "sdr $5, 8($3)                      \r\n"
> -        "daddu $3, %1                       \r\n"
> -        "daddiu $6, -1                      \r\n"
> -        "bnez $6, 1b                        \r\n"
> -        ::"r"(src),"r"(stride)
> -        : "$2","$3","$4","$5","$6","memory"
> +        "gssdlc1 $f2, 7(%[src])             \r\n"
> +        "gssdrc1 $f2, 0(%[src])             \r\n"
> +        "gssdlc1 $f4, 15(%[src])            \r\n"
> +        "gssdrc1 $f4, 8(%[src])             \r\n"
> +        "daddu %[src], %[src], %[stride]    \r\n"
> +        "daddi $8, $8, -1                   \r\n"
> +        "bnez $8, 1b                        \r\n"
> +        : [src]"+&r"(src)
> +        : [stride]"r"(stride),[srcA]"r"(src-stride)
> +        : "$8","$f2","$f4"
>      );
>  }
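
For reference while comparing the two versions: both the removed and the
added asm appear to do the same thing, which is to load the 16 bytes of the
row directly above the block (src - stride) and store them into each of the
16 rows starting at src. A minimal plain-C sketch of that behaviour follows.
The name pred16x16_vertical_ref is hypothetical and is not part of the patch
or of FFmpeg.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference helper, for illustration only (not in the patch). */
static void pred16x16_vertical_ref(uint8_t *src, ptrdiff_t stride)
{
    uint8_t top[16];
    int i;

    /* Read the 16 pixels of the row directly above the block. */
    memcpy(top, src - stride, 16);

    /* Write that row into each of the 16 rows of the block. */
    for (i = 0; i < 16; i++)
        memcpy(src + i * stride, top, 16);
}
```

The sketch uses memcpy so that it makes no alignment assumption about src,
which matches the unaligned load/store pairs used in both asm versions.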


So... I'm confused. You're replacing one type of optimization with
another. What happened? Was the old optimization bad? Was it written for an
older CPU type, and is yours for a newer one? Something else?

Ronald

