[FFmpeg-devel] [PATCH] swscale alpha channel support

Fri Feb 27 21:34:04 CET 2009

On Fri, Feb 27, 2009 at 06:28:38PM +0100, C?dric Schieli wrote:
> Hi all,
> 
> 
> Here is my complete patch set to bring alpha channel support in swscale.

first the things found by my evil patcheck script

missing const?
sws_scale_alpha.patch:875:+static inline void RENAME(bgr32ToA)(uint8_t *dst, uint8_t *src, long width, uint32_t *unused){
sws_scale_alpha.patch:886:+static inline void RENAME(bgr32_1ToA)(uint8_t *dst, uint8_t *src, long width, uint32_t *unused){
sws_yuva2rgb.patch:274:+static inline int RENAME(yuva420_rgb32)(SwsContext *c, uint8_t* src[], int srcStride[], int srcSliceY,

missing const (test2)?
sws_yuva2rgb.patch:274:+static inline int RENAME(yuva420_rgb32)(SwsContext *c, uint8_t* src[], int srcStride[], int srcSliceY,

missing } prior to else
sws_scale_alpha.patch:915:+        else
sws_scale_alpha.patch:923:+        else
sws_scale_alpha.patch:933:+        else
sws_scale_alpha.patch:941:+        else

Missing changelog entry (ignore if minor change)

Mergeable calls
printf(",%5lld", ssdA);
+    printf

> 
> 
> #1 : sws_yuv2rgb_factorize.patch
> Pre-factorize part of yuv420_rgb32 in yuv2rgb_template.c to prepare
> for a yuva420_rgb32 function

like always, if striped objects are the same then ok

> 
> #2 : avpicture_layout_yuva420p.patch
> Fix avpicture_layout to not chroma shift the alpha plane when
> outputting yuva420p (to be able to output with -f rawvideo -pix_fmt
> yuva420p)

comments below

> 
> #3 : sws_treat_yuva420p_as_yuv420p.patch
> Use (srcFormat==yuv420p) codepath for the (srcFormat=yuva420p) case in
> various places where it is appropriate

ok

> 
> #4 : sws_default_alpha_value.patch
> When converting from a non alpha format to an alpha format, defaults
> to all ones rather than all zeroes
> This patch introduces some (small) differences in swscale-exemple
> output for RGB32_1 and BGR32_1 (see swscale-example.log.diff) that I
> can't reproduce manually

these look rather harmless, maybe some uninitialized var or missing emms,
i doubt there is a bug in your code
anyway comments below

> 
> #5 : sws_configure_alpha.patch
> Adds --enable-swscale-alpha configure option and a isALPHA macro

comments below

> 
> #6 : sws_yuva2rgb.patch
> Unscaled yuva420p to rgb32 C/MMX converters

comments below

> 
> #7 : sws_scale_alpha.patch
> Scale the alpha channel

will review later

> 
> #8 : sws_output_yuva420p.patch
> Now that yuva420p can be scaled, it can be added to output supported format

ok

> 
> #9 : swscale-example_use_alpha.patch
> Also compute alpha channel differences in swscale-example

not reviewed

> 
> 
> One remaining issue is a strange difference on x86_64 for some cases
> (see swscale-example.x86_64.log.diff). After some debugging, it seems
> it comes from #9. I'll invetigate more on this one.

until that is solved #9 (or whatever causes it) is not acceptable
what has to be noted is that all of the differences are for flags=1
and involve either yuv410p or yuv411p

[...]

#2
> Index: ffmpeg/libavcodec/imgconvert.c
> ===================================================================
> --- ffmpeg.orig/libavcodec/imgconvert.c	2009-02-27 11:35:30.086183618 +0100
> +++ ffmpeg/libavcodec/imgconvert.c	2009-02-27 11:35:49.874181190 +0100
> @@ -721,7 +721,7 @@ int avpicture_layout(const AVPicture* sr
>                       unsigned char *dest, int dest_size)
>  {
>      const PixFmtInfo* pf = &pix_fmt_info[pix_fmt];
> -    int i, j, w, h, data_planes;
> +    int i, j, w, ow, h, oh, data_planes;
>      const unsigned char* s;
>      int size = avpicture_get_size(pix_fmt, width, height);
>  

> @@ -753,8 +753,13 @@ int avpicture_layout(const AVPicture* sr
>  
>      for (i=0; i<data_planes; i++) {
>           if (i == 1) {
> +             ow = w;
> +             oh = h;

id do this before the loop

[...]

#4
[...]
> @@ -1661,11 +1661,17 @@ static inline void RENAME(name ## _half)
>      int i;\
>      for (i=0; i<width; i++)\
>      {\
> -        int pix0= ((type*)src)[2*i+0];\
> -        int pix1= ((type*)src)[2*i+1];\
> -        int g= (pix0&maskg)+(pix1&maskg);\
> -        int b= ((pix0+pix1-g)&(maskb|(2*maskb)))>>shb;\
> -        int r= ((pix0+pix1-g)&(maskr|(2*maskr)))>>shr;\
> +        int pix0, pix1, g, b, r;\
> +        if (alpha){\
> +            pix0= ((type*)src)[2*i+0]&(maskr|maskg|maskb);\
> +            pix1= ((type*)src)[2*i+1]&(maskr|maskg|maskb);\
> +        }else{\
> +            pix0= ((type*)src)[2*i+0];\
> +            pix1= ((type*)src)[2*i+1];\
> +        }\
> +        g= (pix0&maskg)+(pix1&maskg);\
> +        b= ((pix0+pix1-g)&(maskb|(2*maskb)))>>shb;\
> +        r= ((pix0+pix1-g)&(maskr|(2*maskr)))>>shr;\

int pix0= ((type*)src)[2*i+0];\
int pix1= ((type*)src)[2*i+1];\
int g= (pix0&(maskg|maska))+(pix1&(maskg|maska));\
int b= ((pix0+pix1-g)&(maskb|(2*maskb)))>>shb;\
int r= ((pix0+pix1-g)&(maskr|(2*maskr)))>>shr;\
g &= maskg|(2*maskg)

1 & less :)

[...]
> @@ -1245,6 +1246,8 @@ static inline void RENAME(rgb15to32)(con
>          "psllq        $16, %%mm5    \n\t"
>          "por        %%mm4, %%mm3    \n\t"
>          "por        %%mm5, %%mm3    \n\t"
> +        "por        %%mm6, %%mm0    \n\t"
> +        "por        %%mm6, %%mm3    \n\t"
>          MOVNTQ"     %%mm0,  %0      \n\t"
>          MOVNTQ"     %%mm3, 8%0      \n\t"
>          :"=m"(*d)

the original code is poorly written and needs to be fixed

after the psXlq there are
0R0R0R0R
0G0G0G0G
0B0B0B0B

after 3 packuswb
0000RRRR
0000GGGG
0000BBBB
(and one constant 0000FFFF)

after 2 "punpcklbw
RGRGRGRG
BFBFBFBF

after 1 mov and 2 punpckXwd
RGBFRBF
RGBFRBF

thats 8 instrutions instead of the 17 that are there
currently
similarly the other code can be improved

[...]

#5
> Index: ffmpeg/configure
> ===================================================================
> --- ffmpeg.orig/configure	2009-02-27 10:06:03.825934776 +0100
> +++ ffmpeg/configure	2009-02-27 11:36:09.822181736 +0100

not maintained by me

> Index: ffmpeg/libswscale/swscale_internal.h
> ===================================================================
> --- ffmpeg.orig/libswscale/swscale_internal.h	2009-02-27 10:06:03.841935207 +0100
> +++ ffmpeg/libswscale/swscale_internal.h	2009-02-27 11:36:09.826184312 +0100
> @@ -273,6 +273,13 @@ const char *sws_format_name(int format);
>          || (x)==PIX_FMT_MONOBLACK   \
>          || (x)==PIX_FMT_MONOWHITE   \
>      )
> +#define isALPHA(x)      (           \
> +           (x)==PIX_FMT_BGR32       \
> +        || (x)==PIX_FMT_BGR32_1     \
> +        || (x)==PIX_FMT_RGB32       \
> +        || (x)==PIX_FMT_RGB32_1     \
> +        || (x)==PIX_FMT_YUVA420P    \
> +    )
>  
>  static inline int fmt_depth(int fmt)
>  {

ok

#6
[...]
> +#define YUV2RGBFUNC(func_name, dst_type, alpha) \
>  static int func_name(SwsContext *c, uint8_t* src[], int srcStride[], int srcSliceY, \
>                       int srcSliceH, uint8_t* dst[], int dstStride[]){\
>      int y;\
>  \

> -    if (c->srcFormat == PIX_FMT_YUV422P) {\
> +    if (!alpha && c->srcFormat == PIX_FMT_YUV422P) {\
>          srcStride[1] *= 2;\
>          srcStride[2] *= 2;\
>      }\

why?

[...]
> Index: ffmpeg/libswscale/yuv2rgb_template.c
> ===================================================================
> --- ffmpeg.orig/libswscale/yuv2rgb_template.c	2009-02-27 11:36:06.914181648 +0100
> +++ ffmpeg/libswscale/yuv2rgb_template.c	2009-02-27 15:40:20.681930923 +0100
> @@ -451,3 +451,23 @@ static inline int RENAME(yuv420_rgb32)(S
>  
>      YUV2RGB_ENDLOOP(4)
>  }
> +
> +#if CONFIG_SWSCALE_ALPHA
> +static inline int RENAME(yuva420_rgb32)(SwsContext *c, uint8_t* src[], int srcStride[], int srcSliceY,
> +                                        int srcSliceH, uint8_t* dst[], int dstStride[]){
> +    int y, h_size;
> +
> +    YUV2RGB_LOOP(4)
> +
> +        *(uint8_t **)(&c->u_temp) = src[3] + y*srcStride[3] - 2*index; /* pa-2index */
> +        YUV2RGB_INIT
> +        YUV2RGB
> +        "mov               %5, "ESP_OFFSET"(%4);" /* Backup py-2index */
> +        "mov     "U_TEMP"(%4), %5;"               /* Fetch pa-2index */
> +        "movq     (%5, %0, 2), %%mm3;"            /* Load 8 A A7 A6 A5 A4 A3 A2 A1 A0 */
> +        "mov "ESP_OFFSET"(%4), %5;"               /* Restore py-2index */
> +        RGB_PLANAR2PACKED32
> +
> +    YUV2RGB_ENDLOOP(4)
> +}
> +#endif

there are 7 registers the asm uses just 6 so this save&restore should be
avoidable even if we then have to fallback to C when HAVE_7REGS is not
set

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Awnsering whenever a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090227/0ecef7ca/attachment.pgp>