[FFmpeg-devel] [PATCH] Add non native 16bits RGB/BGR output support to libswscale

Michael Niedermayer michaelni
Thu Aug 13 01:56:43 CEST 2009


On Thu, Aug 13, 2009 at 12:26:43AM +0200, Alexis Ballier wrote:
> >> Per subject. C version only as I'm really not sure how to deal with
> >> the mmx versions; suggestions are very welcome.
> >
> > bswap16 their output?
> >
> > we already have quite a bit of code, bloat for obscure formats seems
> > silly, and non native endian formats seem obscure ...
> 
> For me non-native endian formats have a real point: they allow encoders to
> specify what endianness they want. I've been playing with an mmx
> version since and have a (maybe not optimal) version that depends on
> this patch that I'll submit later and leave to your judgement; it
> uses 6 instructions more than the native endian format for 8 pixels, thus
> 2 fewer than the best we might expect from byteswapping the output
> (or from what seems to be commonly done in current encoders: using
> AV_WB/L16 and an if() inside the output loop). The real gain, of
> course, is that we use the full mmx converter to get the raw frame
> instead of the C version.
> 
> >> @@ -909,8 +917,8 @@
> >>              dest+=6;\
> >>          }\
> >>          break;\
> >> -    case PIX_FMT_RGB565:\
> >> -    case PIX_FMT_BGR565:\
> >> +    case PIX_FMT_NE(RGB565BE,RGB565LE):\
> >> +    case PIX_FMT_NE(BGR565BE,BGR565LE):\
> >>          {\
> >>              const int dr1= dither_2x2_8[y&1    ][0];\
> >>              const int dg1= dither_2x2_4[y&1    ][0];\
> >> @@ -924,10 +932,25 @@
> >>              }\
> >>          }\
> >>          break;\
> >> -    case PIX_FMT_RGB555:\
> >> -    case PIX_FMT_BGR555:\
> >> +    case PIX_FMT_NE(RGB565LE,RGB565BE):\
> >> +    case PIX_FMT_NE(BGR565LE,BGR565BE):\
> >
> > pointless?
> 
> Somehow I was sure you wouldn't like it. It was there to preserve the
> symmetry and, in the long term, perhaps allow deprecating the formats
> that depend on machine endianness. Enclosed is an updated patch.

one can maintain the symmetry by providing a clear name for the non native,
byteswapped format.
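
To make the naming suggestion concrete, here is a minimal sketch of a
selector macro pair: one that picks the native-endian variant and one that
names the opposite-endian (byteswapped) variant explicitly, so the case
labels do not need swapped arguments. All DEMO_* and demo_* names are
hypothetical and exist only for illustration; the byte-order detection
relies on GCC/Clang predefined macros and assumes little-endian when they
are absent.

```c
/* Hypothetical stand-ins for the real pixel format enum entries. */
enum DemoPixFmt {
    DEMO_FMT_RGB565BE,
    DEMO_FMT_RGB565LE
};

/* Compile-time host byte order check (GCC/Clang predefined macros).
 * Assumption: if the macros are missing, the host is little-endian. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#   define DEMO_NE(be, le) DEMO_FMT_##be   /* native-endian variant   */
#   define DEMO_OE(be, le) DEMO_FMT_##le   /* opposite-endian variant */
#else
#   define DEMO_NE(be, le) DEMO_FMT_##le
#   define DEMO_OE(be, le) DEMO_FMT_##be
#endif

/* Returns 1 when the requested format needs its 16-bit words swapped
 * after being produced in host byte order. */
static int demo_needs_bswap(enum DemoPixFmt fmt)
{
    switch (fmt) {
    case DEMO_OE(RGB565BE, RGB565LE):
        return 1;
    default:
        return 0;
    }
}
```

With such a pair, both macros take their arguments in the same (be, le)
order, and the non-native case reads as what it is.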

[...]
> @@ -924,6 +932,21 @@
>              }\
>          }\
>          break;\
> +    case PIX_FMT_NE(RGB565LE,RGB565BE):\
> +    case PIX_FMT_NE(BGR565LE,BGR565BE):\
> +        {\
> +            const int dr1= dither_2x2_8[y&1    ][0];\
> +            const int dg1= dither_2x2_4[y&1    ][0];\
> +            const int db1= dither_2x2_8[(y&1)^1][0];\
> +            const int dr2= dither_2x2_8[y&1    ][1];\
> +            const int dg2= dither_2x2_4[y&1    ][1];\
> +            const int db2= dither_2x2_8[(y&1)^1][1];\
> +            func(uint16_t,0)\
> +                ((uint16_t*)dest)[i2+0]= bswap_16(r[Y1+dr1] + g[Y1+dg1] + b[Y1+db1]);\
> +                ((uint16_t*)dest)[i2+1]= bswap_16(r[Y2+dr2] + g[Y2+dg2] + b[Y2+db2]);\
> +            }\
> +        }\
> +        break;\
>      case PIX_FMT_RGB555:\
>      case PIX_FMT_BGR555:\
>          {\
> @@ -939,6 +962,21 @@
>              }\
>          }\
>          break;\
> +    case PIX_FMT_NE(RGB555LE,RGB555BE):\
> +    case PIX_FMT_NE(BGR555LE,BGR555BE):\
> +        {\
> +            const int dr1= dither_2x2_8[y&1    ][0];\
> +            const int dg1= dither_2x2_8[y&1    ][1];\
> +            const int db1= dither_2x2_8[(y&1)^1][0];\
> +            const int dr2= dither_2x2_8[y&1    ][1];\
> +            const int dg2= dither_2x2_8[y&1    ][0];\
> +            const int db2= dither_2x2_8[(y&1)^1][1];\
> +            func(uint16_t,0)\
> +                ((uint16_t*)dest)[i2+0]= bswap_16(r[Y1+dr1] + g[Y1+dg1] + b[Y1+db1]);\
> +                ((uint16_t*)dest)[i2+1]= bswap_16(r[Y2+dr2] + g[Y2+dg2] + b[Y2+db2]);\
> +            }\
> +        }\
> +        break;\
>      case PIX_FMT_RGB8:\
>      case PIX_FMT_BGR8:\
>          {\

by how much do these increase the object size?
and how much would an if() in the inner loop of the rgb565/555 formats add?
also, what are the speed effects of these 2 options?

sws is becoming quite bloated with all these functions and i would like to
avoid future bloat

also, as a 3rd alternative, what's the speed with a separate bswap pass?
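
For reference, the three alternatives being compared can be sketched roughly
as below. This is not the swscale code: the demo_* names are hypothetical,
dithering and the lookup tables are left out, and the input is assumed to be
packed 8-bit R,G,B triplets. It only shows the shape of each option; the
object size and speed questions above still need real measurements.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable 16-bit byte swap, a stand-in for bswap_16(). */
static inline uint16_t demo_bswap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Pack 8-bit components into one RGB565 word in host byte order. */
static inline uint16_t demo_pack565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Option 1: a dedicated output loop for the byteswapped format
 * (what the patch does; costs one more copy of the loop). */
static void demo_out565_swapped(uint16_t *dst, const uint8_t *rgb, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = demo_bswap16(demo_pack565(rgb[3*i], rgb[3*i+1], rgb[3*i+2]));
}

/* Option 2: a single loop with an if() on a loop-invariant flag. */
static void demo_out565_branch(uint16_t *dst, const uint8_t *rgb, size_t n,
                               int swap)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t v = demo_pack565(rgb[3*i], rgb[3*i+1], rgb[3*i+2]);
        if (swap)
            v = demo_bswap16(v);
        dst[i] = v;
    }
}

/* Option 3: run the native loop (option 2 with swap == 0, or any existing
 * optimized converter), then byteswap the line in a separate pass. */
static void demo_bswap16_buf(uint16_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = demo_bswap16(buf[i]);
}
```

Option 3 is the only one that leaves the existing output loops untouched,
at the price of a second pass over the output line.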

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Everything should be made as simple as possible, but not simpler.
-- Albert Einstein