[FFmpeg-devel] [RFC][PATCH] swscale: NEON optimized unscaled rgba to nv12 conversion
Michael Niedermayer
michaelni at gmx.at
Wed Dec 4 12:54:35 CET 2013
On Wed, Dec 04, 2013 at 02:57:30PM +0800, Yu Xiaolei wrote:
> Added copyright headers.
>
> RGB2YUV coeffs are loaded from SwsContext, but signs are still hardcoded.
> If this is not acceptable, I will rewrite it using multiply by scalar
> at the cost of several more instructions (widening operations) per
> 16x2 block.
>
> Conversion is done in unsigned 16bit math. There will be rounding
> errors compared to c implementation.
Then SWS_ACCURATE_RND should be checked before using the NEON code.
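A minimal sketch of such a check, not taken from the patch. ExampleContext, EXAMPLE_ACCURATE_RND and example_can_use_neon() are hypothetical stand-ins for SwsContext, the SWS_ACCURATE_RND flag from swscale.h and wherever the unscaled converter gets selected; the sketch assumes the flag is a single bit in an integer flags field:

```c
#include <stdint.h>

/* Stand-in for the libswscale flag; the real definition lives in swscale.h. */
#define EXAMPLE_ACCURATE_RND 0x40000

/* Hypothetical, reduced context: only the field this check needs. */
typedef struct ExampleContext {
    int flags;
} ExampleContext;

/* Returns 1 if the approximate NEON path may be selected, 0 if the caller
 * asked for accurate rounding and the C path has to be kept. */
static int example_can_use_neon(const ExampleContext *c)
{
    return !(c->flags & EXAMPLE_ACCURATE_RND);
}
```

The selection code would then install the NEON wrapper only when example_can_use_neon() returns nonzero, and otherwise leave the C converter in place.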
[...]
> +static void get_rgb2yuv_table(SwsContext *context, uint8_t dst[9]) {
> + int32_t *src = context->input_rgb2yuv_table;
> +
> + dst[RY_IDX] = RSHIFT(src[RY_IDX], RGB2YUV_SHIFT - 8);
> + dst[GY_IDX] = RSHIFT(src[GY_IDX], RGB2YUV_SHIFT - 8);
> + dst[BY_IDX] = RSHIFT(src[BY_IDX], RGB2YUV_SHIFT - 8);
> + dst[RU_IDX] = - RSHIFT(src[RU_IDX], RGB2YUV_SHIFT - 8);
> + dst[GU_IDX] = - RSHIFT(src[GU_IDX], RGB2YUV_SHIFT - 8);
> + dst[BU_IDX] = RSHIFT(src[BU_IDX], RGB2YUV_SHIFT - 8);
> + dst[RV_IDX] = RSHIFT(src[RV_IDX], RGB2YUV_SHIFT - 8);
> + dst[GV_IDX] = - RSHIFT(src[GV_IDX], RGB2YUV_SHIFT - 8);
> + dst[BV_IDX] = - RSHIFT(src[BV_IDX], RGB2YUV_SHIFT - 8);
> +}
> +
> +static int rgbx_to_nv12_neon_wrapper(SwsContext *context, const uint8_t *src[],
> + int srcStride[], int srcSliceY, int srcSliceH,
> + uint8_t *dst[], int dstStride[]) {
> + uint8_t table[9];
> +
> + int src_pixel_width = srcStride[0] / 4;
> + int y_pixel_width = dstStride[0];
> + int c_pixel_width = dstStride[1] / 2;
> +
> + int aligned_width = FFALIGN(context->srcW, 16);
> + int width;
> +
> + if (aligned_width <= src_pixel_width
> + && aligned_width <= y_pixel_width
> + && aligned_width <= c_pixel_width) {
> + width = aligned_width;
> + } else {
> + width = context->srcW;
> + }
> +
> + get_rgb2yuv_table(context, table);
This could be done at init time.
Note that the XY_IDX entries are reserved for the C implementation and
should not be overwritten.
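One possible shape for this, as a sketch under assumptions rather than the actual fix: the 8-bit coefficients are computed once and kept in a separate array, so the table used by the C code is only read. ExampleContext, neon_coeffs, EXAMPLE_SHIFT and the helper names are hypothetical; EXAMPLE_SHIFT stands in for RGB2YUV_SHIFT and is assumed to be 15, and the magnitudes are stored because the patch hardcodes the signs in the NEON code.

```c
#include <stdint.h>

#define EXAMPLE_SHIFT     15  /* assumed fixed-point precision of the source table */
#define EXAMPLE_NB_COEFFS 9

/* Hypothetical, reduced context. */
typedef struct ExampleContext {
    int32_t input_rgb2yuv_table[EXAMPLE_NB_COEFFS]; /* read only here */
    uint8_t neon_coeffs[EXAMPLE_NB_COEFFS];         /* private copy for NEON */
} ExampleContext;

/* Round-to-nearest right shift of a non-negative value. */
static int example_rshift(int v, int s)
{
    return (v + (1 << (s - 1))) >> s;
}

/* Intended to be called once from the init path, not per slice. */
static void example_init_neon_coeffs(ExampleContext *c)
{
    for (int i = 0; i < EXAMPLE_NB_COEFFS; i++) {
        int32_t v   = c->input_rgb2yuv_table[i];
        int32_t mag = v < 0 ? -v : v;   /* signs are handled by the asm */
        c->neon_coeffs[i] = (uint8_t)example_rshift(mag, EXAMPLE_SHIFT - 8);
    }
}
```

The per-slice wrapper would then pass c->neon_coeffs to the assembly routine instead of rebuilding a local table on every call.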
> +
> + av_log(context, AV_LOG_INFO, "src(%p) y(%p) chroma(%p)\n",
> + src[0], dst[0], dst[1]);
> + av_log(context, AV_LOG_INFO, "srcStride(%d) yStride(%d) cStride(%d)\n",
> + srcStride[0], dstStride[0], dstStride[1]);
Stray debug code.
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein