[FFmpeg-devel] [RFC][PATCH] swscale: NEON optimized unscaled rgba to nv12 conversion

Michael Niedermayer michaelni at gmx.at
Wed Dec 4 12:54:35 CET 2013


On Wed, Dec 04, 2013 at 02:57:30PM +0800, Yu Xiaolei wrote:
> Added copyright headers.
> 
> RGB2YUV coeffs are loaded from SwsContext, but signs are still hardcoded.
> If this is not acceptable, I will rewrite it using multiply by scalar
> at the cost of several more instructions (widening operations) per
> 16x2 block.
> 

> Conversion is done in unsigned 16bit math. There will be rounding
> errors compared to c implementation.

then SWS_ACCURATE_RND should be checked before using the NEON code, so that
users who request accurate rounding keep the C path
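
A minimal sketch of such a check, assuming a trimmed-down context. DemoContext,
DEMO_SWS_ACCURATE_RND and demo_can_use_fast_path() are hypothetical names for
illustration only; the real flag lives in libswscale/swscale.h and the value
used here is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for SWS_ACCURATE_RND; the numeric value is an assumption. */
#define DEMO_SWS_ACCURATE_RND 0x40000

/* Hypothetical context holding only the field this sketch needs. */
typedef struct DemoContext {
    int flags;
} DemoContext;

/* Returns 1 if the approximate fast path may be selected, 0 if the
 * caller asked for accurate rounding and the C path must be kept. */
static int demo_can_use_fast_path(const DemoContext *c)
{
    assert(c != NULL);
    return !(c->flags & DEMO_SWS_ACCURATE_RND);
}
```

The function-selection code would call this once and fall through to the
generic C converter when it returns 0.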

[...]

> +static void get_rgb2yuv_table(SwsContext *context, uint8_t dst[9]) {
> +    int32_t *src = context->input_rgb2yuv_table;
> +
> +    dst[RY_IDX] = RSHIFT(src[RY_IDX], RGB2YUV_SHIFT - 8);
> +    dst[GY_IDX] = RSHIFT(src[GY_IDX], RGB2YUV_SHIFT - 8);
> +    dst[BY_IDX] = RSHIFT(src[BY_IDX], RGB2YUV_SHIFT - 8);
> +    dst[RU_IDX] = - RSHIFT(src[RU_IDX], RGB2YUV_SHIFT - 8);
> +    dst[GU_IDX] = - RSHIFT(src[GU_IDX], RGB2YUV_SHIFT - 8);
> +    dst[BU_IDX] = RSHIFT(src[BU_IDX], RGB2YUV_SHIFT - 8);
> +    dst[RV_IDX] = RSHIFT(src[RV_IDX], RGB2YUV_SHIFT - 8);
> +    dst[GV_IDX] = - RSHIFT(src[GV_IDX], RGB2YUV_SHIFT - 8);
> +    dst[BV_IDX] = - RSHIFT(src[BV_IDX], RGB2YUV_SHIFT - 8);
> +}
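
To illustrate where the rounding difference comes from, here is a rough sketch
that reduces 15-bit coefficients to 8 bits and compares the resulting luma.
Everything in it is an assumption for illustration: the coefficient values
(roughly BT.601 limited-range weights scaled by 2^15), the offset and rounding
terms, and the demo_* helpers (demo_round_shift() is not libavutil's RSHIFT).
It does not reproduce the exact arithmetic of the patch or of the C code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SHIFT 15 /* assumed precision of the full table */

/* Illustrative luma weights for R, G, B at 15-bit precision. */
static const int demo_coef15[3] = { 8414, 16519, 3208 };

/* Round-to-nearest right shift for non-negative values. */
static int demo_round_shift(int v, int s)
{
    assert(v >= 0 && s > 0);
    return (v + (1 << (s - 1))) >> s;
}

/* Luma with full-precision coefficients and a 32-bit accumulator. */
static int demo_luma_15bit(const int coef[3], int r, int g, int b)
{
    int acc = coef[0] * r + coef[1] * g + coef[2] * b;
    return (acc + (16 << DEMO_SHIFT) + (1 << (DEMO_SHIFT - 1))) >> DEMO_SHIFT;
}

/* Luma with coefficients reduced to 8 bits; the sum fits in 16 bits. */
static int demo_luma_8bit(const int coef[3], int r, int g, int b)
{
    int k0 = demo_round_shift(coef[0], DEMO_SHIFT - 8);
    int k1 = demo_round_shift(coef[1], DEMO_SHIFT - 8);
    int k2 = demo_round_shift(coef[2], DEMO_SHIFT - 8);
    uint16_t acc = (uint16_t)(k0 * r + k1 * g + k2 * b + (16 << 8) + (1 << 7));
    return acc >> 8;
}
```

With these assumed values, pure green (0, 255, 0) gives 145 on the 15-bit path
and 144 on the 8-bit path, while white gives 235 on both, so the mismatch is
small but input dependent.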

> +
> +static int rgbx_to_nv12_neon_wrapper(SwsContext *context, const uint8_t *src[],
> +                        int srcStride[], int srcSliceY, int srcSliceH,
> +                        uint8_t *dst[], int dstStride[]) {
> +    uint8_t table[9];
> +
> +    int src_pixel_width = srcStride[0] / 4;
> +    int y_pixel_width = dstStride[0];
> +    int c_pixel_width = dstStride[1] / 2;
> +
> +    int aligned_width = FFALIGN(context->srcW, 16);
> +    int width;
> +
> +    if (aligned_width <= src_pixel_width
> +            && aligned_width <= y_pixel_width
> +            && aligned_width <= c_pixel_width) {
> +        width = aligned_width;
> +    } else {
> +        width = context->srcW;
> +    }
> +

> +    get_rgb2yuv_table(context, table);

this could be done once at init time instead of on every call
note, the XY_IDX entries (RY_IDX, GY_IDX, ...) are reserved for the C
implementation and should not be overwritten
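
One possible shape for this, as a sketch only: the reduced table is computed
once into a separate field and the per-slice function takes a const context,
so the full-precision entries stay untouched. DemoContext, neon_coefs,
demo_init() and demo_slice_coefs() are hypothetical names; the real SwsContext
has no such field as far as this sketch is concerned, and the sample values
are illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_COEFS = 9 };

/* Hypothetical context: the shared full-precision table plus a
 * private 8-bit copy for the SIMD path. */
typedef struct DemoContext {
    int32_t rgb2yuv[DEMO_COEFS];    /* stand-in for input_rgb2yuv_table */
    uint8_t neon_coefs[DEMO_COEFS]; /* hypothetical field, filled at init */
} DemoContext;

/* Called once at init: store rounded 8-bit magnitudes in the private
 * copy (signs are assumed to be handled by the SIMD code) and leave
 * rgb2yuv[] exactly as it was. */
static void demo_init(DemoContext *c)
{
    assert(c != 0);
    for (int i = 0; i < DEMO_COEFS; i++) {
        int32_t v   = c->rgb2yuv[i];
        int32_t mag = v < 0 ? -v : v;
        c->neon_coefs[i] = (uint8_t)((mag + 64) >> 7); /* 15 -> 8 bits */
    }
}

/* Per-slice path: only reads the cached table; the const qualifier
 * keeps it from writing to the context. */
static const uint8_t *demo_slice_coefs(const DemoContext *c)
{
    return c->neon_coefs;
}
```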




> +
> +    av_log(context, AV_LOG_INFO, "src(%p) y(%p) chroma(%p)\n", src[0], dst[0], dst[1]);
> +    av_log(context, AV_LOG_INFO, "srcStride(%d) yStride(%d) cStride(%d)\n",
> +            srcStride[0], dstStride[0], dstStride[1]);

stray debug code


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Everything should be made as simple as possible, but not simpler.
-- Albert Einstein