[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

Timo Rothenpieler timo at rothenpieler.org
Fri Sep 2 12:12:56 EEST 2016


> Just sticking my head above the parapet, but shouldn’t things like...
> 
>> +            for (x = 0; x < c->srcW / 2; x++) {
>> +                dstUV[x*2  ] = src[1][x] << 6;
>> +                dstUV[x*2+1] = src[2][x] << 6;
>> +            }
> 
> …be more efficiently written as...
> 
> uint16_t* tdstUV = dstUV;
> uint16_t* tsrc1 = src[1];
> uint16_t* tsrc2 = src[2];
> for (x = c->srcW / 2; x > 0; x--) {
>     *tdstUV++ = *tsrc1++ << 6;
>     *tdstUV++ = *tsrc2++ << 6;
> }
> 
> …or is that really old-school and a modern compiler does all that when optimising?
> 
> Or is readability considered more important than marginal gains in performance?
> 
> Oliver (time travelling from the 1980s)

You would still have to skip the rest of the stride at the end of each row.
The linesize is usually larger than the width, so that every line starts
properly aligned.

So with your code, you'd still need something like

dstUV += dstStride[1] / 2 - 2 * x;
src[1] += srcStride[1] / 2 - x;
src[2] += srcStride[1] / 2 - x;

after it.
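
To make the stride point concrete, here is a rough standalone sketch of both loop styles. It is not the code from the patch: the function names, the parameter names and the assumption that U and V share one source stride are made up for illustration. Strides are given in bytes, as FFmpeg linesizes are, hence the division by 2 for 16-bit samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indexed form: the row pointers only move by one full stride per row. */
static void interleave_uv_indexed(uint16_t *dst, ptrdiff_t dst_stride,
                                  const uint16_t *u, const uint16_t *v,
                                  ptrdiff_t src_stride,
                                  int chroma_w, int chroma_h)
{
    /* Strides are byte counts and must cover whole 16-bit samples. */
    assert(dst_stride % 2 == 0 && src_stride % 2 == 0);
    for (int y = 0; y < chroma_h; y++) {
        for (int x = 0; x < chroma_w; x++) {
            dst[2 * x]     = (uint16_t)(u[x] << 6);
            dst[2 * x + 1] = (uint16_t)(v[x] << 6);
        }
        dst += dst_stride / 2;
        u   += src_stride / 2;
        v   += src_stride / 2;
    }
}

/* Pointer-walking form: the inner loop already advanced the pointers by
 * the visible width, so only the padding that remains is skipped. */
static void interleave_uv_walking(uint16_t *dst, ptrdiff_t dst_stride,
                                  const uint16_t *u, const uint16_t *v,
                                  ptrdiff_t src_stride,
                                  int chroma_w, int chroma_h)
{
    assert(dst_stride % 2 == 0 && src_stride % 2 == 0);
    for (int y = 0; y < chroma_h; y++) {
        for (int x = chroma_w; x > 0; x--) {
            *dst++ = (uint16_t)(*u++ << 6);
            *dst++ = (uint16_t)(*v++ << 6);
        }
        /* The loop counter is 0 here, so use the width, not x. */
        dst += dst_stride / 2 - 2 * chroma_w;
        u   += src_stride / 2 - chroma_w;
        v   += src_stride / 2 - chroma_w;
    }
}
```

Both functions should write the same samples and leave the padding at the end of each destination row untouched. Whether one is measurably faster than the other depends on the compiler and would have to be benchmarked.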

