[FFmpeg-devel] [PATCH] Use av_clip_uint8 in swscale.

Frank Barchard fbarchard
Mon Aug 17 20:13:06 CEST 2009


2009/8/17 Måns Rullgård <mans at mansr.com>

> Frank Barchard <fbarchard at google.com> writes:
>
> > The table method works well on all platforms... better than if statements
> > anyway.
>
> Depends on the range of inputs.  If you want to allow the full 32-bit
> range, well...  Even a smaller range could put significant pressure on
> the cache.


In practice you know the range of values.  If you combine 3 bytes, it's 768
values.
I know if statements are becoming more efficient and memory accesses less so,
but the original code had 4 to 6 instructions and potentially 2 branches
taken per clipped value.
av_clip_uint8() can be optimized to a single instruction on most CPUs.
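
To make the comparison concrete, here is a rough sketch of the two approaches.
The -256..511 range (768 entries) and every name in it are assumptions chosen
for illustration; this is not the swscale code and not FFmpeg's actual
av_clip_uint8() implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed input range for illustration: -256..511, i.e. 768 entries. */
#define CLIP_MIN (-256)
#define CLIP_MAX 511

static uint8_t clip_table[CLIP_MAX - CLIP_MIN + 1];

/* Fill the table once, before any lookup. */
static void init_clip_table(void)
{
    for (int v = CLIP_MIN; v <= CLIP_MAX; v++)
        clip_table[v - CLIP_MIN] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Table clip: one indexed load, but only valid inside the known range. */
static uint8_t clip_uint8_table(int v)
{
    assert(v >= CLIP_MIN && v <= CLIP_MAX);
    return clip_table[v - CLIP_MIN];
}

/* Compare-based clip: works for any int. A compiler may lower this to
 * cmov pairs or to a saturating instruction where the CPU has one. */
static uint8_t clip_uint8_compare(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}
```

The table version trades 768 bytes of cache footprint for the removal of both
compares; the compare version has no range restriction at all.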


>
>
> > On x86, there is cmov, but in the above code it would take cmp, cmov, cmp,
> > cmov to do each value, whereas the table method takes one mov instruction.
>
> You're forgetting the address calculation.


movzx eax,cliptbl[eax*4]
ARM can do a shift on indices too.
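
In C, one way to keep the address calculation that cheap is to bias the table
pointer so a signed value indexes it directly, with no subtraction before the
load. This is only a sketch: the names and the -256..511 range are
hypothetical, and whether it compiles to a single indexed load depends on the
compiler and target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical backing storage for inputs in -256..511. */
static uint8_t biased_storage[768];

/* Points at the entry for input 0, so negative inputs index backwards. */
static const uint8_t *const biased_clip = biased_storage + 256;

/* Fill the table once, before any lookup. */
static void init_biased_clip(void)
{
    for (int v = -256; v <= 511; v++)
        biased_storage[v + 256] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* The caller must guarantee -256 <= v <= 511. */
static inline uint8_t clip_biased(int v)
{
    assert(v >= -256 && v <= 511);
    return biased_clip[v];
}
```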


