[FFmpeg-devel] [PATCH v3 2/2] swscale/input: clip rgbf32 values before lrintf
James Almer
jamrial at gmail.com
Mon Nov 15 21:52:08 EET 2021
On 11/15/2021 12:29 PM, Michael Niedermayer wrote:
> On Sun, Nov 14, 2021 at 10:22:21PM -0800, mindmark at gmail.com wrote:
>> From: Mark Reid <mindmark at gmail.com>
>>
>> if the float pixel * 65535.0f > 2147483647.0f
>> lrintf may overflow and return negative values, depending on the implementation.
>> the results for nan and +/-inf inputs may also be implementation-defined
>>
>> clip the value first so lrintf always works.
>>
>> values < 0.0f, -inf, nan = 0.0f
>> values > 65535.0f, +inf = 65535.0f
>>
>> old timings
>> 195960 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
>> 186120 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
>> 188645 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
>> 183625 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
>> 181157 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
>> 177533 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
>> 175689 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips
>>
>> 232960 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
>> 221380 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
>> 216640 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
>> 213505 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
>> 211558 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
>> 210596 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
>> 210202 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips
>>
>> 161680 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
>> 153540 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
>> 148255 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
>> 140600 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
>> 132935 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
>> 128531 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
>> 140933 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips
>>
>> 190980 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
>> 176080 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
>> 167980 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
>> 164685 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
>> 162751 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
>> 162404 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
>> 167849 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
>>
>> new timings
>> 183320 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
>> 175700 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
>> 179570 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
>> 172932 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
>> 168707 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
>> 165224 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
>> 163423 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips
>>
>> 184940 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
>> 185150 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
>> 185790 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
>> 185472 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
>> 185277 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
>> 185813 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
>> 185332 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips
>>
>> 145400 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
>> 145100 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
>> 143490 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
>> 136687 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
>> 131271 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
>> 128698 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
>> 127170 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips
>>
>> 156020 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
>> 146990 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
>> 142020 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
>> 141052 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
>> 138973 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
>> 138027 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
>> 143939 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
>
> LGTM
>
> thx
Applied.