[FFmpeg-devel] [PATCH] avcodec: use av_mod_intp2() where possible
James Almer
jamrial at gmail.com
Wed Apr 22 03:48:27 CEST 2015
On 21/04/15 7:49 PM, James Almer wrote:
> On 21/04/15 7:12 PM, Michael Niedermayer wrote:
>> On Tue, Apr 21, 2015 at 06:51:52PM -0300, James Almer wrote:
>>> On 21/04/15 6:38 PM, Michael Niedermayer wrote:
>>>> On Tue, Apr 21, 2015 at 05:45:25PM -0300, James Almer wrote:
>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>> diff --git a/libavcodec/ffv1.h b/libavcodec/ffv1.h
>>>>> index 5081397..bfc4d71 100644
>>>>> --- a/libavcodec/ffv1.h
>>>>> +++ b/libavcodec/ffv1.h
>>>>> @@ -143,7 +143,7 @@ static av_always_inline int fold(int diff, int bits)
>>>>> diff = (int8_t)diff;
>>>>> else {
>>>>> diff += 1 << (bits - 1);
>>>>> - diff &= (1 << bits) - 1;
>>>>> + diff = av_mod_uintp2(diff, bits);
>>>>> diff -= 1 << (bits - 1);
>>>>> }
>>>>>
>>>>
>>>> iam not sure some of these changes are are a good idea
>>>> for example above the mask has to be calculated anyway what can be
>>>> gained from av_mod_uintp2() use here ?
>>>> i think this should be bechmarked when its in performance critical code
>>>
>>> "1 << (bits - 1)" is not the same as "(1 << bits) - 1". The latter is not going to be
>>> calculated if the arch optimized version of av_mod_uintp2 is used.
>>
>> yep, i should not review patches when i am in a (IRC) meeting
>>
>> before the patch fold looks like this: (for the non 8bit case)
>> addl 100(%rsp), %ecx
>> andl 104(%rsp), %ecx
>> subl 100(%rsp), %ecx
>> .L869:
>> afterwards it looks like this:
>>
>> addl 104(%rsp), %ecx
>> andl 64(%rsp), %ecx
>> subl 104(%rsp), %ecx
>> .L869:
>>
>> so it seems ok
>>
>>
>>>
>>> The one file where i don't know if there's going to be any gain is in one of the opus_celt.c
>>> changes, where the mask needs to be calculated even if arch optimized av_mod_uintp2 is used.
>>> That one can be removed, but every other change is pretty straightforward.
>>
>> ok, then iam fine with the patch
>>
>> Thanks
>
> I'm going to apply the changes that are straightforward (single line changes that didn't use
> mask variables prior to this patch, or that created one but then used it once). av_mod_uintp2
> should generate the same assembly with the non bmi2 optimized version.
Done and pushed, thanks.
More information about the ffmpeg-devel
mailing list