[FFmpeg-devel] Re: [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow
Wu Jianhua
toqsxw at outlook.com
Thu May 30 19:31:57 EEST 2024
Ronald S. Bultje:
> From: Ronald S. Bultje <rsbultje at gmail.com>
> Sent: May 29, 2024 13:56
> To: Wu Jianhua
> Cc: FFmpeg development discussions and patches; Nuo Mi; James Almer
> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow
>
> Hi,
>
> On Wed, May 29, 2024 at 3:44 PM Wu Jianhua <toqsxw at outlook.com> wrote:
> Ronald S. Bultje:
>> On Wed, May 29, 2024 at 11:38 AM <toqsxw at outlook.com> wrote:
>> +%else
>> + vpunpcklqdq m11, m2, m2
>> + vpunpckhqdq m12, m2, m2
>> + vpunpcklwd m11, m11, m14
>> + vpunpcklwd m12, m12, m14
>> + paddd m0, m11
>> + paddd m1, m12
>> + packssdw m0, m0, m1
>> +%endif
>
> [..]
> > Also, the whole thing just emulates a saturated add. Can't you use paddsw instead of paddw and be done with it? To add to Andreas' question: is saturating here normatively required?
>
> > No sample has failed because of this issue; only the checksum fails with specific seeds. I think we can leave it unchanged until a real sample shows a problem.
>
> @Nuomi to get more details.
>
> I think "just" replacing paddw with paddsw is correct, since the input pixels are 12-bit (so they could be either unsigned or signed), the filtered output is the result of packssdw (so signed words), and the desired output is 12-bit pixels anyway, anything greater than that is clipped to 12-bit range. So to me, it seems paddsw is a cheaper way to accomplish the same thing.
>
> Ronald
Hi Ronald,
Yes, it does. I've tested paddsw and everything works well. It should be the cheaper way, with minimal performance loss.
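For anyone following the argument, here is a scalar C sketch (not FFmpeg code) of why the two forms agree for 12-bit pixels. It assumes that m14 in the quoted patch is zero, so the unpack zero-extends the pixel words, and that the result is finally clipped to the pixel range. All function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturation to 16 bits, as packssdw does per element. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Final clip to the pixel range [0, 2^bit_depth - 1]. */
static int clip_pixel(int v, int bit_depth)
{
    int max = (1 << bit_depth) - 1;
    return v < 0 ? 0 : v > max ? max : v;
}

/* Model of the quoted %else branch (assumption: m14 == 0):
 * zero-extend the pixel, add in 32 bits, then pack with saturation. */
static int16_t add_widened(int32_t sum, uint16_t pixel)
{
    return sat16(sum + (int32_t)pixel);
}

/* Model of the suggestion: pack the filter sum first, then do one
 * saturating word add (paddsw). Assumes pixel <= INT16_MAX. */
static int16_t add_saturating(int32_t sum, uint16_t pixel)
{
    return sat16((int32_t)sat16(sum) + (int32_t)pixel);
}

/* Model of the original paddw: the word add wraps modulo 2^16. */
static int16_t add_wrapping(int32_t sum, uint16_t pixel)
{
    uint16_t u = (uint16_t)((uint16_t)sat16(sum) + pixel);
    return (int16_t)(u >= 0x8000 ? (int32_t)u - 0x10000 : (int32_t)u);
}

/* Count inputs where the widened and saturating forms differ after
 * the final clip. The sum range is sampled, not exhaustive. */
static long count_mismatches(int bit_depth)
{
    long mismatches = 0;
    for (int32_t sum = -70000; sum <= 70000; sum += 31) {
        for (int pixel = 0; pixel < (1 << bit_depth); pixel++) {
            int a = clip_pixel(add_widened(sum, (uint16_t)pixel), bit_depth);
            int b = clip_pixel(add_saturating(sum, (uint16_t)pixel), bit_depth);
            if (a != b)
                mismatches++;
        }
    }
    return mismatches;
}

/* Sanity check for the 12-bit case discussed in this thread. */
static void self_check(void)
{
    assert(count_mismatches(12) == 0);
}
```

Calling self_check() from a main() passes for the 12-bit case. The two forms can differ before the clip (for a sum of -40000 and a pixel of 4095, the widened form gives -32768 and the saturating form gives -28673), but both clip to 0. The wrapping add is the one that goes wrong: for a sum of 40000 and a pixel of 100 it yields -32669, which clips to 0 instead of 4095.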
I have sent v2.
Thanks for this.
Jianhua