[FFmpeg-devel] [PATCHv3] checkasm/lpc: test compute_autocorr

James Almer jamrial at gmail.com
Tue May 28 05:55:07 EEST 2024


On 5/27/2024 9:22 PM, James Almer wrote:
> On 5/27/2024 4:15 PM, James Almer wrote:
>> On 5/27/2024 4:10 PM, James Almer wrote:
>>> On 5/27/2024 1:01 PM, Rémi Denis-Courmont wrote:
>>>> ---
>>>> Changes since v2:
>>>> - Scale the error factor to length since this computes sums.
>>>> - Check the last element from results.
>>>> - Use fixed vector size for benchmarks.
>>>>
>>>> ---
>>>>   tests/checkasm/lpc.c | 51 +++++++++++++++++++++++++++++++++++++++++---
>>>>   1 file changed, 48 insertions(+), 3 deletions(-)
>>>
>>> checkasm: using random seed 883526087
>>> checkasm: bench runs 1024 (1 << 10)
>>> SSE2:
>>>   - lpc.apply_welch_window_even [OK]
>>>   - lpc.apply_welch_window_odd  [OK]
>>> 8:  666.011902576448 -  665.600444506565 =  0.411458069884
>>>     autocorr_8_sse2 (lpc.c:88)
>>>   - lpc.compute_autocorr        [FAILED]
>>
>> The following fixes it:
>>
>>> diff --git a/libavcodec/x86/lpc_init.c b/libavcodec/x86/lpc_init.c
>>> index f2fca53799..9f41639feb 100644
>>> --- a/libavcodec/x86/lpc_init.c
>>> +++ b/libavcodec/x86/lpc_init.c
>>> @@ -99,6 +99,15 @@ static void lpc_compute_autocorr_sse2(const double *data, ptrdiff_t len, int lag
>>>              );
>>>          }
>>>      }
>>> +
>>> +    if(j==lag){
>>> +        double sum = 1.0;
>>> +        for(int i=j-1; i<len; i+=2){
>>> +            sum += data[i  ] * data[i-j  ]
>>> +                 + data[i+1] * data[i-j+1];
>>> +        }
>>> +        autoc[j] = sum;
>>> +    }
>>>  }
>>>
>>>  #endif /* HAVE_SSE2_INLINE */
>>
>> So the SSE2 version is effectively broken, and ideally should be 
>> ported to nasm as it's fixed.
> 
> Actually, that only fixes setting the last value. There are still 
> failures in random places using several different seeds.

So, with the change above applied to set the last output value, the 
failures only happen with odd input len values.
Maybe add a test for both even and odd values, same as we do for 
apply_welch_window, and disable the odd one until the sse2 function is fixed.
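
To illustrate the even/odd split, here is a rough sketch of what such a 
check could compare against. It is not the checkasm code from the patch: 
ref_autocorr, autocorr_matches and eps_per_sample are hypothetical names, 
and the reference is the plain textbook sum (starting from 1.0, as in the 
diff above), so it does not reproduce the exact summation order or buffer 
handling of the FFmpeg C function. The tolerance is scaled by len, as the 
v3 changelog describes.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical scalar reference, not FFmpeg's implementation:
 * autoc[j] = 1.0 + sum over i in [j, len) of data[i] * data[i - j],
 * for j = 0..lag inclusive, i.e. lag + 1 outputs. */
static void ref_autocorr(const double *data, ptrdiff_t len, int lag,
                         double *autoc)
{
    for (int j = 0; j <= lag; j++) {
        double sum = 1.0;
        for (ptrdiff_t i = j; i < len; i++)
            sum += data[i] * data[i - j];
        autoc[j] = sum;
    }
}

/* Compare all lag + 1 outputs, including the last one, using a
 * tolerance that grows with len because each output is a sum of
 * up to len products. Returns 1 on match, 0 on mismatch. */
static int autocorr_matches(const double *ref, const double *out, int lag,
                            ptrdiff_t len, double eps_per_sample)
{
    double tol = eps_per_sample * (double)len;

    for (int j = 0; j <= lag; j++)
        if (fabs(ref[j] - out[j]) > tol)
            return 0;
    return 1;
}
```

Running this once with an even len and once with an odd len, and reporting 
the two cases separately, would let the odd one be disabled on its own.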

I guess odd values are never really used in actual encoding scenarios, 
seeing how FATE is unaffected.

