[FFmpeg-devel] [PATCH v2] avcodec/arm/sbcenc: avoid callee preserved vfp registers
Martin Storsjö
martin at martin.st
Mon Sep 12 14:42:10 EEST 2022
On Sun, 25 Aug 2019, James Cowgill wrote:
> When compiling FFmpeg with GCC-9, some very random segfaults were
> observed in code which had previously called down into the SBC encoder
> NEON assembly routines. This was caused by these functions clobbering
> some of the vfp callee saved registers (d8 - d15 aka q4 - q7). GCC was
> using these registers to save local variables, but after these
> functions returned, they would contain garbage.
>
> Fix by reallocating the registers in the two affected functions in
> the following way:
> ff_sbc_analyze_4_neon: q2-q5 => q8-q11, then q1-q4 => q8-q11
> ff_sbc_analyze_8_neon: q2-q9 => q8-q15
>
> The reason for using these replacements is to keep closely related
> sets of registers consecutively numbered which hopefully makes the
> code more easy to follow. Since this commit only reallocates
> registers, it should have no performance impact.
>
> Signed-off-by: James Cowgill <jcowgill at debian.org>
> ---
>
> On 29/07/2019 19:59, Reimar Döffinger wrote:
>> Seems sensible to me, though extra points if you or someone has numbers on performance impact.
>> To know whether it would be worthwhile to check if it can be optimized...
>
> Sorry for the long delay - been on various holidays.
Sorry for the even longer response ;-) I happened to run into this patch
downstream, and noticed that it does look reasonable, but apparently the
second round of the patch was missed back then in 2019.
Our current code is indeed broken and wrong - if we would have had
checkasm tests for it, this issue would have been caught long ago.
> I did a few tests on my original patch and overall it was about 2%
> slower than before. In any case I think this new patch is a better
> solution (although the diff is a lot larger). We don't actually need
> that many registers in either of these functions, so instead of
> pushing the clobbered callee saved registers, we can reallocate all
> the registers to avoid them in the first place. This way there is no
> performance impact.
>
> I couldn't find any tests for this encoder, but I have tested a few
> audio samples with it and verified the output is identical to what t
> was before (and with what I get on x86).
Thanks a lot for doing that! Indeed that's the best we can do since we
don't have tests for it.
I'll go ahead and push this patch soon.
// Martin
More information about the ffmpeg-devel
mailing list