[FFmpeg-devel] [PATCH]v6 Opus Pyramid Vector Quantization Search in x86 SIMD asm
Henrik Gramner
henrik at gramner.com
Sun Aug 6 14:39:24 EEST 2017
On Sat, Aug 5, 2017 at 12:58 AM, Ivan Kalvachev <ikalvachev at gmail.com> wrote:
> 8 packed, 8 scalar.
>
> Unless I miss something (and as I've said before,
> I'm not confident enough to mess with that code.)
>
> (AVX does extend to 32 variants, but they are not
> SSE compatible, so no need to emulate them.)
Oh, right. I quickly glanced at the docs and saw 32 pseudo-ops for
each instruction, for a total of 128 when adding pd, ps, sd, ss, but
the fact that only the first 8 are relevant here reduces it to 32, which
is a lot more manageable.
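To make the arithmetic concrete, here is a rough sketch in Python (purely
illustrative and not part of the patch; the function name is made up for
this example). It enumerates the 8 SSE compare predicates in imm8 order
across the four suffixes, giving the 32 SSE-compatible pseudo-ops:

```python
# The 8 SSE compare predicates, listed in imm8 order 0..7.
SSE_PREDICATES = ["eq", "lt", "le", "unord", "neq", "nlt", "nle", "ord"]
# Packed/scalar, single/double precision variants.
SUFFIXES = ["ps", "pd", "ss", "sd"]


def sse_cmp_pseudo_ops():
    """Map each pseudo-op name to (base instruction, imm8 predicate).

    Hypothetical helper for illustration only.
    """
    table = {}
    for suffix in SUFFIXES:
        for imm8, pred in enumerate(SSE_PREDICATES):
            table[f"cmp{pred}{suffix}"] = (f"cmp{suffix}", imm8)
    return table


if __name__ == "__main__":
    ops = sse_cmp_pseudo_ops()
    print(len(ops))        # 8 predicates * 4 suffixes
    print(ops["cmpltps"])  # base instruction and imm8 for "less than", packed single
```

With the 32 AVX predicates the same product would be 32 * 4 = 128, which
is the number I first arrived at.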
> movaps m1, [WRT_PIC_BASE + const_2 + r2 ]
>
> Looks better. (Also not tested. Will do, later.)
I intentionally put the WRT define at the end because that's most
similar to the built-in wrt syntax used when accessing symbols through
the PLT or GOT, e.g.
mov eax, [external_symbol wrt ..got]
> Yeh $$ is the start of the current section, and that is going to be
> ".text" not "rodata".
Obviously, yes. You need a reference that results in a compile-time
constant PC-offset (which .rodata isn't) to create PC-relative
relocation records to external symbols.