[FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions
Clément Bœsch
u at pkh.me
Sun Nov 2 23:58:08 CET 2014
On Sun, Nov 02, 2014 at 07:55:35PM -0300, James Almer wrote:
> On 02/11/14 7:43 PM, Clément Bœsch wrote:
> > On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
> >> Two to four times faster depending on instruction set, block size and channel count.
> >>
> >> Signed-off-by: James Almer <jamrial at gmail.com>
> >> ---
> >> TODO: 16 bits indep for 4, 6 and 8 channels. 24/32 bits indep for 8 channels.
> >> AVX2 and maybe MMX versions.
> >> Planar?
> >>
> >> libavcodec/arm/flacdsp_init_arm.c | 2 +-
> >> libavcodec/flacdec.c | 6 +-
> >> libavcodec/flacdsp.c | 6 +-
> >> libavcodec/flacdsp.h | 6 +-
> >> libavcodec/flacenc.c | 2 +-
> >> libavcodec/x86/flacdsp.asm | 206 ++++++++++++++++++++++++++++++++++++++
> >> libavcodec/x86/flacdsp_init.c | 48 ++++++++-
> >> 7 files changed, 264 insertions(+), 12 deletions(-)
> > [...]
> >> + mova m0, [in0q]
> >> + mova m1, [in0q+in1q]
> >> +%if %1 > 2
> >> + mova m2, [in0q+in2q]
> >> + mova m3, [in0q+in3q]
> >> +%if %1 > 4
> >> + mova m4, [in0q+in4q]
> >> + mova m5, [in0q+in5q]
> >> +%endif
> >> +%endif
> >> + pslld m0, m%2
> >> + pslld m1, m%2
> >> +%if %1 > 2
> >> + pslld m2, m%2
> >> + pslld m3, m%2
> >> +%if %1 > 4
> >> + pslld m4, m%2
> >> + pslld m5, m%2
> >> +%endif
> >> +%endif
> >
> > Can't you do something like this? (untested)
> > pslld m0, [in0q], m%2
> > %assign i 0
> > %rep %1
> > pslld m%i, [in0q+in%iq], m%2
> > %assign i i+1
> > %endrep
>
> YASM libavcodec/x86/flacdsp.o
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `m' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `i' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `in' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `iq' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: (Each undefined symbol is reported only once.)
> make: *** [libavcodec/x86/flacdsp.o] Error 1
>
> A %rep like that is only four lines shorter. Do you consider it more readable than the alternative to justify trying
> to get it working?
Totally up to you; it looked easier to maintain and more obvious than
several levels of nested ifdefery.
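
For what it's worth, the errors above suggest the loop counter is not
being expanded inside m%i and in%iq. A rough, untested sketch of how the
loop might be spelled using the preprocessor's %+ token concatenation
instead, assuming it sits inside the same macro (so %1 is the channel
count and m%2 is the register holding the shift) and that in1q..in5q are
named as in the patch:

        mova  m0, [in0q]
        pslld m0, m%2
    %assign i 1
    %rep %1-1
        ; "m %+ i" pastes to m1, m2, ...; "in %+ i %+ q" to in1q, in2q, ...
        mova  m %+ i, [in0q + in %+ i %+ q]
        pslld m %+ i, m%2
    %assign i i+1
    %endrep

The load and the shift stay separate instructions here, and they are
interleaved per register rather than grouped as in the patch, so the
instruction order differs even though the result should be the same.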
--
Clément B.