[FFmpeg-devel] [ffmpeg-devel][PATCH] OpenHEVC new MC with ASM

Ronald S. Bultje rsbultje at gmail.com
Tue May 6 21:53:48 CEST 2014


Hi,

On Tue, May 6, 2014 at 3:43 PM, James Almer <jamrial at gmail.com> wrote:

> On 06/05/14 1:34 PM, Michael Niedermayer wrote:
> > On Tue, May 06, 2014 at 01:04:35PM -0300, James Almer wrote:
> >> On 29/04/14 12:09 PM, Pierre Edouard Lepere wrote:
> >>> Hello,
> >>> here is a patch submission changing the way MC is done : 4 and 8 tap
> >>> filters now do the weighting too.
> >>> x86 ASM is also added in the second file.
> >>>
> >>> Best Regards,
> >>> Pierre-Edouard Lepere
> >>
> >>> +void ff_hevcdsp_init_x86(HEVCDSPContext *c, const int bit_depth)
> >>> +{
> >>> +    int mm_flags = av_get_cpu_flags();
> >>> +
> >>> +    if (bit_depth == 8) {
> >>> +        if (EXTERNAL_MMX(mm_flags)) {
> >>> +
> >>> +            if (EXTERNAL_MMXEXT(mm_flags)) {
> >>> +
> >>> +                if (EXTERNAL_SSSE3(mm_flags) && ARCH_X86_64) {
> >>
> >> The asm functions and the prototypes are all SSE4, yet you're checking
> >> for SSSE3 support at runtime here.
> >> This will crash on CPUs like first gen Core 2, Atom and AMD Bobcat.
> >>
> >> Also, there's no need to check for MMX and MMXEXT in a chain like this.
> >> EXTERNAL_SSSE3() is enough.
> >>
> >>> +
> >>> +                    EPEL_LINKS(c->put_hevc_epel, 0, 0, pel_pixels, 8);
> >>> +                    EPEL_LINKS(c->put_hevc_epel, 0, 1, epel_h,     8);
> >>> +                    EPEL_LINKS(c->put_hevc_epel, 1, 0, epel_v,     8);
> >>> +                    EPEL_LINKS(c->put_hevc_epel, 1, 1, epel_hv,    8);
> >>> +
> >>> +                    QPEL_LINKS(c->put_hevc_qpel, 0, 0, pel_pixels, 8);
> >>> +                    QPEL_LINKS(c->put_hevc_qpel, 0, 1, qpel_h,     8);
> >>> +                    QPEL_LINKS(c->put_hevc_qpel, 1, 0, qpel_v,     8);
> >>> +                    QPEL_LINKS(c->put_hevc_qpel, 1, 1, qpel_hv,    8);
> >>> +
> >>> +                }
> >>> +            }
> >>> +        }
> >>> +    } else if (bit_depth == 10) {
> >>> +        if (EXTERNAL_MMX(mm_flags)) {
> >>> +            if (EXTERNAL_MMXEXT(mm_flags) && ARCH_X86_64) {
> >>> +
> >>> +                if (EXTERNAL_SSSE3(mm_flags)) {
> >>
> >> Same as above.
> >
> > fixed (was easier for me to fix instead of waiting for a new patch
> > as i already had fixed 2 other issues locally ...)
> >
> > thx
> >
>
> Seems to break fate for msvc x64 and mingw-w64 (probably also icl).
>
> http://fate.ffmpeg.org/report.cgi?time=20140506174737&slot=x86_64-msvc12-windows-native
>
> http://fate.ffmpeg.org/report.cgi?time=20140506193115&slot=x86_64-mingw-w64-windows-native
>
> Linux gcc and clang seem unaffected so far, but in a couple hours
> something might show up.


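That's usually a sign that we reported the number of xmm regs used
incorrectly in cglobal, which causes xmm reg clobbers. On Win64, xmm6-xmm15
are callee-saved, so the breakage only shows up on the Windows builds.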
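
To make the failure mode concrete, here is a toy model in C. It is only a
sketch: the helper name win64_xmm_clobbers() is made up for illustration, and
it assumes the prologue preserves exactly the callee-saved registers below the
count declared in cglobal. It is not x86inc's actual code.

```c
#include <assert.h>

/* Win64 ABI: xmm0-xmm5 are volatile, xmm6-xmm15 are callee-saved. */
#define FIRST_CALLEE_SAVED_XMM 6
#define NUM_XMM_REGS           16

/*
 * Hypothetical helper, for illustration only.
 * declared: xmm count given to cglobal (assumed: prologue preserves
 *           xmm6..declared-1 on Win64)
 * used:     xmm count the function body really touches (xmm0..used-1)
 * Returns how many callee-saved xmm regs are overwritten without a save.
 */
static int win64_xmm_clobbers(int declared, int used)
{
    assert(declared >= 0 && declared <= NUM_XMM_REGS);
    assert(used >= 0 && used <= NUM_XMM_REGS);

    /* Index of the first register the prologue does not cover. */
    int first_unsaved = declared > FIRST_CALLEE_SAVED_XMM
                      ? declared : FIRST_CALLEE_SAVED_XMM;
    int clobbered = used - first_unsaved;

    return clobbered > 0 ? clobbered : 0;
}
```

In this model, declaring 6 regs but touching 8 clobbers two callee-saved regs
(xmm6 and xmm7), while declaring 8 and touching 8 clobbers none. On the SysV
ABI used by Linux no xmm reg is callee-saved, so the same mistake goes
unnoticed there.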

Ronald

