[FFmpeg-devel] [PATCH]Fix h264 decoding for icc 32 bit on sse2 cpu

Carl Eugen Hoyos cehoyos
Sun Dec 28 22:45:30 CET 2008


Hi!

Michael Niedermayer <michaelni <at> gmx.at> writes:

> > Attached patch disables the x264 loop filter sse2 functions when compiling 
> > with 32 bit icc. This should fix the issues with FATE.
> 
> ok

I applied the patch. It fixes the h264 decoding issues with icc on FATE.
I wanted to test the speed impact, but I think my sample was not really suitable
(harry potter trailer 3a). I'm still posting the results because a few things
surprised me: I didn't know 64 bit makes such a difference with gcc.

Carl Eugen

openSUSE 11.0, Intel E8400; mplayer was running during each test
ffmpeg -i harrypotterandthehalfbloodprince-tlr3a_h1080p.mov -f crc -
gcc version 4.3.1 20080507 (prerelease)

icc 10.1 64bit, --cpu=core2
real    0m36.019s
user    0m35.794s
sys     0m0.160s

real    0m35.995s
user    0m35.814s
sys     0m0.184s

real    0m35.874s
user    0m35.730s
sys     0m0.144s

real    0m35.877s
user    0m35.718s
sys     0m0.156s

real    0m36.031s
user    0m35.818s
sys     0m0.212s


icc 11.0 32bit, --cpu=core2
real    0m39.028s
user    0m36.686s
sys     0m2.300s

real    0m39.254s
user    0m36.862s
sys     0m2.376s

real    0m38.926s
user    0m36.466s
sys     0m2.440s

real    0m39.095s
user    0m36.670s
sys     0m2.380s

real    0m39.044s
user    0m36.622s
sys     0m2.388s

icc 10.1 32bit, --cpu=core2
real    0m39.813s
user    0m37.246s
sys     0m2.568s

real    0m42.353s
user    0m39.838s
sys     0m2.488s

real    0m39.834s
user    0m37.206s
sys     0m2.596s

real    0m39.742s
user    0m37.302s
sys     0m2.432s

real    0m40.025s
user    0m37.582s
sys     0m2.436s

gcc unpatched 64 bit, --cpu=core2
real    0m36.113s
user    0m35.134s
sys     0m0.980s

real    0m36.075s
user    0m35.074s
sys     0m0.980s

real    0m36.044s
user    0m35.150s
sys     0m0.876s

real    0m36.170s
user    0m35.258s
sys     0m0.896s

real    0m36.051s
user    0m35.854s
sys     0m0.188s

gcc unpatched 64 bit
real    0m36.757s
user    0m36.602s
sys     0m0.124s

real    0m36.581s
user    0m36.426s
sys     0m0.132s

real    0m36.529s
user    0m36.354s
sys     0m0.156s

real    0m36.604s
user    0m36.434s
sys     0m0.144s

real    0m36.815s
user    0m36.282s
sys     0m0.220s

gcc 64 bit, x264 sse2 functions disabled, --cpu=core2
real    0m36.036s
user    0m35.894s
sys     0m0.140s

real    0m36.041s
user    0m35.786s
sys     0m0.236s

real    0m36.253s
user    0m35.962s
sys     0m0.208s

real    0m36.223s
user    0m35.978s
sys     0m0.212s

real    0m36.183s
user    0m36.038s
sys     0m0.104s

gcc 64 bit, x264 sse2 functions disabled
real    0m36.508s
user    0m36.318s
sys     0m0.164s

real    0m36.658s
user    0m36.490s
sys     0m0.148s

real    0m36.639s
user    0m36.414s
sys     0m0.208s

real    0m36.792s
user    0m36.650s
sys     0m0.124s

real    0m37.556s
user    0m37.086s
sys     0m0.284s

gcc 32bit, x264 sse2 functions disabled, --cpu=core2 --extra-cflags=-m32 --extra-ldflags=-m32
real    0m46.913s
user    0m44.387s
sys     0m2.492s

real    0m46.828s
user    0m44.319s
sys     0m2.508s

real    0m47.033s
user    0m44.067s
sys     0m2.908s

real    0m46.802s
user    0m44.127s
sys     0m2.636s

real    0m46.943s
user    0m43.427s
sys     0m2.856s

gcc unpatched 32bit, --cpu=core2 --extra-cflags=-m32 --extra-ldflags=-m32
real    0m46.884s
user    0m44.419s
sys     0m2.432s

real    0m46.812s
user    0m44.207s
sys     0m2.608s

real    0m47.145s
user    0m44.435s
sys     0m2.488s

real    0m47.208s
user    0m44.463s
sys     0m2.676s

real    0m47.454s
user    0m44.791s
sys     0m2.640s
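
For anyone who wants to compare the configurations without eyeballing the numbers, here is a minimal sketch that parses `time` output in the format shown above and averages the "real" values. It is not part of the patch or of FFmpeg; the names `parse_times` and `mean_real` are made up for illustration, and the two sample blocks are copied from the icc 10.1 64bit and gcc unpatched 32bit runs in this mail.

```python
import re
from statistics import mean

# Matches lines such as "real    0m36.019s" as printed by the shell's `time`.
TIME_LINE = re.compile(r"^(real|user|sys)\s+(\d+)m(\d+(?:\.\d+)?)s$")


def parse_times(text):
    """Return {"real": [...], "user": [...], "sys": [...]} in seconds."""
    result = {"real": [], "user": [], "sys": []}
    for line in text.splitlines():
        match = TIME_LINE.match(line.strip())
        if match:
            kind, minutes, seconds = match.groups()
            result[kind].append(int(minutes) * 60 + float(seconds))
    return result


def mean_real(text):
    """Average wall-clock ("real") time over all runs found in text."""
    return mean(parse_times(text)["real"])


# Sample data copied from the results above (hypothetical variable names).
ICC_64 = """
real    0m36.019s
user    0m35.794s
sys     0m0.160s
real    0m35.995s
user    0m35.814s
sys     0m0.184s
real    0m35.874s
user    0m35.730s
sys     0m0.144s
real    0m35.877s
user    0m35.718s
sys     0m0.156s
real    0m36.031s
user    0m35.818s
sys     0m0.212s
"""

GCC_32 = """
real    0m46.884s
user    0m44.419s
sys     0m2.432s
real    0m46.812s
user    0m44.207s
sys     0m2.608s
real    0m47.145s
user    0m44.435s
sys     0m2.488s
real    0m47.208s
user    0m44.463s
sys     0m2.676s
real    0m47.454s
user    0m44.791s
sys     0m2.640s
"""

if __name__ == "__main__":
    fast = mean_real(ICC_64)
    slow = mean_real(GCC_32)
    print(f"icc 10.1 64bit mean real: {fast:.3f}s")
    print(f"gcc 32bit mean real:      {slow:.3f}s")
    print(f"ratio: {slow / fast:.2f}")
```

With only five runs per configuration and mplayer running in the background, the averages are a rough guide only; the outlier in the icc 10.1 32bit block (42.353s) shows how much a single run can move the mean.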




