[FFmpeg-devel] [PATCH] Make MMX2 put_no_rnd_pixels _x2 and _y2 bitexact

David Conrad lessen42
Fri May 28 23:49:30 CEST 2010


Hi,

The MMX2/3DNow! put_no_rnd functions don't always round correctly: they compensate for the +1 in pavgb by subtracting 1 from one of the inputs, which is not exact for every input pair. This causes our Theora decoder to not be bitexact with libtheora, though I haven't found any real source where the error accumulates enough to be visible.
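To illustrate the inexactness, here is a minimal scalar sketch in C of one byte lane. It assumes the subtraction saturates at zero (as an unsigned saturating subtract would); the helper names pavgb_u8, avg_no_rnd_old and count_old_mismatches are hypothetical and are not taken from the patch or from the existing assembly.

```c
#include <stdint.h>

/* Scalar model of pavgb on one byte lane: (a + b + 1) >> 1, computed
 * without 8-bit overflow. */
static uint8_t pavgb_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Assumed model of the old compensation: subtract 1 from one input,
 * saturating at 0, then take the rounded-up average. */
static uint8_t avg_no_rnd_old(uint8_t a, uint8_t b)
{
    uint8_t a1 = a ? (uint8_t)(a - 1) : 0;
    return pavgb_u8(a1, b);
}

/* Count the (a, b) pairs where the model differs from the exact
 * truncating average (a + b) >> 1. */
static int count_old_mismatches(void)
{
    int n = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (avg_no_rnd_old((uint8_t)a, (uint8_t)b) != ((a + b) >> 1))
                n++;
    return n;
}
```

Under this model the only failing pairs are those where the decremented input is already 0 and the other input is odd: the saturated subtract leaves 0 unchanged, so the +1 in pavgb is not cancelled. For example, avg_no_rnd_old(0, 1) gives 1 where the exact result is 0.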

This fixes it by using the property that (a+b)>>1 is equivalent to ~((~a+~b+1)>>1), i.e. the complement of pavgb applied to the complemented inputs. This makes these functions 5 cycles slower on my Penryn, but on my Atom the additional instructions appear to be free, probably due to load stalls.
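The identity can be checked exhaustively for all byte pairs with a short scalar sketch in C. This models a single byte lane only; the names pavgb_byte, avg_no_rnd_exact and count_exact_mismatches are hypothetical and do not correspond to anything in the attached patch.

```c
#include <stdint.h>

/* Scalar model of pavgb on one byte lane: (a + b + 1) >> 1, computed
 * without 8-bit overflow. */
static uint8_t pavgb_byte(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Truncating average via the identity
 *   (a + b) >> 1 == ~((~a + ~b + 1) >> 1)
 * where ~ is the 8-bit complement (255 - x). */
static uint8_t avg_no_rnd_exact(uint8_t a, uint8_t b)
{
    return (uint8_t)~pavgb_byte((uint8_t)~a, (uint8_t)~b);
}

/* Count the (a, b) pairs where the identity fails; expected to be 0. */
static int count_exact_mismatches(void)
{
    int n = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (avg_no_rnd_exact((uint8_t)a, (uint8_t)b) != ((a + b) >> 1))
                n++;
    return n;
}
```

The reason it works: with s = a + b, the complemented inputs sum to 510 - s, so the rounded-up average is (511 - s) >> 1 = 255 - (s >> 1), and complementing that gives s >> 1.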

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: put_no_rnd-exact.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100528/761067ac/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bench.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100528/761067ac/attachment-0001.txt>


