[FFmpeg-devel] [PATCH] VP8 MMX optimizations (MC and IDCT dc_add)

Jason Garrett-Glaser darkshikari
Wed Jun 23 03:50:16 CEST 2010


On Tue, Jun 22, 2010 at 4:31 PM, Jason Garrett-Glaser
<darkshikari at gmail.com> wrote:
> On Tue, Jun 22, 2010 at 4:05 PM, Jason Garrett-Glaser
> <darkshikari at gmail.com> wrote:
>> On Tue, Jun 22, 2010 at 12:35 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>> Hi,
>>>
>>> as per $subj.
>>>
>>> Speed gain:
>>> - dc_add goes from 1800 to 1350 cycles (where 1150 is overhead,
>>> measured as empty asm func), so about 3-3.5x faster.
>>> - The MC functions are each about 4-5x faster (I only measured the 4x4
>>> ones, the rest I assume are similarly faster but not measured).
>>> - Total time spent on a shell-script that decodes the whole testsuite
>>> (vp8-test-vectors-r1, file 001-017) including shell overhead and
>>> everything goes from 2.3 to 2.1 seconds with these applied.
>>>
>>> Results are bit-identical, and this is my first MMX/etc. ever! Thanks
>>> to Jason for teaching me. ;-).
>>>
>>> Ronald
>>
>> New patch attached.
>>
>> Jason
>>
>
> Now with SSE2 v-filter motion compensation.
>
> Jason
>

Now with full SSE2 MC.  I also went and updated the x264asm headers
(and associated asm) to the latest versions.  This will be split in
the real commit.

Jason
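
[Editor's note] The "3-3.5x" figure quoted for dc_add only follows once the
1150-cycle measurement overhead is subtracted from both timings; the raw
ratio of 1800 to 1350 cycles is much smaller. A minimal sketch of that
arithmetic, using only the cycle counts quoted in the thread (the helper
name net_speedup is hypothetical and not part of the patch):

```python
def net_speedup(before_cycles, after_cycles, overhead_cycles):
    """Speedup after removing a fixed measurement overhead from both timings."""
    net_before = before_cycles - overhead_cycles
    net_after = after_cycles - overhead_cycles
    if net_after <= 0:
        raise ValueError("overhead must be smaller than the optimized timing")
    return net_before / net_after


# Cycle counts as quoted in the thread for dc_add.
raw_ratio = 1800 / 1350                      # ignores the empty-function overhead
corrected = net_speedup(1800, 1350, 1150)    # (1800 - 1150) / (1350 - 1150)

print(f"raw ratio: {raw_ratio:.2f}x, overhead-corrected: {corrected:.2f}x")
```

The corrected value falls inside the quoted 3-3.5x range. The same caveat
applies to the whole-testsuite timing: it includes shell and I/O overhead,
so the drop from 2.3 to 2.1 seconds understates the gain in the optimized
functions themselves.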
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vp8_asm.diff
Type: application/octet-stream
Size: 45171 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100622/f340b8f3/attachment.obj>


